# Groq speed lane via `llm -m groq-llama-3.3-70b-versatile`

Groq serves Llama 3.3 70B at ~300 tokens/sec — fast enough to finish a whole batch before gemini-pro finishes its first item. Simon Willison's `llm` CLI plus the `llm-groq` plugin gives you the same pipe-then-prompt shape as gemini/claude `-p`.
## Setup

- `brew install llm`
- `llm install llm-groq`
- `llm keys set groq`  # paste your GROQ_API_KEY when prompted
## Cost per run

<$0.01
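A back-of-envelope check of that sub-cent claim. The per-token prices below are assumptions for illustration (verify current Groq rates before relying on them); the token counts are rough guesses for a 20-bullet input and a three-bullet reply.

```python
# Assumed Groq list prices for Llama 3.3 70B -- check console.groq.com,
# these are illustrative, not authoritative.
INPUT_PER_M = 0.59    # $ per 1M input tokens (assumed)
OUTPUT_PER_M = 0.79   # $ per 1M output tokens (assumed)

input_tokens = 1_200  # ~20 bullet lines plus the prompt (rough guess)
output_tokens = 150   # three short bullets back (rough guess)

cost = input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M
print(f"${cost:.5f}")  # comfortably under a cent
```

Even with generous token estimates, a single run stays an order of magnitude under $0.01.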
## The one-liner
$ curl -s "https://hn.algolia.com/api/v1/search?query=duckdb&hitsPerPage=20" \
| jq -r '.hits[] | "- [\(.points // 0)] \(.title)"' \
| llm -m groq-llama-3.3-70b-versatile \
"Three bullets: dominant claim about DuckDB on HN, one underrated post (cite title), overall sentiment."What each stage does
- [01] **curl** — `curl -s "https://hn.algolia.com/api/v1/search?query=duckdb&hitsPerPage=20"` fetches the 20 highest-relevance HN stories mentioning DuckDB.
- [02] **jq** — `jq -r '.hits[] | "- [\(.points // 0)] \(.title)"'` emits a tight bullet list — points + title per line; `// 0` defaults missing points to zero.
- [03] **llm** — `llm -m groq-llama-3.3-70b-versatile` runs Simon Willison's `llm` CLI with the `llm-groq` plugin. Same `stdin → prompt` shape as gemini/claude `-p`. Models: `groq-llama-3.3-70b-versatile`, `groq-llama-3.1-8b-instant` (fastest), `groq-mixtral-8x7b-32768`.
- [04] **prompt** — `"Three bullets: …"` is the last positional argument; piped stdin is combined with it automatically.
## Expected output (sample)

- **Dominant claim**: DuckDB has displaced Pandas for laptop-scale analytics — multiple posts cite 10x speedups on identical queries.
- **Underrated post**: "Querying NYC taxi data from a $5 VPS" (47 points) — quietly shows DuckDB's HTTP-streaming Parquet trick at scale.
- **Sentiment**: pragmatically bullish — builders, not hype.
## Caveats & tips
- Groq is the speed king for first-pass classification and summarization, not deep reasoning. Use it for the fan-out, then hand the consolidated result to claude or gemini-pro for synthesis.
- Free tier: ~30 req/min, ~14k tokens/min — fine for personal use.
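The fan-out pattern can be sketched as a small driver that paces calls under the ~30 req/min cap. This is a hypothetical sketch, not a vetted tool: the `pace_delay`, `summarize`, and `fan_out` names and the query list are invented here, and `summarize` shells out to the same curl/jq/llm pipeline shown above, so it needs those binaries and a Groq key to actually run.

```python
import subprocess
import time

def pace_delay(reqs_per_min: int = 30) -> float:
    """Seconds to sleep between calls to stay under the free-tier request cap."""
    return 60 / reqs_per_min

def summarize(query: str) -> str:
    """One fast Groq pass per topic, reusing the pipeline from above."""
    pipeline = (
        f'curl -s "https://hn.algolia.com/api/v1/search?query={query}&hitsPerPage=20" '
        "| jq -r '.hits[] | \"- [\\(.points // 0)] \\(.title)\"' "
        "| llm -m groq-llama-3.3-70b-versatile 'One-line sentiment summary.'"
    )
    return subprocess.run(pipeline, shell=True, capture_output=True, text=True).stdout

def fan_out(queries: list[str]) -> str:
    """Collect one summary per topic, paced under the rate limit."""
    chunks = []
    for q in queries:
        chunks.append(f"## {q}\n{summarize(q)}")
        time.sleep(pace_delay())  # simple pacing; a token bucket would be tighter
    return "\n".join(chunks)

# Usage (hits the network): print(fan_out(["duckdb", "sqlite", "clickhouse"]))
# ...then pipe the consolidated text into claude or gemini-pro for synthesis.
```

A fixed sleep is the bluntest way to respect the request cap; it ignores the separate tokens/min limit, which matters once inputs get large.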