# Groq speed lane via `llm -m groq-llama-3.3-70b-versatile`

Groq serves Llama 3.3 70B at ~300 tokens/sec — fast enough to finish a whole batch before gemini-pro finishes its first item. Simon Willison's `llm` CLI plus the `llm-groq` plugin gives you the same pipe-then-prompt shape as gemini/claude `-p`.
## Setup

- `brew install llm`
- `llm install llm-groq`
- `llm keys set groq`  # paste your GROQ_API_KEY when prompted
## Cost per run

<$0.01
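A back-of-envelope check of that sub-cent claim. The per-token prices below are assumptions for illustration (verify current Groq rates before relying on them); the token counts are rough guesses for a 20-bullet input and a three-bullet reply.

```python
# Assumed Groq list prices for Llama 3.3 70B -- check console.groq.com,
# these are illustrative, not authoritative.
INPUT_PER_M = 0.59    # $ per 1M input tokens (assumed)
OUTPUT_PER_M = 0.79   # $ per 1M output tokens (assumed)

input_tokens = 1_200  # ~20 bullet lines plus the prompt (rough guess)
output_tokens = 150   # three short bullets back (rough guess)

cost = input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M
print(f"${cost:.5f}")  # comfortably under a cent
```

Even with generous token estimates, a single run stays an order of magnitude under $0.01.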
## The one-liner
$ curl -s "https://hn.algolia.com/api/v1/search?query=duckdb&hitsPerPage=20" \
| jq -r '.hits[] | "- [\(.points // 0)] \(.title)"' \
| llm -m groq-llama-3.3-70b-versatile \
"Three bullets: dominant claim about DuckDB on HN, one underrated post (cite title), overall sentiment."What each stage does
- [01] **curl** — `curl -s "https://hn.algolia.com/api/v1/search?query=duckdb&hitsPerPage=20"` fetches the 20 highest-relevance HN stories mentioning DuckDB.
- [02] **jq** — `jq -r '.hits[] | "- [\(.points // 0)] \(.title)"'` emits a tight bullet list — points + title per line; `// 0` defaults missing points to zero.
- [03] **llm** — `llm -m groq-llama-3.3-70b-versatile` runs Simon Willison's `llm` CLI with the `llm-groq` plugin. Same `stdin → prompt` shape as gemini/claude `-p`. Models: `groq-llama-3.3-70b-versatile`, `groq-llama-3.1-8b-instant` (fastest), `groq-mixtral-8x7b-32768`.
- [04] **prompt** — `"Three bullets: …"` is the last positional argument; piped stdin is combined with it automatically.
## Expected output (sample)

- **Dominant claim**: DuckDB has displaced Pandas for laptop-scale analytics — multiple posts cite 10x speedups on identical queries.
- **Underrated post**: "Querying NYC taxi data from a $5 VPS" (47 points) — quietly shows DuckDB's HTTP-streaming Parquet trick at scale.
- **Sentiment**: pragmatically bullish — builders, not hype.
## Caveats & tips
- Groq is the speed king for first-pass classification and summarization, not deep reasoning. Use it for the fan-out, then hand the consolidated result to claude or gemini-pro for synthesis.
- Free tier: ~30 req/min, ~14k tokens/min — fine for personal use.
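The fan-out pattern can be sketched as a small driver that paces calls under the ~30 req/min cap. This is a hypothetical sketch, not a vetted tool: the `pace_delay`, `summarize`, and `fan_out` names and the query list are invented here, and `summarize` shells out to the same curl/jq/llm pipeline shown above, so it needs those binaries and a Groq key to actually run.

```python
import subprocess
import time

def pace_delay(reqs_per_min: int = 30) -> float:
    """Seconds to sleep between calls to stay under the free-tier request cap."""
    return 60 / reqs_per_min

def summarize(query: str) -> str:
    """One fast Groq pass per topic, reusing the pipeline from above."""
    pipeline = (
        f'curl -s "https://hn.algolia.com/api/v1/search?query={query}&hitsPerPage=20" '
        "| jq -r '.hits[] | \"- [\\(.points // 0)] \\(.title)\"' "
        "| llm -m groq-llama-3.3-70b-versatile 'One-line sentiment summary.'"
    )
    return subprocess.run(pipeline, shell=True, capture_output=True, text=True).stdout

def fan_out(queries: list[str]) -> str:
    """Collect one summary per topic, paced under the rate limit."""
    chunks = []
    for q in queries:
        chunks.append(f"## {q}\n{summarize(q)}")
        time.sleep(pace_delay())  # simple pacing; a token bucket would be tighter
    return "\n".join(chunks)

# Usage (hits the network): print(fan_out(["duckdb", "sqlite", "clickhouse"]))
# ...then pipe the consolidated text into claude or gemini-pro for synthesis.
```

A fixed sleep is the bluntest way to respect the request cap; it ignores the separate tokens/min limit, which matters once inputs get large.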