Production-grade reasoning that doesn't stall — --fallback-model
Run Opus at high effort, but fail over to Sonnet automatically when Opus is overloaded. The flag that keeps cron jobs alive.
Setup
- Run `claude /login`, or `export ANTHROPIC_API_KEY=sk-…`
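Cron jobs don't inherit your interactive login environment, so a preflight check can save a silent failure; a minimal sketch (the warning text is my own, not from the CLI):

```shell
# Preflight for cron: warn when no API key is exported, since an
# interactive `claude /login` session may still cover auth.
if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
  echo "note: ANTHROPIC_API_KEY unset; relying on claude /login session" >&2
fi
```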
Cost per run
$0.01-0.10
The one-liner
$ curl -s "https://hacker-news.firebaseio.com/v0/topstories.json" \
| jq -r '.[0:30][]' \
| xargs -P 10 -I {} curl -s "https://hacker-news.firebaseio.com/v0/item/{}.json" \
| jq -s -r 'sort_by(-.score) | .[0:10] | .[] | "- [\(.score)] \(.title) — \(.url // "discussion")"' \
| claude -p \
--model opus \
--effort high \
--fallback-model sonnet \
"Today's top HN stories. Identify the one quietly important but under-discussed. 5 sentences for an outsider."

What each stage does
- [01] `curl … topstories.json …`: the same HN front-page fetch as recipe #1.
- [02] `claude -p`: headless print mode. Required for `--max-budget-usd`, `--fallback-model`, and `--no-session-persistence`.
- [03] `--model opus`: pick the strongest model, by alias (`opus`, `sonnet`, `haiku`) or by full ID like `claude-sonnet-4-6`.
- [04] `--effort high`: reasoning effort, one of low / medium / high / xhigh / max. High buys real extended thinking; latency rises with effort.
- [05] `--fallback-model sonnet`: when opus returns 529 (overloaded), claude automatically retries with sonnet. Production-grade: the pipe never stalls. Only works with `-p`.
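The jq sort-and-format stage is easy to verify offline before wiring up the full pipe. A minimal sketch with two hand-written items standing in for the HN API responses (scores and titles are made up):

```shell
# Feed two fake HN items through the exact jq stage from the one-liner.
printf '%s\n' \
  '{"score": 42, "title": "A Rust vector DB", "url": "https://example.com/db"}' \
  '{"score": 388, "title": "Ask HN: batching costs?"}' \
| jq -s -r 'sort_by(-.score) | .[0:10] | .[] | "- [\(.score)] \(.title) — \(.url // "discussion")"'
# → - [388] Ask HN: batching costs? — discussion
# → - [42] A Rust vector DB — https://example.com/db
```

Items without a `url` (Ask HN threads, which link to their own comments) fall through jq's `//` alternative operator to the literal `discussion`.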
Expected output (sample)
The under-discussed story is "The hidden cost of LLM batching" (score 388). While the M5 Pro benchmarks and the Rust vector DB get the upvotes, this one answers a question every team running production LLMs hits within their first month: how aggressive batching changes the latency-vs-cost tradeoff in a way that breaks SLA monitoring...
Caveats & tips
- `--fallback-model` only fires on overload errors (HTTP 529), not on other failures such as rate limits (429).
- To sanity-check the flag, run with `--model haiku --fallback-model nonsense-model`: you should still get a haiku response, because haiku is rarely overloaded, so the (invalid) fallback never fires.
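The fallback covers overload, but in cron the whole pipe can still die (network down, jq error). A minimal wrapper sketch, assuming nothing beyond POSIX sh; the `run_with_log` name and log path are placeholders of my own:

```shell
#!/bin/sh
# Run a command string, append its output to a log, and exit nonzero on
# failure so cron's mail (or your monitoring) picks it up.
run_with_log() {
  log=${2:-/tmp/hn-digest.log}   # log path is just an example
  if ! sh -c "$1" >>"$log" 2>&1; then
    echo "hn-digest failed at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
    return 1
  fi
}

# In crontab you would wrap the full one-liner, e.g.:
# run_with_log 'curl -s … | claude -p --model opus --fallback-model sonnet "…"'
```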