← All one-liners·#027·scraping·llm cli·expert

Capped fan-out across N inputs with --max-budget-usd

GNU parallel + claude --max-budget-usd = predictable cost ceiling per slot. The bill is bounded before you start.

Setup
  • → brew install parallel
  • → claude /login OR export ANTHROPIC_API_KEY=sk-…
Cost per run
$0.01-0.10
The one-liner
$ SUBS=(MachineLearning LocalLLaMA singularity ClaudeAI OpenAI)
mkdir -p /tmp/sub

# Stage 1: scrape to disk (avoids claude's 3s stdin race)
printf '%s\n' "${SUBS[@]}" | parallel -j 4 '
  curl -s -A "oneliner101/1.0" "https://www.reddit.com/r/{}/top.json?limit=15&t=day" \
    | jq -r ".data.children[].data | \"- [\(.score)] \(.title)\"" > /tmp/sub/{}.txt
'

# Stage 2: summarize, capped per slot
printf '%s\n' "${SUBS[@]}" | parallel -j 4 --tag '
  claude -p \
    --max-budget-usd 0.05 \
    --no-session-persistence \
    "Three bullets on r/{}: theme, signal, noise." \
    < /tmp/sub/{}.txt
'
What each stage does
  1. [01] bashSUBS=(MachineLearning LocalLLaMA singularity ClaudeAI OpenAI)
    Array of subreddits. Change this one line to redirect the whole pipeline.
  2. [02] parallelparallel -j 4 '... > /tmp/sub/{}.txt'
    Stage 1 — scrape all 5 subreddits to disk in parallel (4 slots). Slow upstream isolated from claude's 3s stdin grace period.
  3. [03] parallelparallel -j 4 --tag '... < /tmp/sub/{}.txt'
    Stage 2 — 4 claude calls running concurrently, each reading from its own buffered file. --tag prefixes output with the subreddit name.
  4. [04] claude--max-budget-usd 0.05
    Hard $ cap per claude invocation. Process exits cleanly if it would exceed. Total spend ceiling = 0.05 × 5 slots = $0.25 max, no matter what the prompt does.
  5. [05] claude--no-session-persistence
    Don't save sessions to disk — 4 parallel slots otherwise create 4 session files.
Expected output (sample)
MachineLearning	- Theme: agentic RAG with hybrid retrieval ...
MachineLearning	- Signal: a working paper shows ...
MachineLearning	- Noise: yet another 'vibes-based' benchmark ...
LocalLLaMA	- Theme: Llama 4 70B fine-tuning ...
...
Caveats & tips
  • Two stages prevent claude's 3s stdin timeout vs slow upstream — without it, some slots silently get empty stdin and answer from general knowledge.
  • Bill ceiling = --max-budget-usd × concurrency, not × N inputs. Plan accordingly.