Lambda: tail recent errors → claude clusters them
One hour of Lambda errors, clustered by root cause via claude. Faster than reading CloudWatch by hand.
Setup
- brew install awscli
- aws configure
- claude /login OR export ANTHROPIC_API_KEY=sk-…
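Before the first run, it can help to confirm that both CLIs are actually on the PATH. A minimal sketch: `require_tools` is a hypothetical helper name used only for illustration, not part of either CLI.

```shell
# Hypothetical helper: report each missing command on stderr and
# return non-zero if any of them is absent from PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: require_tools aws claude && echo "ready"
```

This only checks that the binaries exist; it does not verify AWS credentials or the claude login.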
Cost per run
<$0.01
The one-liner
$ aws logs tail /aws/lambda/my-function \
--since 1h --filter-pattern '?ERROR ?Exception ?Traceback' \
--format short \
| claude -p \
--append-system-prompt "You are an SRE. Be specific. Prefer fix-it-this-week over architectural rewrites." \
"Cluster these Lambda errors by root cause. For each cluster: count, one-line cause, one-line fix. Markdown table."
What each stage does
- [01] aws
  aws logs tail /aws/lambda/my-function --since 1h
  Prints the last hour of CloudWatch logs from the function's log group, then exits. --since accepts a relative value such as 30m, 1h, or 1d, or an ISO 8601 timestamp.
- [02] aws
  --filter-pattern '?ERROR ?Exception ?Traceback'
  CloudWatch filter syntax: a leading `?` marks an optional term, so a space-separated list of `?word` terms matches events containing any one of them (OR). Matching is case-sensitive. The filter runs server-side, not in your terminal, which is much cheaper than grep on a firehose.
- [03] aws
  --format short
  Drops the log stream name and shortens the timestamp, leaving mostly the message text. Well suited to piping to an LLM that doesn't need the metadata.
- [04] claude
  claude -p --append-system-prompt "You are an SRE. …"
  Persona via system prompt keeps the user prompt focused on the data. The 'fix-it-this-week' framing steers claude away from suggesting a microservices rewrite.
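The stages above can be wrapped in a small shell function so the function name and time window become arguments. This is a sketch under stated assumptions: `cluster_lambda_errors` and its argument order are hypothetical, and the flags are exactly those used in the one-liner.

```shell
# Hypothetical wrapper around the one-liner above.
# $1 = Lambda function name (required), $2 = --since window (default 1h).
cluster_lambda_errors() {
  fn="${1:-}"
  since="${2:-1h}"
  if [ -z "$fn" ]; then
    echo "usage: cluster_lambda_errors <function-name> [since]" >&2
    return 2
  fi
  # Stages 01-03: fetch and filter server-side; stage 04: cluster with claude.
  aws logs tail "/aws/lambda/$fn" \
    --since "$since" \
    --filter-pattern '?ERROR ?Exception ?Traceback' \
    --format short \
  | claude -p \
      --append-system-prompt "You are an SRE. Be specific. Prefer fix-it-this-week over architectural rewrites." \
      "Cluster these Lambda errors by root cause. For each cluster: count, one-line cause, one-line fix. Markdown table."
}

# Usage: cluster_lambda_errors my-function 30m
```

Keeping the prompt text inside the function means every run asks the same question, which makes results easier to compare between runs.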
Expected output (sample)
| Count | Root cause | Fix |
|-------|------------|-----|
| 142 | DynamoDB ProvisionedThroughputExceededException | Switch to on-demand or raise WCU |
| 38 | JSON.parse on truncated SQS body | Check SNS→SQS subscription's RawMessageDelivery setting |
| 7 | Lambda timeout at 30s on cold start | Raise timeout to 60s or move to provisioned concurrency |
Caveats & tips
- If the log volume is huge, narrow the window first (for example `--since 15m`), because claude has a context limit.
- Swap `claude` for `gemini -m gemini-3.1-pro-preview -p "…"` if you have free-tier Gemini credits and prefer to spend those.
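For the log-volume caveat, another option is to cap the number of lines before they reach claude. A minimal sketch: `cap_lines` is a hypothetical helper, the 2000-line default is an arbitrary illustration rather than a documented claude limit, and it assumes the log output arrives oldest-first so that `tail` keeps the newest lines.

```shell
# Hypothetical helper: keep only the last N lines of stdin (default 2000).
# The default is an arbitrary illustration, not a documented claude limit.
cap_lines() {
  tail -n "${1:-2000}"
}

# Usage (insert the cap just before the claude stage):
#   aws logs tail /aws/lambda/my-function --since 1h --format short \
#     | cap_lines 500 \
#     | claude -p "Cluster these Lambda errors by root cause."
```

Capping by line count changes the per-cluster counts, so treat the numbers in the resulting table as a sample rather than a total.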