RDS audit (CPU + connections, 7d) → claude downsize plan
Pulls every RDS instance with its class and storage, batches 7-day average CPU and DatabaseConnections via a single `cloudwatch get-metric-data` call, and asks claude to flag idle/oversized DBs with a concrete next-size-down recommendation.
Setup
- → brew install awscli jq
- → aws configure
- → IAM: rds:DescribeDBInstances, cloudwatch:GetMetricData
- → claude /login OR export ANTHROPIC_API_KEY=sk-…
Cost per run
<$0.01
The one-liner
$ REGION=${AWS_REGION:-eu-central-1}; END=$(date -u +%Y-%m-%dT%H:%M:%SZ); START=$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -d '-7 days' +%Y-%m-%dT%H:%M:%SZ); INSTANCES=$(aws rds describe-db-instances --region $REGION --query 'DBInstances[?DBInstanceStatus==`available`].{id:DBInstanceIdentifier,class:DBInstanceClass,engine:Engine,storage:AllocatedStorage,multiaz:MultiAZ}' --output json); QUERIES=$(echo "$INSTANCES" | jq -c '[ .[] | .id as $i | ({Id: ("cpu_" + ($i|gsub("-";"_"))), MetricStat: {Metric: {Namespace: "AWS/RDS", MetricName: "CPUUtilization", Dimensions: [{Name:"DBInstanceIdentifier",Value:$i}]}, Period: 3600, Stat: "Average"}}, {Id: ("con_" + ($i|gsub("-";"_"))), MetricStat: {Metric: {Namespace: "AWS/RDS", MetricName: "DatabaseConnections", Dimensions: [{Name:"DBInstanceIdentifier",Value:$i}]}, Period: 3600, Stat: "Average"}}) ]'); METRICS=$(aws cloudwatch get-metric-data --region $REGION --start-time $START --end-time $END --metric-data-queries "$QUERIES" --output json); echo "$INSTANCES" | jq --argjson m "$METRICS" -r '. as $inst | $m.MetricDataResults | (map({(.Id): (.Values | if length == 0 then 0 else add/length end)}) | add) as $avg | $inst[] | [.id, .class, .engine, .storage, (($avg["cpu_" + (.id|gsub("-";"_"))] // 0) * 100 | floor / 100), (($avg["con_" + (.id|gsub("-";"_"))] // 0) | floor), .multiaz] | @tsv' | sort -k5 -n | claude -p "You are a FinOps DBA. Columns: db_id, class, engine, storage_gib, cpu_avg_pct_7d, connections_avg_7d, multi_az. Flag any instance with cpu<20% AND connections<5 as IDLE; cpu<40% as OVERSIZED. For each flagged row recommend the next-smaller class in the same family (e.g. db.r6g.2xlarge → db.r6g.xlarge halves cost). End with a ranked table of monthly savings. Be conservative on Multi-AZ prod DBs."
What each stage does
- [01] aws
  `INSTANCES=$(aws rds describe-db-instances --query 'DBInstances[?DBInstanceStatus…`
  The JMESPath filter keeps only `available` instances, dropping anything mid-failover or backing up so the CloudWatch batch doesn't waste slots on transients.
- [02] jq
  `QUERIES=$(echo "$INSTANCES" | jq -c '[ .[] | .id as $i | ({Id: ..., MetricStat: …`
  Builds the `--metric-data-queries` array entirely in jq: two metrics per instance, with hyphens replaced by underscores because GetMetricData Ids must match `^[a-z][a-zA-Z0-9_]*$`.
- [03] aws
  `aws cloudwatch get-metric-data --start-time $START --end-time $END --metric-data…`
  ONE API call returns 7d of CPU + connections for up to 250 RDS instances (500-metric limit ÷ 2 metrics each). Cheaper and faster than per-instance get-metric-statistics loops.
- [04] jq
  `jq --argjson m "$METRICS" -r '... $m.MetricDataResults | (map({(.Id): (.Values|a…`
  Folds the metric results into an `{id: avg}` lookup, joins it back onto each instance, and flattens to TSV with both metrics on one row — the layout claude needs for its IDLE/OVERSIZED rules.
- [05] claude
  `claude -p "You are a FinOps DBA. ... cpu<20% AND connections<5 as IDLE; cpu<40%…`
  Explicit numeric thresholds prevent vague answers; the Multi-AZ caveat stops claude from confidently recommending you halve a prod replica.
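Stages [02] and [04] are the easiest to break and the easiest to sanity-check offline. A minimal sketch with canned JSON — the instance names, metric Ids, and values below are made up for illustration:

```shell
# Stage [02] in miniature: hyphens -> underscores so each Id
# matches the required ^[a-z][a-zA-Z0-9_]*$ pattern.
INSTANCES='[{"id":"db-a"},{"id":"db-b"}]'
echo "$INSTANCES" | jq -c '[ .[] | {Id: ("cpu_" + (.id|gsub("-";"_")))} ]'
# [{"Id":"cpu_db_a"},{"Id":"cpu_db_b"}]

# Stage [04] in miniature: fold MetricDataResults into an {Id: avg}
# lookup, guarding against an empty Values array (no datapoints yet).
METRICS='{"MetricDataResults":[{"Id":"cpu_db_a","Values":[10,20,30]},{"Id":"cpu_db_b","Values":[]}]}'
echo "$METRICS" | jq -c '.MetricDataResults
  | map({(.Id): (.Values | if length == 0 then 0 else add/length end)}) | add'
# {"cpu_db_a":20,"cpu_db_b":0}
```

Feeding fake JSON through the jq stages like this lets you verify the TSV layout before spending any CloudWatch API calls.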
Expected output (sample)
analytics-replica-2  db.r6g.2xlarge  postgres  500   3.1   2    false
reporting-pg         db.m6i.xlarge   postgres  200   8.4   4    false
prod-orders          db.m6i.2xlarge  mysql     1000  34.2  180  true
# claude memo:
# IDLE      analytics-replica-2 → db.r6g.large (4x downsize, ~$420/mo saved)
# IDLE      reporting-pg        → db.m6i.large (2x downsize, ~$130/mo saved)
# OVERSIZED prod-orders         → db.m6i.xlarge (halve, ~$280/mo) — Multi-AZ, do in maintenance window
# Total: ~$830/mo, ~$9,960/yr
Caveats & tips
- `date -u -v-7d` is BSD/macOS; `date -u -d '-7 days'` is GNU/Linux — the `2>/dev/null || …` fallback covers both.
- CloudWatch `get-metric-data` allows 500 metrics and 100,800 data points per request — this recipe sends 2 metrics × 168 hourly points per instance, so the practical cap is ~250 RDS instances before you must page.
- Dimension is `DBInstanceIdentifier` (not `DBInstanceId`) — using the wrong one returns empty results without erroring.
- Aurora cluster CPU lives under `DBClusterIdentifier`; this recipe targets standard RDS instances. Add a separate query block for clusters.
- Connection counts can be artificially low if your app uses a connection pool (pgbouncer/RDS Proxy) — review pool stats before downsizing.
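The date-portability fallback from the first caveat can be tested on its own — whichever form the local `date(1)` understands wins, and the other fails silently:

```shell
# GNU date rejects -v-7d (stderr suppressed), so the second form runs on Linux;
# BSD/macOS date accepts the first form and the fallback never fires.
START=$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -d '-7 days' +%Y-%m-%dT%H:%M:%SZ)
echo "$START"   # ISO-8601 UTC timestamp, 7 days ago
```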
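Past the ~250-instance cap you have to page. One way — a sketch, not part of the recipe above — is to slice the `QUERIES` array into chunks and issue one `get-metric-data` call per chunk (chunk size 2 here stands in for the real limit of 500 metrics):

```shell
# Stand-in QUERIES array; the real one comes from stage [02].
QUERIES='[{"Id":"m1"},{"Id":"m2"},{"Id":"m3"}]'
# Emit one compact JSON array per chunk, then call CloudWatch per line.
echo "$QUERIES" | jq -c 'range(0; length; 2) as $i | .[$i:$i+2]' |
while read -r CHUNK; do
  echo "chunk: $CHUNK"
  # aws cloudwatch get-metric-data ... --metric-data-queries "$CHUNK"
done
```

You would then merge each call's `MetricDataResults` before the stage [04] fold.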
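For the Aurora caveat, the separate query block would use the `DBClusterIdentifier` dimension instead. A minimal sketch — the cluster names are made up, and in practice the list would come from `aws rds describe-db-clusters`:

```shell
# Hypothetical cluster list (real source: aws rds describe-db-clusters).
CLUSTERS='["prod-aurora","analytics-aurora"]'
# Same Id-sanitization trick, but the dimension is DBClusterIdentifier.
CQUERIES=$(echo "$CLUSTERS" | jq -c '[ .[] |
  {Id: ("ccpu_" + gsub("-";"_")),
   MetricStat: {Metric: {Namespace: "AWS/RDS", MetricName: "CPUUtilization",
     Dimensions: [{Name: "DBClusterIdentifier", Value: .}]},
   Period: 3600, Stat: "Average"}} ]')
echo "$CQUERIES" | jq -r '.[0].MetricStat.Metric.Dimensions[0].Name'
# DBClusterIdentifier
```

The resulting array can be appended to the instance-level `QUERIES` (it counts against the same 500-metric limit) or sent in its own `get-metric-data` call.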