S3: top 20 biggest objects in a bucket
S3 bill surprises almost always come down to a few giant objects. This one-liner finds them.
Setup
- brew install awscli
- aws configure
Cost per run
Effectively free — LIST requests cost about $0.005 per 1,000, and 5,000 keys is only 5 requests.
The one-liner
$ aws s3api list-objects-v2 --bucket my-bucket --max-items 5000 \
| jq -r '.Contents | sort_by(-.Size) | .[0:20] | .[] |
[.Size, .LastModified[0:10], .Key] | @tsv' \
| awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
What each stage does
- [01] aws
aws s3api list-objects-v2 --bucket my-bucket --max-items 5000
list-objects-v2 paginates at 1,000 keys per page; the CLI follows the pagination tokens for you, and --max-items caps the total. For huge buckets, switch to S3 Inventory (a daily manifest delivered to another bucket).
- [02] jq
jq -r '.Contents | sort_by(-.Size) | .[0:20]'
Sort descending by Size, then take the top 20. Negating the sort key is jq's idiom for a descending sort.
- [03] awk
awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
Convert bytes to MB with two-decimal precision, right-aligned to ten characters. The one awk trick worth knowing for size formatting.
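The awk stage can be sanity-checked in isolation by feeding it one hand-written TSV row (the 1 MiB size, date, and key below are made-up test values, not real bucket contents):

```shell
# Fake TSV row: 1048576 bytes (exactly 1 MiB), a date, a key.
printf '1048576\t2026-01-01\tbackup.bin\n' \
  | awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
# prints "      1.00 MB 2026-01-01 backup.bin" — number right-aligned to 10 chars
```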
Expected output (sample)
   4218.42 MB 2026-04-23 exports/full-backup-2026-04-23.tar.gz
   2841.07 MB 2026-05-09 measurements/raw/run-2026-05-09T14.parquet
   1936.55 MB 2026-05-08 measurements/raw/run-2026-05-08T22.parquet
    902.31 MB 2026-05-01 exports/monthly-2026-04.tar.gz
Caveats & tips
- For buckets with more than ~5,000 objects, use S3 Inventory — list-objects-v2 issues one request per 1,000 keys, which gets slow on huge buckets.
- Pipe the Key column through `claude -p "Suggest a lifecycle policy that would have prevented these from accumulating"` for a quick cleanup plan.
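You can dry-run the jq and awk stages without touching AWS by substituting a hand-written payload for the aws call (the two keys and sizes below are invented; any list-objects-v2-shaped JSON works):

```shell
# Fake list-objects-v2 response with two objects; big.tar is larger, so the
# sort_by(-.Size) stage should emit it first.
json='{"Contents":[
  {"Key":"small.txt","Size":1048576,"LastModified":"2026-01-01T00:00:00.000Z"},
  {"Key":"big.tar","Size":3145728,"LastModified":"2026-02-02T00:00:00.000Z"}]}'
echo "$json" \
  | jq -r '.Contents | sort_by(-.Size) | .[0:20] | .[] |
           [.Size, .LastModified[0:10], .Key] | @tsv' \
  | awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
# big.tar (3.00 MB) prints before small.txt (1.00 MB)
```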