S3: top 20 biggest objects in a bucket
S3 bill surprises almost always come down to a few giant objects. This one-liner finds them.
Setup
- brew install awscli
- aws configure
Cost per run
Effectively free — LIST requests cost about $0.005 per 1,000, and 5,000 keys is only 5 requests.
The one-liner
$ aws s3api list-objects-v2 --bucket my-bucket --max-items 5000 \
| jq -r '.Contents | sort_by(-.Size) | .[0:20] | .[] |
[.Size, .LastModified[0:10], .Key] | @tsv' \
| awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
What each stage does
- [01] aws
aws s3api list-objects-v2 --bucket my-bucket --max-items 5000
list-objects-v2 paginates at 1,000 keys per page; the CLI follows the pagination tokens for you, and --max-items caps the total. For huge buckets, switch to S3 Inventory (a daily manifest delivered to another bucket).
- [02] jq
jq -r '.Contents | sort_by(-.Size) | .[0:20]'
Sort descending by Size, then take the top 20. Negating the sort key is jq's idiom for a descending sort.
- [03] awk
awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
Convert bytes to MB with two-decimal precision, right-aligned to ten characters. The one awk trick worth knowing for size formatting.
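The awk stage can be sanity-checked in isolation by feeding it one hand-written TSV row (the 1 MiB size, date, and key below are made-up test values, not real bucket contents):

```shell
# Fake TSV row: 1048576 bytes (exactly 1 MiB), a date, a key.
printf '1048576\t2026-01-01\tbackup.bin\n' \
  | awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
# prints "      1.00 MB 2026-01-01 backup.bin" — number right-aligned to 10 chars
```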
Expected output (sample)
   4218.42 MB 2026-04-23 exports/full-backup-2026-04-23.tar.gz
   2841.07 MB 2026-05-09 measurements/raw/run-2026-05-09T14.parquet
   1936.55 MB 2026-05-08 measurements/raw/run-2026-05-08T22.parquet
    902.31 MB 2026-05-01 exports/monthly-2026-04.tar.gz
Caveats & tips
- For buckets with more than ~5,000 objects, use S3 Inventory — list-objects-v2 issues one request per 1,000 keys, which gets slow on huge buckets.
- Pipe the Key column through `claude -p "Suggest a lifecycle policy that would have prevented these from accumulating"` for a quick cleanup plan.
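You can dry-run the jq and awk stages without touching AWS by substituting a hand-written payload for the aws call (the two keys and sizes below are invented; any list-objects-v2-shaped JSON works):

```shell
# Fake list-objects-v2 response with two objects; big.tar is larger, so the
# sort_by(-.Size) stage should emit it first.
json='{"Contents":[
  {"Key":"small.txt","Size":1048576,"LastModified":"2026-01-01T00:00:00.000Z"},
  {"Key":"big.tar","Size":3145728,"LastModified":"2026-02-02T00:00:00.000Z"}]}'
echo "$json" \
  | jq -r '.Contents | sort_by(-.Size) | .[0:20] | .[] |
           [.Size, .LastModified[0:10], .Key] | @tsv' \
  | awk -F'\t' '{ printf "%10.2f MB %s %s\n", $1/1024/1024, $2, $3 }'
# big.tar (3.00 MB) prints before small.txt (1.00 MB)
```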