DuckDB: peek at any CSV's schema
Faster than opening it in Excel. `DESCRIBE` reports the types that DuckDB's CSV sniffer infers from a sample of the file (the first 20,480 rows by default).
Setup
- `brew install duckdb`
Cost per run
free
The one-liner
$ duckdb -c "DESCRIBE FROM 'data.csv'"
What each stage does
- [01] duckdb
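duckdb -c "DESCRIBE FROM 'data.csv'"
DuckDB's DESCRIBE on a file path returns one row per column, with fields including column_name, column_type, null (whether the column is nullable, not a null percentage), and key. Works on .csv, .json, .parquet, and .xlsx (with the spatial extension).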
Expected output (sample)
┌──────────────┬─────────────┬──────┬──────┐
│ column_name  │ column_type │ null │ key  │
├──────────────┼─────────────┼──────┼──────┤
│ order_id     │ BIGINT      │ NO   │ NULL │
│ customer_id  │ VARCHAR     │ NO   │ NULL │
│ amount_cents │ INTEGER     │ NO   │ NULL │
│ ordered_at   │ TIMESTAMP   │ NO   │ NULL │
└──────────────┴─────────────┴──────┴──────┘
Caveats & tips
- Type inference only looks at a sample (the first 20,480 rows by default), so edge cases later in the file can lead to the wrong type. Use `DESCRIBE FROM read_csv('data.csv', sample_size=-1)` to scan the whole file.
- For JSON: `DESCRIBE FROM read_json_auto('data.json')`.
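To see the sampling caveat in action, here is a minimal sketch that is not part of the original recipe. It assumes a POSIX shell with awk; the file name `late_edge_case.csv` and the 30,000-row size are made up for illustration. It writes a CSV whose only non-numeric value sits on the last row, well past the default sample, then runs DESCRIBE with and without a full scan if `duckdb` is on the PATH.

```shell
#!/bin/sh
# Hypothetical file name and row count, chosen only for this illustration.
csv=late_edge_case.csv
rows=30000   # comfortably larger than the sniffer's default sample

# Write a header, numeric rows 1..rows-1, and a final row with a text value.
awk -v rows="$rows" 'BEGIN {
  print "id,amount"
  for (i = 1; i < rows; i++) print i "," (i * 10)
  print rows ",n/a"   # the edge case, on the very last row
}' > "$csv"

if command -v duckdb >/dev/null 2>&1; then
  # Default sampling: the last row is likely outside the sample.
  duckdb -c "DESCRIBE FROM '$csv'"
  # Full scan: every row is considered during type detection.
  duckdb -c "DESCRIBE FROM read_csv('$csv', sample_size=-1)"
else
  echo "duckdb not found on PATH; skipping DESCRIBE" >&2
fi
```

With default sampling the amount column may be reported as an integer type. With `sample_size=-1` it should come back as VARCHAR, since `n/a` cannot be parsed as a number.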