#018 · diagnostics · duckdb · beginner

DuckDB: peek at any CSV's schema

Faster than opening it in Excel. `DESCRIBE` reports the column names and types that DuckDB's CSV reader infers from a sample of rows (20,480 by default).

Setup

→ brew install duckdb

Cost per run

free

The one-liner

$ duckdb -c "DESCRIBE FROM 'data.csv'"

What each stage does

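[01] duckdb · duckdb -c "DESCRIBE FROM 'data.csv'"
DuckDB's DESCRIBE on a file path returns one row per column: column_name, column_type, null (YES/NO nullability, not a null percentage), key, default, and extra (the sample output shows only the first four). Works directly on .csv, .json, and .parquet; .xlsx needs an extension (for example `st_read` from the spatial extension).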
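
Since the same statement works across file formats, a small wrapper can describe several files in one go. This is only a sketch: the function name `describe_files` and the file names passed to it are placeholders, and it assumes the `duckdb` CLI is on your PATH and that file names contain no single quotes.

```shell
# Hypothetical helper: run DESCRIBE on each file that exists.
describe_files() {
  for f in "$@"; do
    # Skip missing files instead of letting DuckDB raise an error.
    [ -f "$f" ] || { echo "skip: $f (not found)"; continue; }
    echo "== $f"
    duckdb -c "DESCRIBE FROM '$f'"
  done
}

# Placeholder file names; substitute your own.
describe_files data.csv data.json data.parquet
```

Missing files are reported and skipped, so the loop keeps going through the rest of the list.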

Expected output (sample)

┌──────────────┬─────────────┬──────┬──────┐
│ column_name  │ column_type │ null │ key  │
├──────────────┼─────────────┼──────┼──────┤
│ order_id     │ BIGINT      │ YES  │ NULL │
│ customer_id  │ VARCHAR     │ YES  │ NULL │
│ amount_cents │ BIGINT      │ YES  │ NULL │
│ ordered_at   │ TIMESTAMP   │ YES  │ NULL │
└──────────────┴─────────────┴──────┴──────┘
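
To try the one-liner without a real export, one option is to write a tiny CSV first and describe that. The file name `sample_orders.csv` and its rows are made up for illustration, with column names borrowed from the sample output; the exact types and columns printed may vary with your DuckDB version.

```shell
# Write a small, made-up CSV to experiment with (hypothetical data).
cat > sample_orders.csv <<'EOF'
order_id,customer_id,amount_cents,ordered_at
1001,C-17,2599,2024-01-05 09:30:00
1002,C-42,1200,2024-01-05 10:02:11
1003,C-17,899,2024-01-06 14:45:00
EOF

# Run DESCRIBE only if the duckdb CLI is available.
if command -v duckdb >/dev/null 2>&1; then
  duckdb -c "DESCRIBE FROM 'sample_orders.csv'"
else
  echo "duckdb not found; install it first (see Setup)" >&2
fi
```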

Caveats & tips

Type inference looks only at a sample of rows (20,480 by default), so edge cases further down the file can get the wrong type. Pass `read_csv('data.csv', sample_size=-1)` to scan the whole file.
For JSON: `DESCRIBE FROM read_json_auto('data.json')`.
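
The full-scan tip can be sketched as commands, alongside a second option: pinning a column's type with `read_csv`'s `types` parameter rather than relying on inference. The file name `data.csv` comes from the one-liner; the `customer_id` column is borrowed from the sample output and is only an assumption about your file.

```shell
# Minimal sketch, assuming data.csv exists and has a customer_id column.

# 1) Scan every row before settling on types (slower on large files).
FULL_SCAN="DESCRIBE FROM read_csv('data.csv', sample_size=-1)"

# 2) Pin one column's type and let DuckDB infer the rest.
PINNED="DESCRIBE FROM read_csv('data.csv', types={'customer_id': 'VARCHAR'})"

# Run both only when the CLI and the file are actually present.
if command -v duckdb >/dev/null 2>&1 && [ -f data.csv ]; then
  duckdb -c "$FULL_SCAN"
  duckdb -c "$PINNED"
fi
```

Pinning is usually the cheaper choice when you already know which column misbehaves, such as an ID with leading zeros that would otherwise be read as a number.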
