Partitioned Data Debugging

Testing Hive partitioning with DuckDB WASM.

DuckDB Status

Initializing DuckDB WASM...

Test Queries

Test 1: Direct File Access

Read one Parquet file directly by its full URL. This works and serves as the baseline for the other tests.

Query SELECT * FROM read_parquet('https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet') LIMIT 3;

Test 2: Manual Partition Columns

Manually add the partition values as literal columns when reading a single file

Query SELECT 'WRF-NARR_HIS' as model_name, 'R10C29' as grid_name, * FROM read_parquet('https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet') LIMIT 2;
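
The literal values in Test 2 mirror the key=value folder names in the file URL. As a minimal sketch (Python, standard library only), the same values could be derived from the path instead of typed by hand. The helper name `hive_partitions` is hypothetical and is not part of DuckDB.

```python
from urllib.parse import unquote, urlsplit


def hive_partitions(url: str) -> dict[str, str]:
    """Collect key=value folder names from a URL path (hypothetical helper)."""
    partitions: dict[str, str] = {}
    # Skip the last path segment: it is the file name, not a partition folder.
    for segment in urlsplit(url).path.split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = unquote(value)
    return partitions


url = (
    "https://storage.googleapis.com/climate_ts/partitioned/"
    "model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet"
)
print(hive_partitions(url))
```

For the URL above this yields model_name and grid_name, which are the two columns Test 2 adds manually.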

Test 3: Array of URLs

Test whether DuckDB can read from an array of explicit URLs

Query SELECT * FROM read_parquet(['https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet']) LIMIT 3;
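
Test 3 passes a one-element array. A natural follow-up is an array that spans several partitions, with `hive_partitioning = true` passed to `read_parquet` so DuckDB is asked to derive model_name and grid_name from the paths. The sketch below (Python, standard library only) only builds the query text. The helper names are hypothetical, the file name `data_0.parquet` is assumed for every partition, and whether the WASM build resolves the partition columns over HTTPS is exactly what this page is testing.

```python
BASE_URL = "https://storage.googleapis.com/climate_ts/partitioned"


def partition_url(model_name: str, grid_name: str, file_name: str = "data_0.parquet") -> str:
    """Build the URL of one file in the key=value layout (hypothetical helper)."""
    return f"{BASE_URL}/model_name={model_name}/grid_name={grid_name}/{file_name}"


def sql_string(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def read_parquet_query(urls: list[str], limit: int = 3) -> str:
    """Return a query that reads all URLs as one table with partition columns."""
    if not urls:
        raise ValueError("at least one URL is required")
    url_list = ", ".join(sql_string(u) for u in urls)
    return (
        f"SELECT * FROM read_parquet([{url_list}], hive_partitioning = true) "
        f"LIMIT {int(limit)};"
    )


urls = [partition_url("WRF-NARR_HIS", grid) for grid in ("R10C29", "R10C30")]
print(read_parquet_query(urls))
```

The printed statement can be pasted into the query box as an additional test.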

Test 4: Multiple Files with UNION

Manually combine files from different partitions

Query (SELECT 'WRF-NARR_HIS' as model_name, 'R10C29' as grid_name, * FROM read_parquet('https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet') LIMIT 2) UNION ALL (SELECT 'WRF-NARR_HIS' as model_name, 'R10C30' as grid_name, * FROM read_parquet('https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C30/data_0.parquet') LIMIT 2);

Test 5: Different Climate Model

Test accessing a different climate model partition

Query SELECT * FROM read_parquet('https://storage.googleapis.com/climate_ts/partitioned/model_name=access1.3_RCP85_PREC_6km/grid_name=R10C29/data_0.parquet') LIMIT 3;

Debug Info

Expected Partition Structure:
climate_ts/partitioned/model_name=WRF-NARR_HIS/grid_name=R10C29/data_0.parquet
climate_ts/partitioned/model_name=access1.3_RCP85_PREC_6km/grid_name=R10C29/data_0.parquet

Key Points:
• Uses Hive partitioning (key=value in folder names)
• DuckDB should auto-detect partition columns
• Filter pushdown should work on partition columns
• Partition columns: model_name, grid_name
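
One way to check the filter pushdown expectation is to filter on a partition column and inspect the plan with EXPLAIN. The sketch below (Python, standard library only) builds such a statement. The helper names are hypothetical, the URLs are the two known files from the tests above, and no claim is made here about what the plan will show in DuckDB WASM.

```python
PARTITION_COLUMNS = ("model_name", "grid_name")


def sql_string(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def partition_filter_query(urls: list[str], filters: dict[str, str], explain: bool = False) -> str:
    """Build a query that filters on partition columns (hypothetical helper)."""
    if not urls or not filters:
        raise ValueError("urls and filters must both be non-empty")
    for column in filters:
        # Only the known partition columns are accepted as filter keys.
        if column not in PARTITION_COLUMNS:
            raise ValueError(f"not a partition column: {column}")
    url_list = ", ".join(sql_string(u) for u in urls)
    where = " AND ".join(f"{col} = {sql_string(val)}" for col, val in filters.items())
    query = (
        f"SELECT * FROM read_parquet([{url_list}], hive_partitioning = true) "
        f"WHERE {where};"
    )
    return "EXPLAIN " + query if explain else query


base = "https://storage.googleapis.com/climate_ts/partitioned/model_name=WRF-NARR_HIS"
urls = [f"{base}/grid_name={grid}/data_0.parquet" for grid in ("R10C29", "R10C30")]
print(partition_filter_query(urls, {"grid_name": "R10C29"}, explain=True))
```

If pushdown applies, the filter on grid_name should let DuckDB skip the R10C30 file. The EXPLAIN output is where to confirm or refute that.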