Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 0.14.0
Description
I was testing some queries with the 0.14 release and noticed that the projected schema for a table scan is completely wrong, although the results of the query are not necessarily wrong.
// schema for nyctaxi csv files
let schema = Schema::new(vec![
    Field::new("VendorID", DataType::Utf8, true),
    Field::new("tpep_pickup_datetime", DataType::Utf8, true),
    Field::new("tpep_dropoff_datetime", DataType::Utf8, true),
    Field::new("passenger_count", DataType::Utf8, true),
    Field::new("trip_distance", DataType::Float64, true),
    Field::new("RatecodeID", DataType::Utf8, true),
    Field::new("store_and_fwd_flag", DataType::Utf8, true),
    Field::new("PULocationID", DataType::Utf8, true),
    Field::new("DOLocationID", DataType::Utf8, true),
    Field::new("payment_type", DataType::Utf8, true),
    Field::new("fare_amount", DataType::Float64, true),
    Field::new("extra", DataType::Float64, true),
    Field::new("mta_tax", DataType::Float64, true),
    Field::new("tip_amount", DataType::Float64, true),
    Field::new("tolls_amount", DataType::Float64, true),
    Field::new("improvement_surcharge", DataType::Float64, true),
    Field::new("total_amount", DataType::Float64, true),
]);

let mut ctx = ExecutionContext::new();
ctx.register_csv("tripdata", "file.csv", &schema, true);

let optimized_plan = ctx.create_logical_plan(
    "SELECT passenger_count, MIN(fare_amount), MAX(fare_amount) \
     FROM tripdata GROUP BY passenger_count").unwrap();
The projected schema in the table scan has the first two columns from the schema (VendorID and tpep_pickup_datetime) rather than passenger_count and fare_amount.
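To make the expected behavior concrete, here is a minimal, standard-library-only sketch of how a projected schema could be derived from the columns a query references. It is an illustration, not the actual DataFusion code: `projection_indices` and `projected_schema` are hypothetical helpers, and field names stand in for full `Field` values.

```rust
/// Hypothetical helper: maps each referenced column name to its index in the
/// full table schema. Duplicates are dropped and the result is sorted.
/// Returns None if a referenced name is not present in the schema.
fn projection_indices(schema: &[&str], referenced: &[&str]) -> Option<Vec<usize>> {
    let mut indices: Vec<usize> = Vec::with_capacity(referenced.len());
    for name in referenced {
        // Look the column up by name rather than by its position in the query.
        let idx = schema.iter().position(|field| field == name)?;
        if !indices.contains(&idx) {
            indices.push(idx);
        }
    }
    indices.sort_unstable();
    Some(indices)
}

/// Hypothetical helper: builds the projected schema by selecting the fields
/// at the given indices of the full schema.
fn projected_schema<'a>(schema: &[&'a str], indices: &[usize]) -> Vec<&'a str> {
    indices.iter().map(|&i| schema[i]).collect()
}
```

With the schema above, passenger_count and fare_amount resolve to indices 3 and 10, so the projected schema should contain exactly those two fields. Selecting the fields at indices 0 and 1 instead would yield VendorID and tpep_pickup_datetime, which matches the incorrect schema described here.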