Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 4.0.0
Description
When we create a table using CSV on an existing file with a header, and:
- a column has a default value, and
- enforceSchema is false, so the CSV header is taken into account,
then querying a column with a default fails.
The example below shows the issue:
CREATE TABLE IF NOT EXISTS products (
  product_id INT,
  name STRING,
  price FLOAT default 0.0,
  quantity INT default 0
)
USING CSV
OPTIONS (
  header 'true',
  inferSchema 'false',
  enforceSchema 'false',
  path '/Users/maximgekk/tmp/products.csv'
);
The CSV file products.csv:
product_id,name,price,quantity
1,Apple,0.50,100
2,Banana,0.25,200
3,Orange,0.75,50
The query fails:
spark-sql (default)> SELECT price FROM products;
24/01/28 11:43:09 ERROR Executor: Exception in task 0.0 in stage 8.0 (TID 6)
java.lang.IllegalArgumentException: Number of column in CSV header is not equal to number of fields in the schema:
 Header length: 4, schema size: 1
CSV file: file:///Users/maximgekk/tmp/products.csv
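One possible reading of the numbers in the error message, offered as an assumption rather than a confirmed diagnosis: the file header has 4 names, while "schema size: 1" matches the single column requested by SELECT price. The Python sketch below models only that length comparison. It is not Spark's code, and check_header, read_header, FULL_SCHEMA and PRUNED_SCHEMA are hypothetical names introduced for illustration.

```python
import csv
import io

# Contents of products.csv from the report.
CSV_TEXT = """product_id,name,price,quantity
1,Apple,0.50,100
2,Banana,0.25,200
3,Orange,0.75,50
"""

# Hypothetical: the table schema, and the schema reduced to the selected column.
FULL_SCHEMA = ["product_id", "name", "price", "quantity"]
PRUNED_SCHEMA = ["price"]


def read_header(text):
    """Return the first CSV row as a list of column names."""
    return next(csv.reader(io.StringIO(text)))


def check_header(header, schema_fields):
    """Illustrative stand-in for a header check (assumption, not Spark's logic).

    Raises ValueError when the header and the schema differ in length.
    """
    if len(header) != len(schema_fields):
        raise ValueError(
            "Number of column in CSV header is not equal to number of fields "
            f"in the schema: Header length: {len(header)}, "
            f"schema size: {len(schema_fields)}"
        )


header = read_header(CSV_TEXT)

# Against the full table schema the lengths agree, so nothing is raised.
check_header(header, FULL_SCHEMA)

# Against a schema holding only the selected column the check fails.
try:
    check_header(header, PRUNED_SCHEMA)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

In this model the same header passes against the four-field schema and fails against the one-field schema, which would be consistent with the reported message if the check were run against a schema narrowed to the queried column.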
Issue Links
- is caused by SPARK-39143 Support CSV file scans with DEFAULT values (Resolved)
- links to