Details
- Task
- Status: Open
- Major
- Resolution: Unresolved
- 1.12.0
- None
Description
Since PARQUET-1851, Parquet no longer writes files that contain a schema (meta information) but 0 rows, i.e. empty files.
As a result, Drill can no longer store empty tables as Parquet files, for example:
CREATE TABLE dfs.tmp.%s AS SELECT * FROM cp.`employee.json` WHERE 1=0
CREATE TABLE dfs.tmp.%s AS select * from dfs.`parquet/alltypes_required.parquet` where `col_int` = 0
create table dfs.tmp.%s as select * from dfs.`parquet/empty/complex/empty_complex.parquet`
So PARQUET-1851 breaks the following test cases:
- TestUntypedNull.testParquetTableCreation
- TestParquetWriterEmptyFiles.testComplexEmptyFileSchema
- TestParquetWriterEmptyFiles.testWriteEmptyFile
- TestParquetWriterEmptyFiles.testWriteEmptyFileWithSchema
- TestParquetWriterEmptyFiles.testWriteEmptySchemaChange
- TestMetastoreCommands.testAnalyzeEmptyRequiredParquetTable
- TestMetastoreCommands.testSelectEmptyRequiredParquetTable
I suggest logging a warning when an empty Parquet file is created, or adding an alternative endBlock for backward compatibility with other tools:
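A minimal sketch of the two options, to make the proposal concrete. This is not the actual ParquetFileWriter code: the class SketchFileWriter, the method endBlockAllowEmpty, and the exception type are hypothetical names chosen for illustration, and the writer only tracks row counts instead of writing a real file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for a Parquet-style file writer (NOT the real ParquetFileWriter API).
class SketchFileWriter {
    private static final Logger LOG = Logger.getLogger(SketchFileWriter.class.getName());

    // Row counts of the blocks (row groups) recorded so far.
    private final List<Long> blockRowCounts = new ArrayList<>();
    private long currentRecordCount;
    private int warnings;

    // Begin a block that is declared to hold recordCount rows.
    void startBlock(long recordCount) {
        currentRecordCount = recordCount;
    }

    // Strict variant: refuses to close a block with zero rows,
    // so a schema-only (empty) file can never be produced.
    void endBlock() {
        if (currentRecordCount == 0) {
            throw new IllegalStateException("End block with zero records");
        }
        blockRowCounts.add(currentRecordCount);
    }

    // Lenient variant (assumed name): logs a warning but still records the
    // empty block, so tools that rely on empty files keep working.
    void endBlockAllowEmpty() {
        if (currentRecordCount == 0) {
            warnings++;
            LOG.warning("Closing a block with 0 rows; the resulting file will be empty");
        }
        blockRowCounts.add(currentRecordCount);
    }

    List<Long> blockRowCounts() {
        return blockRowCounts;
    }

    int warnings() {
        return warnings;
    }
}
```

With this shape, a caller such as Drill's writer could opt in to the lenient method when it needs to materialize an empty table, while other callers keep the strict behaviour.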
Attachments
Issue Links
- is broken by
  - PARQUET-1851 ParquetMetadataConveter throws NPE in an Iceberg unit test (Resolved)
- Is contained by
  - DRILL-7825 Error: SYSTEM ERROR: RuntimeException: Unknown logical type <LogicalType UUID:UUIDType()> (Resolved)
  - DRILL-7907 Replace ParquetFileWriter in Drill with parquet version (Open)
- is required by
  - DRILL-7825 Error: SYSTEM ERROR: RuntimeException: Unknown logical type <LogicalType UUID:UUIDType()> (Resolved)