Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Fix Version/s: 1.12.0
Description
When store.hive.optimize_scan_with_native_readers is enabled, HiveDrillNativeScanBatchCreator is used to read data from Hive tables directly from the file system. When the table is empty or no row groups match, an empty HiveDefaultReader is created to output the schema.
In this situation, Drill currently fails with the following error:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException Setup failed for null
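For reference, the failure can be reproduced along these lines, assuming a Hive-backed Parquet table with a partition column that is either empty or has all of its row groups pruned (the name hive.empty_table is hypothetical):

ALTER SESSION SET `store.hive.optimize_scan_with_native_readers` = true;
SELECT * FROM hive.empty_table;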
This happens because, instead of passing only the table columns to the empty reader (as we do when creating a non-empty reader), we passed all columns, which may include partition columns as well. The reader then fails to find the partition column in the table schema. As noted on lines 81-82 in HiveDrillNativeScanBatchCreator, we deliberately separate partition columns from table columns so that partition columns can be passed separately:
// Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
// ParquetRecordReader. Partition columns are passed to ScanBatch.
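For context, the separation could look roughly like the sketch below; the class, method, and variable names (ColumnSplitSketch, splitColumns, partitionNames) are illustrative, not the actual Drill code:

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.drill.common.expression.SchemaPath;

class ColumnSplitSketch {
  // Splits the projected columns: table columns go to the ParquetRecordReader,
  // partition column names go to ScanBatch, which fills in their values.
  static List<SchemaPath> splitColumns(List<SchemaPath> columns, Set<String> partitionNames,
      List<String> selectedPartitionColumns) {
    List<SchemaPath> newColumns = new ArrayList<>();
    for (SchemaPath column : columns) {
      String name = column.getRootSegment().getPath();
      if (partitionNames.contains(name)) {
        selectedPartitionColumns.add(name); // resolved later from partition metadata
      } else {
        newColumns.add(column);             // actually present in the Parquet file
      }
    }
    return newColumns;
  }
}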
To fix the problem, we need to pass the table columns instead of all columns:
if (readers.size() == 0) {
  readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
      ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
}
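Passing newColumns (the table columns only) means the empty reader resolves the projected columns against the schema it actually has, while partition columns continue to be handled by ScanBatch, consistent with the non-empty reader path.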