Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Fix Version/s: 1.12.0
Description
When store.hive.optimize_scan_with_native_readers is enabled, HiveDrillNativeScanBatchCreator is used to read data from Hive tables directly from the file system. When the table is empty or no row groups match, an empty HiveDefaultReader is created to output the schema.
In this situation, Drill currently fails with the following error:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException Setup failed for null
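For reference, the failure can be reproduced along these lines, assuming a Hive-backed Parquet table with a partition column that is either empty or has all of its row groups pruned (the name hive.empty_table is hypothetical):

ALTER SESSION SET `store.hive.optimize_scan_with_native_readers` = true;
SELECT * FROM hive.empty_table;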
This happens because, instead of passing only the table columns to the empty reader (as we do when creating a non-empty reader), we passed all columns, which may include partition columns as well. The reader then fails to find the partition column in the table schema. As noted on lines 81-82 in HiveDrillNativeScanBatchCreator, we deliberately separate partition columns from table columns so that partition columns can be passed separately:
// Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
// ParquetRecordReader. Partition columns are passed to ScanBatch.
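For context, the separation could look roughly like the sketch below; the class, method, and variable names (ColumnSplitSketch, splitColumns, partitionNames) are illustrative, not the actual Drill code:

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.drill.common.expression.SchemaPath;

class ColumnSplitSketch {
  // Splits the projected columns: table columns go to the ParquetRecordReader,
  // partition column names go to ScanBatch, which fills in their values.
  static List<SchemaPath> splitColumns(List<SchemaPath> columns, Set<String> partitionNames,
      List<String> selectedPartitionColumns) {
    List<SchemaPath> newColumns = new ArrayList<>();
    for (SchemaPath column : columns) {
      String name = column.getRootSegment().getPath();
      if (partitionNames.contains(name)) {
        selectedPartitionColumns.add(name); // resolved later from partition metadata
      } else {
        newColumns.add(column);             // actually present in the Parquet file
      }
    }
    return newColumns;
  }
}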
To fix the problem, we need to pass the table columns instead of all columns:
if (readers.size() == 0) {
  readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
      ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
}
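Passing newColumns (the table columns only) means the empty reader resolves the projected columns against the schema it actually has, while partition columns continue to be handled by ScanBatch, consistent with the non-empty reader path.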