Apache Drill / DRILL-6204

Pass table columns without partition columns to empty Hive reader




      When store.hive.optimize_scan_with_native_readers is enabled, HiveDrillNativeScanBatchCreator is used to read data from Hive tables directly from the file system. When the table is empty or no row groups are matched, an empty HiveDefaultReader is created to output the schema.

      In this situation, Drill currently fails with the following error:

      org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException Setup failed for null 

      This happens because instead of passing only the table columns to the empty reader (as is done when creating a non-empty reader), all columns are passed, and these may include partition columns as well. The reader then fails to find the partition columns in the table schema. As noted on lines 81 - 82 of HiveDrillNativeScanBatchCreator, partition columns and table columns are deliberately separated so that partition columns can be passed to ScanBatch separately:

            // Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
            // ParquetRecordReader. Partition columns are passed to ScanBatch.

      To fix the problem we need to pass table columns instead of all columns.

          if (readers.size() == 0) {
            readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
              ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
          }
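For illustration, the separation the quoted comment describes can be sketched as below. The class, method, and column names here are hypothetical stand-ins, not Drill's actual implementation; the point is that only the non-partition (table) columns should reach the reader, while partition columns go to ScanBatch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnSplitSketch {

  // Split the projected columns into table columns (the set that should be
  // handed to the Parquet reader, including the empty HiveDefaultReader) and
  // partition columns (handed to ScanBatch). Hypothetical sketch only.
  public static List<List<String>> split(List<String> projected, List<String> partitionCols) {
    List<String> tableColumns = new ArrayList<>();
    List<String> partitionColumns = new ArrayList<>();
    for (String col : projected) {
      if (partitionCols.contains(col)) {
        partitionColumns.add(col);
      } else {
        tableColumns.add(col);
      }
    }
    return Arrays.asList(tableColumns, partitionColumns);
  }

  public static void main(String[] args) {
    // "part_date" stands in for a Hive partition column; "name"/"age" for table columns.
    List<List<String>> result =
        split(Arrays.asList("name", "part_date", "age"), Arrays.asList("part_date"));
    System.out.println("table columns: " + result.get(0));
    System.out.println("partition columns: " + result.get(1));
  }
}
```

Passing `result.get(0)` (the table columns) to the empty reader corresponds to the `newColumns` argument in the patch above; passing the full projected list is what triggered the NullPointerException.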




            • Assignee: Arina Ielchiieva
            • Reporter: Parth Chandra

