Apache Drill: DRILL-6204

Pass table columns without partition columns to empty Hive reader

    Description

When store.hive.optimize_scan_with_native_readers is enabled, HiveDrillNativeScanBatchCreator is used to read data from Hive tables directly from the file system. When the table is empty or no row groups are matched, an empty HiveDefaultReader is created so that the schema can still be output.

When this situation occurs, Drill currently fails with the following error:

      org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException Setup failed for null 
      

This happens because instead of passing only the table columns to the empty reader (as we do when creating a non-empty reader), we pass all columns, which may include partition columns as well. The reader then fails to find the partition columns in the table schema. As noted in the comment on lines 81 - 82 of HiveDrillNativeScanBatchCreator, partition columns and table columns are deliberately separated so that partition columns can be passed separately:

            // Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
            // ParquetRecordReader. Partition columns are passed to ScanBatch.
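
For context, here is a minimal, illustrative sketch of that separation (the class, method, and partitionNames parameter are assumptions for illustration, not the exact Drill code; only newColumns corresponds to the fix below):

      import java.util.List;

      import org.apache.drill.common.expression.SchemaPath;

      // Illustrative sketch, not the exact Drill code: split the requested
      // columns into table columns (passed to the ParquetRecordReader) and
      // partition columns (passed to ScanBatch). partitionNames would come
      // from the Hive table's partition keys.
      class ColumnSplitSketch {
        static void split(List<SchemaPath> columns, List<String> partitionNames,
            List<SchemaPath> newColumns, List<SchemaPath> partitionColumns) {
          for (SchemaPath column : columns) {
            if (partitionNames.contains(column.getRootSegment().getPath())) {
              partitionColumns.add(column); // resolved by ScanBatch, not read from files
            } else {
              newColumns.add(column);       // real table columns only
            }
          }
        }
      }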
      

To fix the problem, we need to pass the table columns (newColumns) instead of all columns when creating the empty reader:

      // If no readers were created (empty table or no matching row groups),
      // create the empty reader with the table columns only; partition
      // columns are handled by ScanBatch.
      if (readers.size() == 0) {
        readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
          ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
      }
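
With this change the empty-reader path mirrors the non-empty path: only the table columns (newColumns) reach HiveDefaultReader, while partition columns are left to ScanBatch, so the reader no longer looks up a partition column that is absent from the table schema.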
      

People

  Arina Ielchiieva
  Parth Chandra
