Apache Drill / DRILL-2262

Selecting columns of certain data types from a dictionary-encoded Parquet file created by Drill fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Storage - Parquet
    • Labels:
      None

      Description

      After creating a Parquet file containing all data types using CTAS from Drill, certain columns cannot be read by a select in Drill.

      The affected data types, as they appear in the Parquet schema, are:
      DECIMAL18_col: OPTIONAL INT64 O:DECIMAL R:0 D:1
      TIME_col: OPTIONAL INT32 O:TIME R:0 D:1
      TIMESTAMP_col: OPTIONAL INT64 O:TIMESTAMP R:0 D:1

      A select on any of these columns fails with a similar cast error:

      0: jdbc:drill:> select DECIMAL18_col from parquet_all_default limit 1;
      Query failed: RemoteRpcException: Failure while running fragment., org.apache.drill.exec.vector.NullableDecimal18Vector cannot be cast to org.apache.drill.exec.vector.NullableBigIntVector [ 26094858-356a-4128-ba2a-aa1473f74c93 on 10.10.30.167:31010 ]
      [ 26094858-356a-4128-ba2a-aa1473f74c93 on 10.10.30.167:31010 ]
      
      
      Error: exception while executing query: Failure while executing query. (state=,code=0)
      
      0: jdbc:drill:> select TIME_col from parquet_all_default limit 1;
      Query failed: RemoteRpcException: Failure while running fragment., org.apache.drill.exec.vector.NullableTimeVector cannot be cast to org.apache.drill.exec.vector.NullableIntVector [ 8938245b-a3a8-4bf2-8b6a-7e3860c90e8e on 10.10.30.167:31010 ]
      [ 8938245b-a3a8-4bf2-8b6a-7e3860c90e8e on 10.10.30.167:31010 ]
      
      
      Error: exception while executing query: Failure while executing query. (state=,code=0)
      
      0: jdbc:drill:> select TIMESTAMP_col from parquet_all_default limit 1;
      Query failed: RemoteRpcException: Failure while running fragment., org.apache.drill.exec.vector.NullableTimeStampVector cannot be cast to org.apache.drill.exec.vector.NullableBigIntVector [ 5dff5b38-ece7-4159-acfe-78dacb563e20 on 10.10.30.167:31010 ]
      [ 5dff5b38-ece7-4159-acfe-78dacb563e20 on 10.10.30.167:31010 ]
      
      
      Error: exception while executing query: Failure while executing query. (state=,code=0)
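
      The three errors above share one shape: a vector allocated for the logical type (DECIMAL18, TIME, TIMESTAMP) is cast to the vector class of the underlying physical type (INT64 or INT32). The following is a minimal Java sketch of why such a cast fails when the two classes are not in a parent-child relationship. The class names are hypothetical stand-ins, not Drill's actual vector classes, and the reader code that performs the cast is not shown in this report.

```java
// Hypothetical stand-ins for value vector classes; these are NOT Drill's real classes.
public class VectorCastSketch {
    /** Common parent, analogous to a generic value vector. */
    static class ValueVector {}

    /** Stand-in for a vector of plain 64-bit integers (the physical type). */
    static class BigIntVector extends ValueVector {}

    /** Stand-in for a vector of 18-digit decimals (the logical type). */
    static class Decimal18Vector extends ValueVector {}

    /** Attempts the downcast that a reader keyed on the physical type would perform. */
    static boolean castsToBigInt(ValueVector v) {
        try {
            BigIntVector unused = (BigIntVector) v; // throws if v is not a BigIntVector
            return true;
        } catch (ClassCastException e) {
            // Same kind of failure as "... cannot be cast to ..." in the errors above.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(castsToBigInt(new BigIntVector()));
        System.out.println(castsToBigInt(new Decimal18Vector()));
    }
}
```

      In this sketch both classes store a 64-bit value, yet the cast still fails because Java checks the class hierarchy, not the storage layout.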
      

      Note: Dictionary encoding is turned on by default as of commit https://github.com/apache/drill/commit/00c08eff2a5de6e4334fb57e952aac9a852c3d37

      After setting session options to turn off dictionary encoding, the same queries succeed.

        Attachments

        1. 0_0_0.parquet
          946 kB
          Ramana Inukonda Nagaraj
        2. DRILL-2262.1.patch.txt
          12 kB
          Abdel Hakim Deneche
        3. DRILL-2262.2.patch.txt
          29 kB
          Abdel Hakim Deneche
        4. DRILL-2262.3.patch.txt
          30 kB
          Abdel Hakim Deneche

              People

              • Assignee:
                sphillips Steven Phillips
                Reporter:
                inramana Ramana Inukonda Nagaraj
              • Votes:
                0
                Watchers:
                4
