Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-23873

Querying Hive JDBCStorageHandler table fails with NPE when CBO is off

    XMLWordPrintableJSON

Details

    Description

      Scenario is Hive table having same schema as table in Oracle, however when we query the table with data it fails with NPE, below is the trace.

      Caused by: java.io.IOException: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              ... 34 more
      Caused by: java.lang.NullPointerException
              at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
              ... 34 more
      

      Problem appears when column names in Oracle are in Upper case and since in Hive, table and column names are forced to store in lowercase during creation. User runs into NPE error while fetching data.

      While deserializing data, input consists of column names in lower case which fails to get the value

      https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136

      rowVal = ((ObjectWritable)value).get();
      

      Log Snio:
      =============

      2020-07-17T16:49:09,598 INFO  [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) - Query to execute is [select * from TESTHIVEJDBCSTORAGE]
      2020-07-17T16:49:10,642 INFO  [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = ID
      2020-07-17T16:49:10,642 INFO  [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class java.lang.Integer,value=1]}
      

      Simple Reproducer for this case.
      =============
      1. Create table in Oracle

      create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20));
      

      2. Insert dummy data.

      Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1');
      

      3. Create JDBCStorageHandler table in Hive.

      CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME VARCHAR(20)) 
      STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' 
      TBLPROPERTIES ( 
      "hive.sql.database.type" = "ORACLE", 
      "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver", 
      "hive.sql.jdbc.url" = "jdbc:oracle:thin:@orachehostname/XE", 
      "hive.sql.dbcp.username" = "chiran", 
      "hive.sql.dbcp.password" = "supersecurepassword", 
      "hive.sql.table" = "TESTHIVEJDBCSTORAGE", 
      "hive.sql.dbcp.maxActive" = "1" 
      );
      

      4. Query Hive table, fails with NPE.

      > select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL;
      INFO  : Compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc): select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testhivejdbcstorage_hive_tbl.id, type:int, comment:null), FieldSchema(name:testhivejdbcstorage_hive_tbl.fname, type:varchar(20), comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc); Time taken: 9.914 seconds
      INFO  : Executing command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc): select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL
      INFO  : Completed executing command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc); Time taken: 0.019 seconds
      INFO  : OK
      Error: java.io.IOException: java.lang.NullPointerException (state=,code=0)
      

      Assuming that there are no repercussions, can we convert the column names to lowercase fetched from Database/Query pointing to table in JDBCStorageHandler?
      Attaching the patch for the case.

      Attachments

        1. HIVE-23873.4.patch
          6 kB
          Chiran Ravani
        2. HIVE-23873.3.patch
          3 kB
          Chiran Ravani
        3. HIVE-23873.02.patch
          2 kB
          Chiran Ravani
        4. HIVE-23873.01.patch
          0.8 kB
          Chiran Ravani

        Activity

          People

            chiran54321 Chiran Ravani
            chiran54321 Chiran Ravani
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 20m
                1h 20m