  1. Hive
  2. HIVE-21900

Join operations across different data sources (Druid and JDBC StorageHandler)


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • None
    • None

    Description

      We have a Druid datasource and have created an external table in Hive for it.

      For example: 

       

      CREATE EXTERNAL TABLE druid_table_1
      STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
      TBLPROPERTIES ("druid.datasource" = "wikipedia");
      

       

       

      We have another table in a MySQL database, which is also exposed in Hive as an external table, created as follows:

       

      CREATE EXTERNAL TABLE sample_table_1
      (
      old_id int,
      city_name string,
      new_id int
      )
      STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
      TBLPROPERTIES (
      "hive.sql.database.type" = "MYSQL",
      "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
      "hive.sql.jdbc.url" = "jdbc:mysql://172.16.0.15:3307/test",
      "hive.sql.dbcp.username" = "hive_user",
      "hive.sql.dbcp.password" = "hive_pass",
      "hive.sql.table" = "city_mapping"
      );
      

      Queries against each table individually work as expected, but when we join the two tables like this:

       

       

      SELECT *
      FROM druid_table_1 o
      JOIN sample_table_1 c
      ON (c.city_name = o.channel) limit 10;
      

      we get the following error:

       

       

      TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1560945328057_0022_2_01_000000_1:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
      at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
      at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
      at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
      at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
      at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
      at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
      at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
      at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
      at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
      at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
      at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
      at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
      at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
      at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
      ... 16 more
      Caused by: java.io.IOException: java.lang.NullPointerException
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
      at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
      at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
      at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
      at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
      at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
      ... 18 more
      Caused by: java.lang.NullPointerException
      at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.nextKeyValue(DruidSelectQueryRecordReader.java:62)
      at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:85)
      at org.apache.hadoop.hive.druid.serde.DruidSelectQueryRecordReader.next(DruidSelectQueryRecordReader.java:38)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
      ... 24 more
      ],

      We are running:

      Hive v3.1.1
      Tez v0.9.2
      Druid v0.14.2
      Hadoop v2.8.5
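
      The innermost cause in the trace is in DruidSelectQueryRecordReader, so the failing scan appears to be on the Druid side of the join. A possible way to narrow this down is sketched here; it has not been verified on this cluster. The temporary table name druid_channels_tmp is hypothetical, and the column channel is taken from the join condition in the failing query.

```sql
-- Inspect the plan of the failing join. For Druid-backed tables the
-- TableScan entry typically lists the Druid query that Hive generated.
EXPLAIN
SELECT *
FROM druid_table_1 o
JOIN sample_table_1 c
ON (c.city_name = o.channel)
LIMIT 10;

-- Hypothetical isolation step: copy the Druid join column into a
-- temporary Hive table so the join no longer reads from Druid directly.
CREATE TEMPORARY TABLE druid_channels_tmp AS
SELECT DISTINCT channel
FROM druid_table_1;

-- Same join condition, now against the materialized copy.
SELECT *
FROM druid_channels_tmp o
JOIN sample_table_1 c
ON (c.city_name = o.channel)
LIMIT 10;
```

      If the second join succeeds while the original one still fails, that would suggest the problem lies in how the Druid scan is set up inside the join rather than in the data or in the JDBC table.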

            People

              Assignee: Unassigned
              Reporter: Subramani Raju V (subramaniraju)
              Votes: 0
              Watchers: 1
