HIVE-7248

UNION ALL in hive returns incorrect results on Hbase backed table

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.12.0, 0.13.0, 0.13.1
    • Fix Version/s: None
    • Component/s: HBase Handler
    • Labels:
      None

      Description

      The issue can be reproduced with the following steps.

      1) In the HBase shell
      create 'TABLE_EMP','default'

      2) In Hive
      sudo -u hive hive

      CREATE EXTERNAL TABLE TABLE_EMP(
        FIRST_NAME string,
        LAST_NAME string,
        CDS_UPDATED_DATE string,
        CDS_PK string)
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES(
        "hbase.columns.mapping" = "default:FIRST_NAME,default:LAST_NAME,default:CDS_UPDATED_DATE,:key",
        "hbase.scan.cache" = "500",
        "hbase.scan.cacheblocks" = "false")
      TBLPROPERTIES(
        "hbase.table.name" = "TABLE_EMP",
        'serialization.null.format' = '');

      3) In the HBase shell, insert the following data

      put 'TABLE_EMP', '1', 'default:FIRST_NAME', 'Srini'
      put 'TABLE_EMP', '1', 'default:LAST_NAME', 'P'
      put 'TABLE_EMP', '1', 'default:CDS_UPDATED_DATE', '2014-06-16 00:00:00'

      put 'TABLE_EMP', '2', 'default:FIRST_NAME', 'Aravind'
      put 'TABLE_EMP', '2', 'default:LAST_NAME', 'K'
      put 'TABLE_EMP', '2', 'default:CDS_UPDATED_DATE', '2014-06-16 00:00:00'

      4) In Hive, execute the following query
      hive
      SELECT *
      FROM (
        SELECT CDS_PK
        FROM TABLE_EMP
        WHERE CDS_PK >= '0'
          AND CDS_PK <= '9'
          AND CDS_UPDATED_DATE IS NOT NULL
        UNION ALL
        SELECT CDS_PK
        FROM TABLE_EMP
        WHERE CDS_PK >= 'a'
          AND CDS_PK <= 'z'
          AND CDS_UPDATED_DATE IS NOT NULL
      ) t;

      5) Output of the query

      1
      1
      2
      2

      6) Output of the first subquery alone

      SELECT CDS_PK
      FROM TABLE_EMP
      WHERE
      CDS_PK >= '0'
      AND CDS_PK <= '9'
      AND CDS_UPDATED_DATE IS NOT NULL

      is

      1
      2

      7) Output of the second subquery alone

      SELECT CDS_PK
      FROM TABLE_EMP
      WHERE
      CDS_PK >= 'a'
      AND CDS_PK <= 'z'
      AND CDS_UPDATED_DATE IS NOT NULL

      is empty

      8) UNION is used to combine the results of multiple SELECT statements into a single result set. Hive currently supports only UNION ALL (bag union), in which duplicates are not eliminated. Since the second subquery returns no rows, the union should contain exactly the rows of the first subquery.

      Accordingly, the above query should return
      1
      2

      Instead, it returns the wrong output
      1
      1
      2
      2
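
      The expected result can be cross-checked against any engine with standard UNION ALL semantics. The sketch below uses Python's built-in sqlite3 module purely as a stand-in reference engine; it is not Hive, there is no HBase storage handler involved, and it does not reproduce the bug. It only shows what the query from step 4 should return on the two rows from step 3.

```python
import sqlite3

# SQLite is an assumption here: a reference engine for UNION ALL semantics,
# not Hive and not an HBase-backed table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TABLE_EMP ("
    "FIRST_NAME TEXT, LAST_NAME TEXT, CDS_UPDATED_DATE TEXT, CDS_PK TEXT)"
)
# Same two rows as the HBase puts in step 3 (row key mapped to CDS_PK).
conn.executemany(
    "INSERT INTO TABLE_EMP VALUES (?, ?, ?, ?)",
    [
        ("Srini", "P", "2014-06-16 00:00:00", "1"),
        ("Aravind", "K", "2014-06-16 00:00:00", "2"),
    ],
)

UNION_ALL_QUERY = """
SELECT * FROM (
  SELECT CDS_PK FROM TABLE_EMP
  WHERE CDS_PK >= '0' AND CDS_PK <= '9' AND CDS_UPDATED_DATE IS NOT NULL
  UNION ALL
  SELECT CDS_PK FROM TABLE_EMP
  WHERE CDS_PK >= 'a' AND CDS_PK <= 'z' AND CDS_UPDATED_DATE IS NOT NULL
) t
"""

# The second branch matches nothing, so only the first branch contributes rows.
rows = sorted(r[0] for r in conn.execute(UNION_ALL_QUERY))
print(rows)
```

      With the second branch empty, a correct bag union yields each row key once, which is the output this report expects from Hive.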

      1. HIVE-7248.3.patch.txt
        19 kB
        Navis
      2. HIVE-7248.2.patch.txt
        16 kB
        Navis
      3. HIVE-7248.1.patch.txt
        15 kB
        Navis

        Activity

        Mala Chikka Kempanna created issue -
        Mala Chikka Kempanna added a comment -

        The current workaround is to use DISTINCT in one of the subqueries.
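
        A sketch of the workaround's query shape, again run on SQLite as a stand-in rather than on Hive. The comment does not say which subquery receives DISTINCT, so placing it in the first one is an assumption. Because CDS_PK is mapped to the HBase row key, its values are unique, so DISTINCT should not change the expected result; the example only checks that, and does not show that the workaround avoids the Hive bug.

```python
import sqlite3

# Stand-in engine (assumption): SQLite, not Hive. Only the two columns the
# query touches are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_EMP (CDS_UPDATED_DATE TEXT, CDS_PK TEXT)")
conn.executemany(
    "INSERT INTO TABLE_EMP VALUES (?, ?)",
    [("2014-06-16 00:00:00", "1"), ("2014-06-16 00:00:00", "2")],
)

# DISTINCT is added to the first subquery only (assumed placement).
WORKAROUND_QUERY = """
SELECT * FROM (
  SELECT DISTINCT CDS_PK FROM TABLE_EMP
  WHERE CDS_PK >= '0' AND CDS_PK <= '9' AND CDS_UPDATED_DATE IS NOT NULL
  UNION ALL
  SELECT CDS_PK FROM TABLE_EMP
  WHERE CDS_PK >= 'a' AND CDS_PK <= 'z' AND CDS_UPDATED_DATE IS NOT NULL
) t
"""

workaround_rows = sorted(r[0] for r in conn.execute(WORKAROUND_QUERY))
print(workaround_rows)
```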

        Swarnim Kulkarni made changes -
        Field | Original Value | New Value
        Component/s | | HBase Handler [ 12313461 ]
        Navis made changes -
        Attachment | | HIVE-7248.1.patch.txt [ 12655505 ]
        Navis made changes -
        Status | Open [ 1 ] | Patch Available [ 10002 ]
        Assignee | | Navis [ navis ]
        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12655505/HIVE-7248.1.patch.txt

        ERROR: -1 due to 2 failed/errored test(s), 5730 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/776/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/776/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-776/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 2 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12655505

        Navis added a comment -

        Updated the result file. The ineffective filterExpr in the TS (table scan) should be removed.

        Navis made changes -
        Attachment | | HIVE-7248.2.patch.txt [ 12655656 ]
        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12655656/HIVE-7248.2.patch.txt

        ERROR: -1 due to 4 failed/errored test(s), 5732 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_temp_table
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
        org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
        org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/788/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/788/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-788/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 4 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12655656

        Navis added a comment -

        Rebased to trunk.

        Navis made changes -
        Attachment | | HIVE-7248.3.patch.txt [ 12687391 ]
        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12687391/HIVE-7248.3.patch.txt

        ERROR: -1 due to 3 failed/errored test(s), 6705 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partition_diff_num_cols
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2093/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2093/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2093/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 3 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12687391 - PreCommit-HIVE-TRUNK-Build

        Transition | Time In Source Status | Execution Times | Last Executer | Last Execution Date
        Open → Patch Available | 26d 8h 48m | 1 | Navis | 14/Jul/14 08:48

          People

          • Assignee: Navis
          • Reporter: Mala Chikka Kempanna
          • Votes: 0
          • Watchers: 3
