Hive / HIVE-7441

Custom partition scheme gets rewritten with hive scheme upon concatenate

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0, 0.11.0, 0.12.0
    • Fix Version/s: 0.14.0
    • Component/s: CLI
    • Labels:
      None
    • Environment:

      CDH4.5 and CDH5.0

      Description

      Take the following data directories. Each contains a single data file in RCFile format whose only content is the character "1".

      /j1/part1
      /j1/part2
      

      Create the table over the directories using the following command:

      create table j1 (a int) partitioned by (b string) stored as rcfile location '/j1' ;
      

      Add the directories to the table (j1 in this example) as partitions using the following commands:

      alter table j1 add partition (b = 'part1') location '/j1/part1';
      alter table j1 add partition (b = 'part2') location '/j1/part2';
      

      I then run the following command on the first partition:

      alter table j1 partition (b = 'part1') concatenate;
      

      Hive changes the partition location on HDFS from

      /j1/part1
      

      to

      /j1/b=part1
      

      However, Hive does not update the partition location in the metastore, so the partition's data is lost to the table. This is hard to detect until you query the data and notice that rows are missing, because "show partitions" still lists the partition.
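
The mismatch described above can be sketched with a toy model. This is only an illustration under stated assumptions, not Hive code: the `metastore` dict, the `hdfs_dirs` set, and every function name below are hypothetical stand-ins, and the model assumes the data no longer lives under the old directory after the concatenate.

```python
import posixpath

def default_partition_path(table_location, spec):
    """Hive-style default layout: <table_location>/<key>=<value>/..."""
    parts = [f"{key}={value}" for key, value in spec]
    return posixpath.join(table_location, *parts)

# Hypothetical stand-ins for the metastore and for the directories on HDFS.
metastore = {("b", "part1"): "/j1/part1", ("b", "part2"): "/j1/part2"}
hdfs_dirs = {"/j1/part1", "/j1/part2"}

def buggy_concatenate(table_location, key, value):
    """Mimic the reported behavior: the data moves, the metastore entry does not."""
    old_location = metastore[(key, value)]
    new_location = default_partition_path(table_location, [(key, value)])
    hdfs_dirs.discard(old_location)
    hdfs_dirs.add(new_location)
    # The metastore entry is deliberately left untouched here.

def dangling_partitions():
    """Partitions whose recorded location no longer exists on the file system."""
    return sorted(spec for spec, loc in metastore.items() if loc not in hdfs_dirs)

buggy_concatenate("/j1", "b", "part1")
print(dangling_partitions())  # partitions still listed but pointing at a missing directory
```

In this model the partition stays registered, which matches "show partitions" still listing it, while its recorded location points at a directory that is gone.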

      1. HIVE-7441.patch (10 kB, Chaoyu Tang)
      2. HIVE-7441.1.patch (10 kB, Chaoyu Tang)

        Activity

        Chaoyu Tang added a comment -

        This is a defect in Hive. I think the concatenated partition file should stay under the original partition location if they are on the same file system.
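
One possible reading of this proposal, sketched as a small helper. It is not the code from the attached patch: the function names are hypothetical, comparing the partition location against the table location is an assumption about what "same file system" refers to, and falling back to the default location otherwise is also an assumption.

```python
from urllib.parse import urlparse

def same_file_system(location_a, location_b):
    """Treat two locations as the same file system when scheme and authority match.

    Note: a bare path such as /j1 and a fully qualified URI such as
    hdfs://nn1:8020/j1 compare as different here; a real check would
    qualify both against the default file system first.
    """
    a, b = urlparse(location_a), urlparse(location_b)
    return (a.scheme, a.netloc) == (b.scheme, b.netloc)

def concatenate_target(table_location, partition_location, default_location):
    """Pick where the concatenated file should be written (hypothetical helper)."""
    if same_file_system(table_location, partition_location):
        # Keep the custom partition directory instead of rewriting it
        # to the <key>=<value> scheme.
        return partition_location
    return default_location

print(concatenate_target("/j1", "/j1/part1", "/j1/b=part1"))
```

With the paths from the description, this returns the original `/j1/part1`, so the location recorded in the metastore would stay valid.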

        Chaoyu Tang added a comment -

        Please review the attached patch at https://reviews.apache.org/r/24284/. Thanks.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12659786/HIVE-7441.patch

        ERROR: -1 due to 4 failed/errored test(s), 5850 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_3
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
        org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
        org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/176/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/176/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-176/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 4 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12659786

        Szehon Ho added a comment -

        This looks good to me, great find, but the newly-added test fails. Can you take a look?

        Also left a very trivial comment on the review board. Thanks.

        Chaoyu Tang added a comment -

        Thanks, Szehon. I made the changes and uploaded the new patch to the review board.
        I also fixed the tests; I think the failures came from the order of the selected rows.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12659967/HIVE-7441.1.patch

        ERROR: -1 due to 5 failed/errored test(s), 5878 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_mixed_case
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
        org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/190/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/190/console
        Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-190/

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 5 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12659967

        Szehon Ho added a comment -

        +1

        Szehon Ho added a comment -

        Committed to trunk. Thanks Chaoyu for the contribution!

        Chaoyu Tang added a comment -

        Szehon Ho, thanks for reviewing the patch and committing it. Appreciate it.

        Thejas M Nair added a comment -

        This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.


          People

          • Assignee: Chaoyu Tang
          • Reporter: Johndee Burks
          • Votes: 0
          • Watchers: 4
