Details

      Description

      This JIRA is to implement timestamp support in Parquet SerDe.

      1. HIVE-6394.2.patch
        28 kB
        Szehon Ho
      2. HIVE-6394.3.patch
        28 kB
        Szehon Ho
      3. HIVE-6394.4.patch
        28 kB
        Szehon Ho
      4. HIVE-6394.5.patch
        28 kB
        Szehon Ho
      5. HIVE-6394.6.patch
        29 kB
        Szehon Ho
      6. HIVE-6394.6.patch
        30 kB
        Szehon Ho
      7. HIVE-6394.7.patch
        29 kB
        Szehon Ho
      8. HIVE-6394.patch
        21 kB
        Szehon Ho

        Issue Links

          Activity

          sandeep chaturvedi added a comment -

          hey guys.. is it something I can take a look at?

          Szehon Ho added a comment -

          I'll take a look at this issue; the Parquet community has decided which data type to use.

          https://github.com/Parquet/parquet-mr/issues/218

          Szehon Ho added a comment -

          This is blocked by HIVE-6386, as the new Int96 data type and libraries are in a new version of Parquet.

          Szehon Ho added a comment -

          Typo, it is HIVE-6836.

          Szehon Ho added a comment -

          We upgraded parquet to get the new Int96 libraries, but there is a parquet exception when writing an actual Int96 type, with dictionary encoding on.

          Filed https://github.com/Parquet/parquet-mr/issues/350 which is being worked on. Will need to wait for the fix + new version of parquet before we can proceed.

          Szehon Ho added a comment -

          The fix has been merged into Parquet, but we are still waiting on a Parquet release that includes it. I manually built Parquet with the fix so I could do the implementation on the Hive side. Attaching as work-in-progress.

          Szehon Ho added a comment -

          Adding unit tests.

          Andrew Ash added a comment -

          Szehon Ho it looks like Parquet v1.5.0 includes the fix for that blocking bug https://github.com/Parquet/parquet-mr/issues/350

          How is the work-in-progress coming?

          Also my apologies for all the emails you probably got as I linked together the various issues across Jira and GitHub.

          Szehon Ho added a comment -

          Hi, thanks for notifying me. This change was working, but it will now probably need a rebase due to the parquet-decimal changes. I can take a look this week and submit the patch for review, but if it's not straightforward I will only get to it next week. Hope that is OK.

          Andrew Ash added a comment -

          It's not a huge rush for me, I just didn't want this to sit idle as I'm hoping to use Timestamps heavily in future versions of Hive. I highly appreciate all your work on this!

          Szehon Ho added a comment -

          Rebased back to working test.

          This working cut is good to go, but for now I am putting the timestamp<->parquet-byte conversion functions in the code. I couldn't find any equivalent in the Joda library. I'm going to try the Jodd library in the next cut.

          Szehon Ho added a comment -

          First patch for review. It uses the Jodd library.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12648057/HIVE-6394.4.patch

          ERROR: -1 due to 9 failed/errored test(s), 5514 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_parquet_timestamp
          org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/377/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/377/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-377/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 9 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12648057

          Brock Noland added a comment -

          Getting a Calendar can be expensive. Is it thread safe? If so can you cache it?

          Szehon Ho added a comment -

          I don't think so, as I am modifying the values with the given timestamp. I added a thread-local cache of calendar that is lazily-created.
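
          As an illustration of the thread-local, lazily created Calendar cache described in this comment, here is a minimal sketch. It is not the code from the patch: the class name `CalendarCacheSketch`, the method `forMillis`, and the choice of the GMT time zone are assumptions made for the example, and `ThreadLocal.withInitial` requires Java 8 (on older JVMs the same effect comes from overriding `initialValue()`).

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical helper, not taken from the HIVE-6394 patch.
final class CalendarCacheSketch {
    // Calendar is mutable and not thread safe, so each thread gets its own
    // instance. The supplier runs lazily, on the first get() from a thread.
    private static final ThreadLocal<Calendar> CACHED =
        ThreadLocal.withInitial(
            () -> Calendar.getInstance(TimeZone.getTimeZone("GMT")));

    // Returns this thread's Calendar, repositioned to the given instant.
    // The returned object is reused by later calls on the same thread.
    static Calendar forMillis(long epochMillis) {
        Calendar cal = CACHED.get();
        cal.setTimeInMillis(epochMillis);
        return cal;
    }

    private CalendarCacheSketch() {}
}
```

          Because the instance is reused, callers should read the fields they need before the next call to `forMillis` on the same thread and should not hand the Calendar to other threads.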

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12648439/HIVE-6394.5.patch

          ERROR: -1 due to 16 failed/errored test(s), 5589 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_predicate_pushdown
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_schema_evolution
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_parquet_timestamp
          org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
          org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/393/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/393/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-393/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 16 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12648439

          Szehon Ho added a comment -

          Attaching another patch. It was using a parquet-example class; I am now explicitly adding that logic in the serde layer.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12648767/HIVE-6394.6.patch

          ERROR: -1 due to 13 failed/errored test(s), 5589 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_parquet_timestamp
          org.apache.hadoop.hive.metastore.TestMetastoreVersion.testDefaults
          org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
          org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/404/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/404/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-404/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 13 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12648767

          Brock Noland added a comment -

          Szehon Ho I see parquet_timestamp failed.

          Szehon Ho added a comment -

          The test was asserting that Parquet does not support the timestamp type, so I am removing it.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12649609/HIVE-6394.6.patch

          ERROR: -1 due to 8 failed/errored test(s), 5612 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
          org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/431/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/431/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-431/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 8 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12649609

          Brock Noland added a comment -

          The test failures appear to be unrelated. LGTM +1

          Szehon Ho added a comment -

          Rebase after Xuefu's commit

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12650102/HIVE-6394.7.patch

          ERROR: -1 due to 6 failed/errored test(s), 5613 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/455/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/455/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-455/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 6 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12650102

          Szehon Ho added a comment -

          Brock Noland Hi Brock, these test failures don't look related. Can we commit this when you have the chance? Thanks

          Brock Noland added a comment -

          +1

          Brock Noland added a comment -

          Thank you for the contribution! I have committed this to trunk.

          Lefty Leverenz added a comment -

          Document this for 0.14.0 here: Language Manual – Parquet – Limitations

          Szehon Ho added a comment -

          Lefty Leverenz Do we just need to remove 'timestamp' from the following sentence?

          Binary, timestamp, date, char, varchar or decimal support are pending (HIVE-6384)
          
          Lefty Leverenz added a comment -

          Not quite, because 'timestamp' is still a limitation for releases prior to 0.14.

          I'll make a change and you can review it. (That'll be quicker than writing my suggestion here.)

          Lefty Leverenz added a comment -

          How's this? I added decimal too (HIVE-6367). Language Manual – Parquet – Limitations

          Szehon Ho added a comment -

          Ah got it, thanks. Looks good. Just one (unrelated) note: as HIVE-6375 is committed in 0.13, should we qualify the CTAS limitation?

          Lefty Leverenz added a comment -

          Yes, good catch. But apparently HIVE-6375 doesn't provide column rename support for Parquet – is there another JIRA ticket for that? (I'll edit the wiki and continue this discussion in HIVE-6375 comments.)

          Thejas M Nair added a comment -

          This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.

          Yang Yang added a comment -

          The Parquet spec on logical types, and on timestamps specifically, seems to say
          https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md
          "TIMESTAMP_MILLIS is used for a combined logical date and time type. It must annotate an int64 that stores the number of milliseconds from the Unix epoch, 00:00:00.000 on 1 January 1970, UTC."

          i.e. it says that the type is only precise to the millisecond and that it counts from 1970.

          But if you look at the hive-parquet code in
          https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L142
          https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTime.java#L54
          it seems that Hive's encoding of timestamps in Parquet follows a different spec, precise to the nanosecond and starting from "Monday, January 1, 4713" (defined in jodd.datetime.JDateTime).

          So is Hive's Parquet timestamp storage completely different from the above spec?

          Szehon Ho added a comment -

          Hi Yang, thanks for the observation. What you pointed to is a type called 'timestamp_millis', whereas this is about 'timestamp', which has to have nanosecond precision.

          The implementation followed the spec that came out of this Parquet discussion at the time, https://github.com/Parquet/parquet-mr/issues/218, in order to get compatibility between Hive and Impala Parquet timestamps.

          On the other hand, Parquet may soon come up with a proper Timestamp logical type. At that point the tools can switch their implementation to it, although for now this works if you are using Hive/Impala.
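
          To make the difference between the two encodings concrete, the following is a rough sketch of a nanosecond-precision, 12-byte (Int96) timestamp built from a Julian day number plus nanoseconds within the day. It is not the Hive `NanoTime` class. The class and method names are hypothetical, and the byte layout (8 bytes of nanoseconds-of-day followed by 4 bytes of Julian day, little-endian) is stated here as an assumption about the Hive/Impala convention. Time zone adjustment is left out entirely, and `Math.floorDiv`/`Math.floorMod` require Java 8.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration, not the Hive implementation.
final class Int96TimestampSketch {
    // Julian day number of 1970-01-01, the Unix epoch.
    static final int JULIAN_DAY_OF_EPOCH = 2440588;
    static final long SECONDS_PER_DAY = 86_400L;
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Packs a UTC instant into 12 bytes: nanos-of-day, then Julian day.
    static byte[] encode(long epochSeconds, int nanosOfSecond) {
        // floorDiv/floorMod keep pre-1970 instants on the correct day.
        long daysSinceEpoch = Math.floorDiv(epochSeconds, SECONDS_PER_DAY);
        long secondsOfDay = Math.floorMod(epochSeconds, SECONDS_PER_DAY);
        long nanosOfDay = secondsOfDay * NANOS_PER_SECOND + nanosOfSecond;
        return ByteBuffer.allocate(12)
            .order(ByteOrder.LITTLE_ENDIAN)
            .putLong(nanosOfDay)
            .putInt((int) (daysSinceEpoch + JULIAN_DAY_OF_EPOCH))
            .array();
    }

    // Unpacks the 12 bytes into {epochSeconds, nanosOfSecond}.
    static long[] decode(byte[] int96) {
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        long julianDay = buf.getInt();
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY
            + nanosOfDay / NANOS_PER_SECOND;
        return new long[] {epochSeconds, nanosOfDay % NANOS_PER_SECOND};
    }

    private Int96TimestampSketch() {}
}
```

          By contrast, a TIMESTAMP_MILLIS value as quoted above is a single int64 of milliseconds since the Unix epoch, so it cannot carry the last six digits of the nanosecond field that this layout round-trips.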


            People

            • Assignee:
              Szehon Ho
              Reporter:
              Jarek Jarcec Cecho
            • Votes:
              2
              Watchers:
              12