Hive / HIVE-7050

Display table level column stats in DESCRIBE FORMATTED TABLE

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Statistics
    • Labels:
      None

      Description

      There is currently no way to display column-level statistics from the Hive CLI. It would be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE.

      1. HIVE-7050.1.patch
        47 kB
        Prasanth Jayachandran
      2. HIVE-7050.2.patch
        48 kB
        Prasanth Jayachandran
      3. HIVE-7050.3.patch
        52 kB
        Prasanth Jayachandran
      4. HIVE-7050.4.patch
        52 kB
        Prasanth Jayachandran
      5. HIVE-7050.5.patch
        52 kB
        Prasanth Jayachandran
      6. HIVE-7050.6.patch
        60 kB
        Prasanth Jayachandran

          Activity

          Thejas M Nair added a comment -

          This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.

          Lefty Leverenz added a comment -

          Besides the doc changes already mentioned, I added the FOR COLUMNS option to ANALYZE in the Statistics doc for HIVE-1362:

          • Statistics in Hive – Existing Tables
          • DDL – Display Column Statistics

          Prasanth Jayachandran added a comment -

          Done. Updated release notes and title of both JIRAs to say only FORMATTED.

          Lefty Leverenz added a comment -

          Thanks, Prasanth Jayachandran, especially for HIVE-7051 which had slipped through my net.

          Your new section is good, but needs version information – I'll add that, plus a link to the ANALYZE syntax, and a bit of editorial tinkering. Then you can verify my changes.

          Would you please change the release note (which currently says "Please document the new functionality") and add one on HIVE-7051 too?

          Q: Does this only work for FORMATTED, not EXTENDED (although it's in both jira titles)?

          Prasanth Jayachandran added a comment -

          Lefty Leverenz, I added a "Display column statistics" section here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe
          It should cover the features added in this JIRA as well as HIVE-7051. Can you take a look to see if it's good?

          Ashutosh Chauhan added a comment -

          Ok..cool

          Prasanth Jayachandran added a comment -

          No, it is not supported yet. HIVE-7051 was created to support it.

          Ashutosh Chauhan added a comment -

          Prasanth Jayachandran, does this also support displaying column stats for a particular partition of a table? The test case doesn't cover it, so I'm not sure. I was hoping the following syntax would work, but it seems it is not supported yet.

          describe formatted T partition (k1=v1) c1;
          
          Xuefu Zhang added a comment -

          Patch committed to trunk. Thanks to Prasanth J for the contribution.

          Xuefu Zhang added a comment -

          +1

          Prasanth Jayachandran added a comment -

          This patch fixes the relevant test failures. Another change: when column stats are not available, an empty string is displayed instead of "null", in line with how comments are handled (empty instead of null).

          Xuefu Zhang sorry for getting back late on this. Can you please take a look again at this patch?

          Prasanth Jayachandran added a comment -

          Xuefu Zhang, only describe_syntax and describe_table seem to be related. The other test failures are unrelated and are tracked elsewhere. I quickly ran describe_syntax and describe_table and found that the diffs show additional spaces. I will analyze further to see whether those spaces are valid or were added as a side effect. I will post a new patch with the fix later today.

          Xuefu Zhang added a comment -

          Prasanth Jayachandran Could you give an analysis of the above test failures?

          Xuefu Zhang added a comment -

          Sorry, but the above comment was intended for Prasanth Jayachandran.

          Xuefu Zhang added a comment -

          Prashanth Jonnalagadda Could you take a look at the test failures above to see if they are related to your patch?

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12645388/HIVE-7050.5.patch

          ERROR: -1 due to 19 failed/errored test(s), 5451 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_syntax
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_table
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_java_method
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reflect
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_math_funcs
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
          org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
          org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHadoopVersion
          org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
          org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getPigVersion
          org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getStatus
          org.apache.hive.hcatalog.templeton.TestWebHCatE2e.invalidPath
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/226/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/226/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 19 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12645388

          Xuefu Zhang added a comment -

          +1, pending on test result.

          Prasanth Jayachandran added a comment -

          Addressed Xuefu's review comments

          Hive QA added a comment -

          Overall: -1 no tests executed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12644978/HIVE-7050.4.patch

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/201/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/201/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12644978

          Prasanth Jayachandran added a comment -

          Addressed Xuefu's comments in RB.

          Prasanth Jayachandran added a comment -

          Addressed Xuefu Zhang's review comments and left a reply in RB. RB is flaky right now; I will update the patch in RB later.

          Xuefu Zhang added a comment -

          Patch looks good. I left a couple of minor comments on rb.

          Prasanth Jayachandran added a comment -

          Column stats are shown only when a column is specified and only when FORMATTED is specified. They are NOT shown for EXTENDED, because the extended output does not show the column names at the top, which makes the column stats output difficult to comprehend.
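
          A minimal sketch of the behavior described above, assuming a hypothetical table `t` with a column `c1` (neither name comes from the patch). Column statistics must already have been gathered, for example with ANALYZE ... FOR COLUMNS, before DESCRIBE FORMATTED has anything to display:

          ```sql
          -- Hypothetical table and column names, for illustration only.
          -- Gather column-level statistics for c1.
          ANALYZE TABLE t COMPUTE STATISTICS FOR COLUMNS c1;

          -- Column stats are displayed: FORMATTED plus an explicit column.
          DESCRIBE FORMATTED t c1;

          -- No column stats displayed: no column is specified.
          DESCRIBE FORMATTED t;

          -- No column stats displayed: EXTENDED rather than FORMATTED.
          DESCRIBE EXTENDED t c1;
          ```

          If no statistics have been computed for the column, the stats fields would be expected to come back empty rather than "null", per the later patch comment in this thread.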

          Prasanth Jayachandran added a comment -

          Addressed Xuefu Zhang's review comments.

          Xuefu Zhang added a comment -

          Thanks for the patch. Minor comments/questions on RB.

          One clarification: are column stats shown when either EXTENDED or FORMATTED is specified? And only when a column is specified? I think this is important for documentation purposes. It would be good if the functional details could be put in the description area.

          Prasanth Jayachandran added a comment -

          attaching RB link


            People

            • Assignee: Prasanth Jayachandran
            • Reporter: Prasanth Jayachandran
            • Votes: 0
            • Watchers: 4
