Hive
  1. Hive
  2. HIVE-4858

Sort "show grant" result to improve usability and testability

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0, 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: CLI
    • Labels:
      None

      Description

      Currently Hive outputs the result of "show grant" command in no deterministic order. It outputs the set of each privilege type in the order of whatever returned from DB (DataNucleus). Randomness can arise and tests (depending on the order) can fail, especially in events of library upgrade (DN or JVM upgrade). Sorting the result will avoid the potential randomness and make the output more deterministic, thus not only improving the readability of the output but also making the test more robust.

      1. HIVE-4858.patch
        10 kB
        Xuefu Zhang
      2. HIVE-4858.1.patch
        10 kB
        Xuefu Zhang

        Activity

        Hide
        Xuefu Zhang added a comment -
        Show
        Xuefu Zhang added a comment - Review board: https://reviews.apache.org/r/12562/
        Hide
        Xuefu Zhang added a comment -

        Patch updated based on rb comments.

        Show
        Xuefu Zhang added a comment - Patch updated based on rb comments.
        Hide
        Hive QA added a comment -

        Overall: +1 all checks pass

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12592841/HIVE-4858.1.patch

        SUCCESS: +1 all tests passed

        Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/72/testReport
        Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/72/console

        Messages:
        Executing org.apache.hive.ptest.execution.CleanupPhase
        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase

        This message is automatically generated.

        Show
        Hive QA added a comment - Overall : +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12592841/HIVE-4858.1.patch SUCCESS: +1 all tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/72/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/72/console Messages: Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase This message is automatically generated.
        Hide
        Brock Noland added a comment -

        +1

        Show
        Brock Noland added a comment - +1
        Hide
        Edward Capriolo added a comment -

        Do not have time for the review but +1 for the idea.

        Show
        Edward Capriolo added a comment - Do not have time for the review but +1 for the idea.
        Hide
        Ashutosh Chauhan added a comment -

        HIVE-3775 introduced a directive – SORT_BEFORE_DIFF specifically for this purpose which we can add in .q files. Advantage of that is we need not to modify source specifically to take care of test issues. Xuefu Zhang Wondering if you have looked at that? Looks like that should be sufficient here as well.

        Show
        Ashutosh Chauhan added a comment - HIVE-3775 introduced a directive – SORT_BEFORE_DIFF specifically for this purpose which we can add in .q files. Advantage of that is we need not to modify source specifically to take care of test issues. Xuefu Zhang Wondering if you have looked at that? Looks like that should be sufficient here as well.
        Hide
        Xuefu Zhang added a comment -

        Ashutosh Chauhan I didn't know about HIVE-3775, but it seems a good way to solve random order problem for testing purpose. However, do you think also it's a good idea to have a deterministic ordering for SQL outputs? For instance, it might seems more usable with an ordered output if user does a "show tables", etc.

        Having a deterministic ordering in sql output is another motivation of this JIRA, though it only takes care of "show grants". Hive is a little inconsistent in this. I think "show databases" and "show tables" diffs in this behavior.

        Show
        Xuefu Zhang added a comment - Ashutosh Chauhan I didn't know about HIVE-3775 , but it seems a good way to solve random order problem for testing purpose. However, do you think also it's a good idea to have a deterministic ordering for SQL outputs? For instance, it might seems more usable with an ordered output if user does a "show tables", etc. Having a deterministic ordering in sql output is another motivation of this JIRA, though it only takes care of "show grants". Hive is a little inconsistent in this. I think "show databases" and "show tables" diffs in this behavior.
        Hide
        Edward Capriolo added a comment -

        Ashutosh Chauhan That is interesting. I think we are better off fixing the problem at the source. Most things are easier to understand if they are sorted, unless there is a performance hit, or there is some other benefit to the a non sorted order, for example if you want to know which grant was created first.

        Do you get what I am saying? Like if you had to manually audit 1000 grants and they were not sorted the first thing you would probably do is drop them into a spread sheet and sort them.

        Some things should not be sorted, like hives column definitions are in the order of the columns in the text file.

        Show
        Edward Capriolo added a comment - Ashutosh Chauhan That is interesting. I think we are better off fixing the problem at the source. Most things are easier to understand if they are sorted, unless there is a performance hit, or there is some other benefit to the a non sorted order, for example if you want to know which grant was created first. Do you get what I am saying? Like if you had to manually audit 1000 grants and they were not sorted the first thing you would probably do is drop them into a spread sheet and sort them. Some things should not be sorted, like hives column definitions are in the order of the columns in the text file.
        Hide
        Brock Noland added a comment -

        Interesting, I had not seen SORT_BEFORE_DIFF as well. I don't have a strong opinion either way but I'd prefer deterministic output from hive itself unless it's expensive.

        Show
        Brock Noland added a comment - Interesting, I had not seen SORT_BEFORE_DIFF as well. I don't have a strong opinion either way but I'd prefer deterministic output from hive itself unless it's expensive.
        Hide
        Ashutosh Chauhan added a comment -

        In general I don't think its good idea to change software to appease test infra, though we do that quite a few places in Hive. In this particular instance argument of sorted output for ddl operations is weak in my opinion (no other query engine I know of promises this), but I agree there is no harm in that. Since, source changes are pretty straight forward, I am ok to let this slip in.

        Show
        Ashutosh Chauhan added a comment - In general I don't think its good idea to change software to appease test infra, though we do that quite a few places in Hive. In this particular instance argument of sorted output for ddl operations is weak in my opinion (no other query engine I know of promises this), but I agree there is no harm in that. Since, source changes are pretty straight forward, I am ok to let this slip in.
        Hide
        Edward Capriolo added a comment -

        Well the root of the problem is that our test infrastructure uses "diff" so we are always monkeying around deterministic output . The diff approach has it's merits but it is far from sexy.

        Show
        Edward Capriolo added a comment - Well the root of the problem is that our test infrastructure uses "diff" so we are always monkeying around deterministic output . The diff approach has it's merits but it is far from sexy.
        Hide
        Brock Noland added a comment -

        Since the discussion on this appears to have wrapped up, I committed it to trunk. Thank you for your contribution Xuefu!

        Show
        Brock Noland added a comment - Since the discussion on this appears to have wrapped up, I committed it to trunk. Thank you for your contribution Xuefu!
        Hide
        Ashutosh Chauhan added a comment -

        This issue has been fixed and released as part of 0.12 release. If you find further issues, please create a new jira and link it to this one.

        Show
        Ashutosh Chauhan added a comment - This issue has been fixed and released as part of 0.12 release. If you find further issues, please create a new jira and link it to this one.

          People

          • Assignee:
            Xuefu Zhang
            Reporter:
            Xuefu Zhang
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development