Hive
  1. Hive
  2. HIVE-2870

Throw an error when a nonexistent partition is accessed in strict mode

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None
    • Release Note:
      Throw an error when a nonexistent partition is accessed in strict mode

      Description

      When a table does not exist and someone tries to read from it in a query, Hive throws an error.

      But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output.

      This task is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode.

        Activity

        Hide
        Phabricator added a comment -

        lucian requested code review of "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".
        Reviewers: JIRA

        HIVE-2870: Throw an error when a nonexistent partition is accessed.

        When a table does not exist and someone tries to read from it
        in a query, Hive throws an error.

        But if a partition is directly accessed that does not exist, an error
        is not thrown. This is inconsistent and also leads to a lot of
        confused users who get no output.

        This patch is to cause Hive to throw an error when the partition pruner
        for a query eliminates ALL existing partitions for some table when
        running in strict mode.

        Task ID: #

        Blame Rev:

        When a table does not exist and someone tries to read from it in a query, Hive throws an error.

        But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output.

        This task is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode.

        TEST PLAN
        Revert Plan:

        Tags:

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
        ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java

        MANAGE HERALD DIFFERENTIAL RULES
        https://reviews.facebook.net/herald/view/differential/

        WHY DID I GET THIS EMAIL?
        https://reviews.facebook.net/herald/transcript/5145/

        Tip: use the X-Herald-Rules header to filter Herald messages in your client.

        Show
        Phabricator added a comment - lucian requested code review of " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Reviewers: JIRA HIVE-2870 : Throw an error when a nonexistent partition is accessed. When a table does not exist and someone tries to read from it in a query, Hive throws an error. But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output. This patch is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode. Task ID: # Blame Rev: When a table does not exist and someone tries to read from it in a query, Hive throws an error. But if a partition is directly accessed that does not exist, an error is not thrown. This is inconsistent and also leads to a lot of confused users who get no output. This task is to cause Hive to throw an error when the partition pruner for a query eliminates ALL existing partitions for some table when running in strict mode. TEST PLAN Revert Plan: Tags: REVISION DETAIL https://reviews.facebook.net/D2319 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/5145/ Tip: use the X-Herald-Rules header to filter Herald messages in your client.
        Hide
        Phabricator added a comment -

        akramer has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        ...is the clarification that this is a "strict mode" restriction automatically included in the error message?

        INLINE COMMENTS
        ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java:82 "partitions" should be plural

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - akramer has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". ...is the clarification that this is a "strict mode" restriction automatically included in the error message? INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java:82 "partitions" should be plural REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        lucian has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        > ...is the clarification that this is a "strict mode" restriction automatically included in the error message?

        No, but neither is it if for other strict-only errors:

        hive> select * from tmp_table_with_partitions;
        FAILED: Error in semantic analysis: No partition predicate found for Alias "tmp_table_with_partitions" Table "tmp_table_with_partitions"

        hive> select count from tmp_table_with_partitions where ds="asdf";
        FAILED: Error in semantic analysis: No valid partition left after pruning for Alias "tmp_table_with_partitions" Table "tmp_table_with_partitions"

        I can change it, but I wanted to be consistent with that other one I tested.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - lucian has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". > ...is the clarification that this is a "strict mode" restriction automatically included in the error message? No, but neither is it if for other strict-only errors: hive> select * from tmp_table_with_partitions; FAILED: Error in semantic analysis: No partition predicate found for Alias "tmp_table_with_partitions" Table "tmp_table_with_partitions" hive> select count from tmp_table_with_partitions where ds="asdf"; FAILED: Error in semantic analysis: No valid partition left after pruning for Alias "tmp_table_with_partitions" Table "tmp_table_with_partitions" I can change it, but I wanted to be consistent with that other one I tested. REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        lucian has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        I checked the error message list and I see other errors advertise the nonstrict mode in case the user really wants to run the operation. I'll change the message.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - lucian has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". I checked the error message list and I see other errors advertise the nonstrict mode in case the user really wants to run the operation. I'll change the message. REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        akramer has requested changes to the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        > I checked the error message list and I see other errors advertise the nonstrict mode in case the user really wants to run the operation. I'll change the message.

        Thanks. It is always better to do something right than wrong. Consistency only matter when there are multiple "right" things to do or no "right" thing to do.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        BRANCH
        HIVE-2870-dev-branch

        Show
        Phabricator added a comment - akramer has requested changes to the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". > I checked the error message list and I see other errors advertise the nonstrict mode in case the user really wants to run the operation. I'll change the message. Thanks. It is always better to do something right than wrong. Consistency only matter when there are multiple "right" things to do or no "right" thing to do. REVISION DETAIL https://reviews.facebook.net/D2319 BRANCH HIVE-2870 -dev-branch
        Hide
        Phabricator added a comment -

        lucian updated the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".
        Reviewers: JIRA, njain, kevinwilfong, akramer

        found that the error message was already present

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java

        Show
        Phabricator added a comment - lucian updated the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Reviewers: JIRA, njain, kevinwilfong, akramer found that the error message was already present REVISION DETAIL https://reviews.facebook.net/D2319 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
        Hide
        Phabricator added a comment -

        kevinwilfong has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        Could you add a test case for this, e.g. look at the files in ql/src/test/queries/clientnegative

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - kevinwilfong has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Could you add a test case for this, e.g. look at the files in ql/src/test/queries/clientnegative REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        kevinwilfong has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        By a test case, I meant add a query to that list to test that the query actually does fail if there are no partitions.

        Also, it might be a good idea to add a test case in client positive to make sure you can turn it off.

        See https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-AddaUnitTest

        INLINE COMMENTS
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:246-252 Sorry to be picky, but could you also add a small comment here just giving a quick explanation of what it does.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - kevinwilfong has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". By a test case, I meant add a query to that list to test that the query actually does fail if there are no partitions. Also, it might be a good idea to add a test case in client positive to make sure you can turn it off. See https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-AddaUnitTest INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:246-252 Sorry to be picky, but could you also add a small comment here just giving a quick explanation of what it does. REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        lucian updated the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".
        Reviewers: JIRA, njain, kevinwilfong, akramer

        added positive and negative test and a comment

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
        ql/src/test/queries/clientnegative/nopart_after_pruning.q
        ql/src/test/queries/clientpositive/nopart_after_pruning.q
        ql/src/test/results/clientnegative/nopart_after_pruning.q.out
        ql/src/test/results/clientpositive/nopart_after_pruning.q.out

        Show
        Phabricator added a comment - lucian updated the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Reviewers: JIRA, njain, kevinwilfong, akramer added positive and negative test and a comment REVISION DETAIL https://reviews.facebook.net/D2319 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/test/queries/clientnegative/nopart_after_pruning.q ql/src/test/queries/clientpositive/nopart_after_pruning.q ql/src/test/results/clientnegative/nopart_after_pruning.q.out ql/src/test/results/clientpositive/nopart_after_pruning.q.out
        Hide
        Phabricator added a comment -

        kevinwilfong has accepted the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        +1 Will commit if tests pass

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        BRANCH
        HIVE-2870-dev-branch

        Show
        Phabricator added a comment - kevinwilfong has accepted the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". +1 Will commit if tests pass REVISION DETAIL https://reviews.facebook.net/D2319 BRANCH HIVE-2870 -dev-branch
        Hide
        Phabricator added a comment -

        kevinwilfong has requested changes to the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        -1

        Sorry to have to go back after accepting, but I spoke with Namit Jain (another Hive committer) about this diff.

        This diff has the potential to break queries which are otherwise working, such as a union all where one of the partitions being unioned doesn't exist. Particularly, if that partition is only sometimes generated and the query is scripted.

        To avoid this, could you change the exception to a warning which is printed to the console. You could still use a variable which writes to the console in strict mode, and logs to info in nonstrict mode, although it would be better to create a new Hive configuration variable to do this, rather than reusing HIVEMAPREDMODE which is used far too much.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        BRANCH
        HIVE-2870-dev-branch

        Show
        Phabricator added a comment - kevinwilfong has requested changes to the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". -1 Sorry to have to go back after accepting, but I spoke with Namit Jain (another Hive committer) about this diff. This diff has the potential to break queries which are otherwise working, such as a union all where one of the partitions being unioned doesn't exist. Particularly, if that partition is only sometimes generated and the query is scripted. To avoid this, could you change the exception to a warning which is printed to the console. You could still use a variable which writes to the console in strict mode, and logs to info in nonstrict mode, although it would be better to create a new Hive configuration variable to do this, rather than reusing HIVEMAPREDMODE which is used far too much. REVISION DETAIL https://reviews.facebook.net/D2319 BRANCH HIVE-2870 -dev-branch
        Hide
        Phabricator added a comment -

        lucian updated the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".
        Reviewers: JIRA, njain, kevinwilfong, akramer

        found a suitable ConfVar

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
        ql/src/test/queries/clientnegative/nopart_after_pruning.q
        ql/src/test/queries/clientpositive/nopart_after_pruning.q
        ql/src/test/results/clientnegative/nopart_after_pruning.q.out
        ql/src/test/results/clientpositive/nopart_after_pruning.q.out

        Show
        Phabricator added a comment - lucian updated the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Reviewers: JIRA, njain, kevinwilfong, akramer found a suitable ConfVar REVISION DETAIL https://reviews.facebook.net/D2319 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/test/queries/clientnegative/nopart_after_pruning.q ql/src/test/queries/clientpositive/nopart_after_pruning.q ql/src/test/results/clientnegative/nopart_after_pruning.q.out ql/src/test/results/clientpositive/nopart_after_pruning.q.out
        Hide
        Phabricator added a comment -

        kevinwilfong has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        That conf var is currently being used to throw an error if a query is written to create dynamic partitions and it doesn't produce any partitions. That is a weird dependency to introduce, I think it would be better to create a new variable.

        Could you print an warning or at least log one if strict mode is set to false and a partition is empty. This would be useful because users could still see that they are querying a partition unnecessarily, but without the risk of breaking existing queries.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - kevinwilfong has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". That conf var is currently being used to throw an error if a query is written to create dynamic partitions and it doesn't produce any partitions. That is a weird dependency to introduce, I think it would be better to create a new variable. Could you print an warning or at least log one if strict mode is set to false and a partition is empty. This would be useful because users could still see that they are querying a partition unnecessarily, but without the risk of breaking existing queries. REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        lucian updated the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".
        Reviewers: JIRA, njain, kevinwilfong, akramer

        add new confvar, throw exception or log message

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        AFFECTED FILES
        common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
        ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
        ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java
        ql/src/test/queries/clientnegative/nopart_after_pruning.q
        ql/src/test/queries/clientpositive/nopart_after_pruning.q
        ql/src/test/results/clientnegative/nopart_after_pruning.q.out
        ql/src/test/results/clientpositive/nopart_after_pruning.q.out

        Show
        Phabricator added a comment - lucian updated the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Reviewers: JIRA, njain, kevinwilfong, akramer add new confvar, throw exception or log message REVISION DETAIL https://reviews.facebook.net/D2319 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/parse/ErrorMsg.java ql/src/test/queries/clientnegative/nopart_after_pruning.q ql/src/test/queries/clientpositive/nopart_after_pruning.q ql/src/test/results/clientnegative/nopart_after_pruning.q.out ql/src/test/results/clientpositive/nopart_after_pruning.q.out
        Hide
        Phabricator added a comment -

        kevinwilfong has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        Sorry, I should have been clearer earlier, by strict mode is set to false, I meant HIVE_ERROR_NO_VALID_PARTITIONS_LEFT is set to false.

        Also, could you add an entry to conf/hive-default.xml.template with your new variable.

        Finally, I'm sorry, I know I've been waffling on this point, but I think the way to go is to print a warning to the console if HIVE_ERROR_NO_VALID_PARTITIONS_LEFT is set to true (using SessionState.get().getConsole().printInfo()) and otherwise log it. This way you can default it to true.

        To give you better context, my reasoning is that if an administrator sets the variable to false, users will be unlikely to set it to true, and if this might break valid queries, it would be difficult for administrators to justify setting it to true, so the error will probably not get used. If you make it simply print a warning, it will not break queries, so the variable can be defaulted to true. If the users have a valid query and are annoyed by the warning, they can turn it off using that variable, but the information won't totally be lost if you log it in that case.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - kevinwilfong has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". Sorry, I should have been clearer earlier, by strict mode is set to false, I meant HIVE_ERROR_NO_VALID_PARTITIONS_LEFT is set to false. Also, could you add an entry to conf/hive-default.xml.template with your new variable. Finally, I'm sorry, I know I've been waffling on this point, but I think the way to go is to print a warning to the console if HIVE_ERROR_NO_VALID_PARTITIONS_LEFT is set to true (using SessionState.get().getConsole().printInfo()) and otherwise log it. This way you can default it to true. To give you better context, my reasoning is that if an administrator sets the variable to false, users will be unlikely to set it to true, and if this might break valid queries, it would be difficult for administrators to justify setting it to true, so the error will probably not get used. If you make it simply print a warning, it will not break queries, so the variable can be defaulted to true. If the users have a valid query and are annoyed by the warning, they can turn it off using that variable, but the information won't totally be lost if you log it in that case. REVISION DETAIL https://reviews.facebook.net/D2319
        Hide
        Phabricator added a comment -

        kevinwilfong has commented on the revision "HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode".

        INLINE COMMENTS
        ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:252 Also, I think this line will be removed based on my comments above, but could you make sure that no lines in the future exceed 100 chars.

        REVISION DETAIL
        https://reviews.facebook.net/D2319

        Show
        Phabricator added a comment - kevinwilfong has commented on the revision " HIVE-2870 [jira] Throw an error when a nonexistent partition is accessed in strict mode". INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:252 Also, I think this line will be removed based on my comments above, but could you make sure that no lines in the future exceed 100 chars. REVISION DETAIL https://reviews.facebook.net/D2319

          People

          • Assignee:
            Unassigned
            Reporter:
            Lucian Adrian Grijincu
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Time Tracking

              Estimated:
              Original Estimate - 24h
              24h
              Remaining:
              Remaining Estimate - 24h
              24h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development