Hive / HIVE-4051

Hive's metastore suffers from 1+N queries when querying partitions & is slow

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: Clients, Metastore
    • Labels:
      None
    • Environment:

      RHEL 6.3 / EC2 C1.XL

      Description

      Hive's query client takes a long time to initialize & start planning queries because of delays in creating all the MTable/MPartition objects.

      For a hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database.

      Several of those queries fetch exactly one row to create a single object on the client.

      The following 12 queries were repeated for each partition, generating a storm of SQL queries:

      4 Query     SELECT `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` = `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
      4 Query     SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` =4871
      4 Query     SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546 AND THIS.`INTEGER_IDX`>=0
      4 Query     SELECT `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
      4 Query     SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID` FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` = `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` =4871
      4 Query     SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND THIS.`INTEGER_IDX`>=0
      4 Query     SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` =4871 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
      4 Query     SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`>=0
      4 Query     SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID` =4871 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
      4 Query     SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID` =4871 AND `STRING_LIST_ID_KID` IS NOT NULL
      4 Query     SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` = `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` =4871
      4 Query     SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` =4871 AND NOT (`A0`.`STRING_LIST_ID_KID` IS NULL)
      

      This data is not detached or cached, so this operation is performed during every query plan for the partitions, even in the same hive client.

      The queries are generated automatically by JDO/DataNucleus, which makes it nearly impossible to rewrite them into a single denormalized join operation & process the result locally.

      Attempts to optimize this with JDO fetch-groups did not bear fruit in improving the query count.
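
The contrast between the per-partition queries above and a batched alternative can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the class `BatchedSdQuery` and its methods are hypothetical, and only the table and column names (`PARTITIONS`, `SDS`, `PART_ID`, `SD_ID`, `LOCATION`, `INPUT_FORMAT`, `OUTPUT_FORMAT`) are taken from the logged queries.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Hive): N single-row queries versus one batched query. */
final class BatchedSdQuery {
    private static final String SELECT_FROM =
        "SELECT A0.PART_ID, A0.SD_ID, B0.LOCATION, B0.INPUT_FORMAT, B0.OUTPUT_FORMAT"
        + " FROM PARTITIONS A0 LEFT OUTER JOIN SDS B0 ON A0.SD_ID = B0.SD_ID";

    private BatchedSdQuery() {}

    /** One statement per partition id, mirroring the 1+N pattern in the query log. */
    public static List<String> perPartitionQueries(List<Long> partIds) {
        return partIds.stream()
            .map(id -> SELECT_FROM + " WHERE A0.PART_ID = " + id)
            .collect(Collectors.toList());
    }

    /** A single statement that covers every partition id through an IN list. */
    public static String batchedQuery(List<Long> partIds) {
        if (partIds.isEmpty()) {
            throw new IllegalArgumentException("at least one partition id is required");
        }
        // The ids are numeric, so joining them into the statement is safe here.
        String inList = partIds.stream()
            .map(String::valueOf)
            .collect(Collectors.joining(","));
        return SELECT_FROM + " WHERE A0.PART_ID IN (" + inList + ") ORDER BY A0.PART_ID";
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(3945L, 3946L, 3947L);
        System.out.println(perPartitionQueries(ids).size() + " statements, or just one:");
        System.out.println(batchedQuery(ids));
    }
}
```

The rows of the batched result would then be stitched into partition objects on the client. Very long IN lists may need to be split into chunks, depending on the database.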

      1. HIVE-4051.D11805.1.patch
        36 kB
        Phabricator
      2. HIVE-4051.D11805.2.patch
        37 kB
        Phabricator
      3. HIVE-4051.D11805.3.patch
        176 kB
        Phabricator
      4. HIVE-4051.D11805.4.patch
        181 kB
        Phabricator
      5. HIVE-4051.D11805.5.patch
        182 kB
        Phabricator
      6. HIVE-4051.D11805.6.patch
        40 kB
        Phabricator
      7. HIVE-4051.D11805.7.patch
        43 kB
        Phabricator
      8. HIVE-4051.D11805.8.patch
        51 kB
        Phabricator
      9. HIVE-4051.D11805.9.patch
        51 kB
        Phabricator

        Issue Links

          Activity

          Ashutosh Chauhan added a comment -

          This issue has been fixed and released as part of 0.12 release. If you find further issues, please create a new jira and link it to this one.

          Sergey Shelukhin added a comment -

          This and the follow-up patches (HIVE-5158) will take care of the pre-MapReduce-job slowdown on SELECT * with many partitions, but it's hard to tell whether that's the main culprit from just looking at the query.
          For SHOW TABLE I am not sure; if it is not covered, it should be easy to extend.

          Doug Sedlak added a comment -

          I've noticed that the more partitions a Hive table has, the slower the following operations return. With thousands of partitions they become painfully slow:
          SELECT * FROM TABNAME
          SHOW TABLE EXTENDED LIKE `TABNAME`

          Do you know if this fix takes care of these issues? If not, is it something you could test?
          If not, I'll enter a new case.

          Thanks, Doug doug.sedlak@sas.com

          Hudson added a comment -

          ABORTED: Integrated in Hive-trunk-hadoop1-ptest #121 (See https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/121/)
          HIVE-4051 : Hive's metastore suffers from 1+N queries when querying partitions & is slow (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1511177)

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          Hudson added a comment -

          SUCCESS: Integrated in Hive-trunk-h0.21 #2250 (See https://builds.apache.org/job/Hive-trunk-h0.21/2250/)
          HIVE-4051 : Hive's metastore suffers from 1+N queries when querying partitions & is slow (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1511177)

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          Hudson added a comment -

          ABORTED: Integrated in Hive-trunk-hadoop2-ptest #50 (See https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/50/)
          HIVE-4051 : Hive's metastore suffers from 1+N queries when querying partitions & is slow (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1511177)

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          Hudson added a comment -

          FAILURE: Integrated in Hive-trunk-hadoop2 #336 (See https://builds.apache.org/job/Hive-trunk-hadoop2/336/)
          HIVE-4051 : Hive's metastore suffers from 1+N queries when querying partitions & is slow (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1511177)

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          • /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          Ashutosh Chauhan added a comment -

          Committed to trunk. Thanks, Sergey!

          Phabricator added a comment -

          ashutoshc has accepted the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          +1

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          BRANCH
          HIVE-4051

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Fixing bugs introduced while moving code.

          Reviewers: ashutoshc, JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=37071&id=37083#toc

          AFFECTED FILES
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Phabricator added a comment -

          sershe has commented on the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          INLINE COMMENTS
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 fixed
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 fixed

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Moved the code for SQL filter generation and usage into separate class.
          The only other changes are latest two comments on Phabricator, as well as some minor cleanup like null checks.

          Reviewers: ashutoshc, JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36879&id=37071#toc

          AFFECTED FILES
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Ashutosh Chauhan added a comment -

          I think automatic fallback is good: it makes sure that user queries always succeed, even in cases where this direct-SQL optimization doesn't work for whatever reason. Folks who want to turn it off anyway can always do so by setting the config variable. I am +1 on the patch as it is.
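
The fallback behaviour described in this comment can be sketched roughly as below. This is an assumption-laden illustration rather than the patch's implementation: the class `DirectSqlFallback`, the `fetch` method, and the boolean flag standing in for the config variable are all hypothetical names.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch (not Hive's code): try the fast path, fall back to the slow one. */
final class DirectSqlFallback {
    private DirectSqlFallback() {}

    /**
     * Runs the direct-SQL path when it is enabled; if that path throws,
     * logs the failure and runs the JDO path so the caller still gets a result.
     */
    public static <T> T fetch(boolean tryDirectSql, Supplier<T> directSql, Supplier<T> jdo) {
        if (tryDirectSql) {
            try {
                return directSql.get();
            } catch (RuntimeException e) {
                System.err.println("Direct SQL failed, falling back to JDO: " + e.getMessage());
            }
        }
        return jdo.get();
    }

    public static void main(String[] args) {
        // The direct-SQL supplier fails here, so the JDO supplier provides the result.
        List<String> partitions = DirectSqlFallback.<List<String>>fetch(
            true,
            () -> { throw new IllegalStateException("unsupported filter"); },
            () -> List.of("s=abc/n=1/p=2"));
        System.out.println(partitions);
    }
}
```

Passing `false` for the flag skips the direct-SQL path entirely, which corresponds to turning the optimization off through configuration.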

          Phabricator added a comment -

          ashutoshc has requested changes to the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Mostly looks good. Can you update the final patch with the new class in its own file, and address the following two comments (if they look alright)?

          INLINE COMMENTS
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 You are still selecting dbname, tblname ?
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 Yes.. I think we should throw in those cases. Having empty list will mask the root problem if there is any which results from it.

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          BRANCH
          HIVE-4051

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12595725/HIVE-4051.D11805.7.patch

          ERROR: -1 due to 1 failed/errored test(s), 2758 tests executed
          Failed tests:

          org.apache.hcatalog.pig.TestHCatStorer.testPartColsInData
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/296/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/296/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 1 tests failed
          

          This message is automatically generated.

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Feedback from review board, split the first query because it's hideously slow otherwise after sorting is added.

          Reviewers: ashutoshc, JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36771&id=36879#toc

          AFFECTED FILES
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Sergey Shelukhin added a comment -

          Carl Steinbach Sorry, mistyped the alias in the previous comment.

          What do you think about the more conservative fallback scheme proposed above? (not in the patch yet)

          I can test on Postgres if desired but I don't think I have access to Oracle.

          Sergey Shelukhin added a comment -

          Some perf results (with a patch I am about to update).
          I am running the metastore separately, on the same box as MySQL, in a cluster, with default config.
          I've created a table as such:

          create table sqltest(c string) partitioned by (s string,n string,p int);
          

          and inserted 32k partitions with s being 32-character random string, n - a number between 0 and 14, and p also a number.
          There's no data in the table.

          I am running a query that would select 256 partitions with filter pushdown; the Hive CLI is on a different box.
          The time spent for creating partition objects out of 32k is about 150-180ms on current database schema. Bulk of it (~100ms) is in the query that does the filtering by joining PARTITION_KEY_VALS.
          If this is done beforehand: "create index idx1 on PARTITION_KEY_VALS (PART_KEY_VAL);", the total server-side fetch time becomes about 50ms, ~40-45ms for assorted queries and ~5ms for java code.
          On empty dataset, the total time to run the query (no job as there's no data) is about 150ms in the latter case.
          Trunk with filter pushdown takes the total of 1.7-2 seconds in this case; with more partitions fetched the difference is proportional.
          Results are similar if both are run w/o pushdown (both are slower ofc).

          As mentioned above, I will move code into its own class in separate "last" iteration w/o logic changes in order to not complicate the review.

          Tuning MySQL/other databases, and queries, for further perf, can be in separate JIRA(s). The Hive admin/DBA could even do the former in the meantime if absolutely necessary.

          [~cws] What do you think about the more conservative fallback scheme proposed above? (not in the patch yet)
          I can test on Postgres if desired but I don't think I have access to Oracle.
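
One plausible shape for a filtering query that joins PARTITION_KEY_VALS, as mentioned in the perf results above, is sketched here. It is not the query from the patch: the class `PartitionFilterSql`, the aliases, and the `TBL_ID` and `INTEGER_IDX` column names are assumptions about the metastore schema; only `PARTITION_KEY_VALS` and `PART_KEY_VAL` appear in the comment itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: builds a parameterized partition filter over PARTITION_KEY_VALS. */
final class PartitionFilterSql {
    private PartitionFilterSql() {}

    /**
     * @param tblId      id of the table whose partitions are filtered (assumed TBL_ID column)
     * @param keyToValue partition column position (0-based) mapped to the required value
     * @param paramsOut  receives the bind parameters in the order of the '?' placeholders
     */
    public static String build(long tblId, Map<Integer, String> keyToValue, List<Object> paramsOut) {
        StringBuilder joins = new StringBuilder();
        StringBuilder where = new StringBuilder(" WHERE P.TBL_ID = ?");
        paramsOut.add(tblId);
        // Sort by column position so that the generated SQL is deterministic.
        for (Map.Entry<Integer, String> e : new TreeMap<>(keyToValue).entrySet()) {
            String alias = "F" + e.getKey();
            joins.append(" INNER JOIN PARTITION_KEY_VALS ").append(alias)
                 .append(" ON ").append(alias).append(".PART_ID = P.PART_ID AND ")
                 .append(alias).append(".INTEGER_IDX = ").append(e.getKey());
            // Each comparison hits PART_KEY_VAL, the column the suggested index covers.
            where.append(" AND ").append(alias).append(".PART_KEY_VAL = ?");
            paramsOut.add(e.getValue());
        }
        return "SELECT P.PART_ID FROM PARTITIONS P" + joins + where;
    }

    public static void main(String[] args) {
        List<Object> params = new ArrayList<>();
        // Filter on the second and third partition columns of sqltest (n and p).
        System.out.println(build(7L, Map.of(1, "3", 2, "42"), params));
        System.out.println(params);
    }
}
```

Every predicate compares PART_KEY_VAL, which is consistent with the observation that an index on that column cuts the server-side fetch time.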

          Phabricator added a comment -

          sershe has commented on the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          INLINE COMMENTS
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1654 Will do after other feedback, otherwise the diff will be hard to do between files
          build.xml:703 fixed
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 fixed
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 someone might use them without knowing... it's cheap to get couple more columns
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1711 treemaps don't have size initializer?
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1735 they are int both in db, and in methods
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 I am being paranoid... I think it cannot happen, but db schema doesn't prevent that.
          Should we throw if sd, serde or col Ids are null?
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1826 the database is created by script (or DN) in an RDBMS-specific way, I'd assume, so it would escape the reserved word properly. MySQL by default doesn't support the ANSI way of escaping words... and if we use backticks for MySQL it won't work on other RDBMSes.
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:2268 the tx is external to the method... renamed method for clarity
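
On the TreeMap question above: `java.util.TreeMap` indeed has no initial-capacity constructor, because it is a tree rather than a hash table, while `java.util.HashMap` does accept one. A small sketch, with a hypothetical class `MapSizing` and made-up row contents:

```java
import java.util.HashMap;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration of map sizing; the row layout {id, value} is made up. */
final class MapSizing {
    private MapSizing() {}

    /** TreeMap cannot be pre-sized; it only offers no-arg, Comparator, and copy constructors. */
    public static TreeMap<Long, String> sortedIndex(List<Object[]> rows) {
        TreeMap<Long, String> byId = new TreeMap<>();
        for (Object[] row : rows) {
            byId.put((Long) row[0], (String) row[1]);
        }
        return byId;
    }

    /** HashMap accepts an initial capacity, so it can be sized from the result set. */
    public static HashMap<Long, String> hashedIndex(List<Object[]> rows) {
        HashMap<Long, String> byId = new HashMap<>(Math.max(16, rows.size() * 2));
        for (Object[] row : rows) {
            byId.put((Long) row[0], (String) row[1]);
        }
        return byId;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
            new Object[] {4871L, "location-b"},
            new Object[] {1546L, "location-a"});
        System.out.println(sortedIndex(rows).firstKey()); // smallest id comes first
        System.out.println(hashedIndex(rows).size());
    }
}
```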

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          BRANCH
          HIVE-4051

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Phabricator added a comment -

          ashutoshc has requested changes to the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Initial review pass. A few comments.

          INLINE COMMENTS
          build.xml:703 This is a good change, which I like, but let's do it in a separate JIRA :)
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1654 ObjectStore is getting too big. Let's move this class into a separate file.
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 I haven't verified this in the code, but as far as I remember nothing currently uses CREATE_TIME or LAST_ACCESS_TIME, so we can drop them from the select list.
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1711 You can initialize the size of these treemaps with result.size()
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 We already know the dbName and table name; why do you have them in the select list?
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1735 If we do decide to keep the create and last access times, can you check whether they are Integers? I thought they were longs.
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 Can there ever be a case when colId == null? I cannot imagine one; did you see any failures?
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1826 Instead of *, it's better to use ORDER here, because an RDBMS that doesn't support ORDER as a column name would be broken anyway when the schema is created in it.
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:2268 Is this query no longer encapsulated in a transaction?
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:379 We should be able to support non-strings as well, right? Is it just that you want to do that in a later JIRA?

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          BRANCH
          HIVE-4051

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, sershe
          Cc: brock

          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12595477/HIVE-4051.D11805.6.patch

          SUCCESS: +1 2749 tests passed

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/278/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/278/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          The failed tests above are due to sorting. Because so many tests were failing, I changed the code to preserve the same sort order as JDO; it is actually a very small change. I am getting some test failures locally that look like they are caused by problems with my laptop; let me try pushing to HiveQA again.

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36627&id=36771#toc

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, sershe
          Cc: brock

          Sergey Shelukhin added a comment -

          Has anyone tested this patch on Derby, PostgreSQL, or Oracle? Until it's verified to work on these DBs I think this new code should be disabled by default.

          I tested on Derby and MySQL so far.
          Note that a full fallback is in place, so there could be a 3-position switch, or two settings: the current on/off staying the same, plus an "on, but turn off [for some grace period?] on first error" setting. The latter could be the default, so that if it fails it goes back to DN and doesn't introduce a lot of extra load.
          What do you think?
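
          A minimal sketch of the 3-position switch described above, for illustration only. The class, enum, and method names (DirectSqlFallback, Mode, run) are hypothetical and do not come from the patch; the direct SQL and DN paths are passed in as plain suppliers, and the grace period is an assumed parameter.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names come from the HIVE-4051 patch. */
public class DirectSqlFallback {
    /** The three positions of the switch discussed in the comment. */
    public enum Mode { OFF, ON, ON_UNTIL_FIRST_ERROR }

    private final Mode mode;
    private final long graceMillis;   // assumed grace period after an error
    private long disabledUntil = 0L;  // direct SQL is skipped before this time

    public DirectSqlFallback(Mode mode, long graceMillis) {
        this.mode = mode;
        this.graceMillis = graceMillis;
    }

    /**
     * Tries the direct SQL path when allowed and falls back to the DN/JDO path
     * on failure. The current time is a parameter so the behavior is testable.
     */
    public synchronized <T> T run(Supplier<T> directSql, Supplier<T> jdo, long nowMillis) {
        boolean tryDirect = mode != Mode.OFF && nowMillis >= disabledUntil;
        if (tryDirect) {
            try {
                return directSql.get();
            } catch (RuntimeException e) {
                // In ON mode every call retries direct SQL; in the third mode
                // direct SQL is switched off for the grace period.
                if (mode == Mode.ON_UNTIL_FIRST_ERROR) {
                    disabledUntil = nowMillis + graceMillis;
                }
            }
        }
        return jdo.get();
    }
}
```

          With ON_UNTIL_FIRST_ERROR, one failure costs a single wasted direct SQL attempt; later calls inside the grace period go straight to DN, which is the "no extra load" property the comment asks for.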

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12595110/HIVE-4051.D11805.5.patch

          ERROR: -1 due to 8 failed/errored test(s), 2748 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part13
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/256/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/256/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 8 tests failed
          

          This message is automatically generated.

          Carl Steinbach added a comment -

          This introduces "direct SQL" optimization for metastore.
          When the metastore gets partitions by filter or by names, it does so using ANSI SQL92 queries, if enabled (the default).
          This will not work on non-RDBMS datastores and may not work on non-ANSI-compliant ones (I tested on MySQL in default mode).

          Has anyone tested this patch on Derby, PostgreSQL, or Oracle? Until it's verified to work on these DBs I think this new code should be disabled by default.
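
          To illustrate the idea behind the "direct SQL" path, here is a rough sketch of replacing one query per partition with a single batched query. It is not the query the patch issues: the class and method names are hypothetical, the column list is a small subset picked for illustration, and only the table and column names (PARTITIONS, SDS, PART_ID, SD_ID, LOCATION) are taken from the query log in the description. Identifiers are left unquoted to avoid MySQL-specific backticks.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch; not the SQL generated by the HIVE-4051 patch. */
public class BatchedPartitionQuery {
    /**
     * Builds one SELECT covering all given partition ids, instead of issuing
     * a separate single-row query for each partition.
     */
    static String buildQuery(List<Long> partIds) {
        if (partIds.isEmpty()) {
            throw new IllegalArgumentException("at least one partition id is required");
        }
        // The ids are numeric, so joining them into an IN list is safe here;
        // string values would need bind parameters instead.
        String ids = partIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return "SELECT A0.PART_ID, B0.SD_ID, B0.LOCATION FROM PARTITIONS A0 "
                + "LEFT OUTER JOIN SDS B0 ON A0.SD_ID = B0.SD_ID "
                + "WHERE A0.PART_ID IN (" + ids + ")";
    }
}
```

          For N partitions this costs one round trip per joined table rather than N, which is where the 1+N pattern in the issue title goes away.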

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Rebase the patch

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36621&id=36627#toc

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          ql/src/test/queries/clientpositive/alter_partition_coltype.q
          ql/src/test/queries/clientpositive/load_dyn_part3.q
          ql/src/test/queries/clientpositive/load_dyn_part4.q
          ql/src/test/queries/clientpositive/load_dyn_part9.q
          ql/src/test/queries/clientpositive/ppr_pushdown2.q
          ql/src/test/queries/clientpositive/stats4.q
          ql/src/test/results/clientpositive/alter_partition_coltype.q.out
          ql/src/test/results/clientpositive/load_dyn_part3.q.out
          ql/src/test/results/clientpositive/load_dyn_part4.q.out
          ql/src/test/results/clientpositive/load_dyn_part9.q.out
          ql/src/test/results/clientpositive/ppr_pushdown2.q.out
          ql/src/test/results/clientpositive/stats4.q.out

          To: JIRA, sershe
          Cc: brock

          Hive QA added a comment -

          Overall: -1 no tests executed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12595079/HIVE-4051.D11805.4.patch

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/252/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/252/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
          + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
          + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
          + cd /data/hive-ptest/working/
          + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-252/source-prep.txt
          + mkdir -p maven ivy
          + [[ svn = \s\v\n ]]
          + [[ -n '' ]]
          + [[ -d apache-svn-trunk-source ]]
          + [[ ! -d apache-svn-trunk-source/.svn ]]
          + [[ ! -d apache-svn-trunk-source ]]
          + cd apache-svn-trunk-source
          + svn revert -R .
          ++ awk '{print $2}'
          ++ egrep -v '^X|^Performing status on external'
          ++ svn status --no-ignore
          + rm -rf
          + svn update
          U    testutils/ptest2/src/test/resources/SomeTest-success.xml
          U    testutils/ptest2/src/test/resources/test-outputs/SomeTest-truncated.xml
          U    testutils/ptest2/src/test/resources/test-outputs/skewjoin_union_remove_1.q-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestHostExecutor.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testPassingUnitTest.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestCleanupPhase.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testPrepGit.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestPhase.testRsyncFromLocalToRemoteInstancesWithFailureUnknown.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestCleanupPhase.testExecute.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testPrepNone.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestScripts.testPrepSvn.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestPhase.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestPrepPhase.testExecute.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestPhase.testRsyncFromLocalToRemoteInstancesWithFailureOne.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testFailingUnitTest.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/AbstractTestPhase.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testPassingQFileTest.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/MockSSHCommandExecutor.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/MockRSyncCommandExecutor.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestReportParser.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/TestExecutionPhase.testFailingQFile.approved.txt
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/execution/MockLocalCommandFactory.java
          U    testutils/ptest2/src/test/java/org/apache/hive/ptest/api/server/TestTestExecutor.java
          U    testutils/ptest2/src/main/resources/batch-exec.vm
          U    testutils/ptest2/src/main/resources/source-prep.vm
          U    testutils/ptest2/src/main/resources/log4j.properties
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/api/server/TestExecutor.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/api/server/ExecutionController.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/api/request/TestStartRequest.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/api/client/PTestClient.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/ExecutionPhase.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/PrepPhase.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/ReportingPhase.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/HostExecutor.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/PTest.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/LogDirectoryCleaner.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/Phase.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/JUnitReportParser.java
          A    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/HostExecutorBuilder.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/context/CloudExecutionContextProvider.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/context/CloudComputeService.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/ExecutionContextConfiguration.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/QFileTestBatch.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/TestConfiguration.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/TestParser.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/JIRAService.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/Drone.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/CleanupPhase.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/Constants.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/ssh/SSHCommandExecutor.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/ssh/RSyncCommandExecutor.java
          U    testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/ssh/AbstractSSHCommand.java
          U    eclipse-templates/.classpath
          U    eclipse-templates/.classpath._hbase
          
          Fetching external item into 'hcatalog/src/test/e2e/harness'
          Updated external to revision 1508709.
          
          Updated to revision 1508708.
          + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
          + patchFilePath=/data/hive-ptest/working/scratch/build.patch
          + [[ -f /data/hive-ptest/working/scratch/build.patch ]]
          + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
          + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
          The patch does not appear to apply with p0 to p2
          + exit 1
          '
          

          This message is automatically generated.

          Hive QA added a comment -

          Overall: -1 no tests executed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12595079/HIVE-4051.D11805.4.patch

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/251/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/251/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
          + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
          + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
          + cd /data/hive-ptest/working/
          + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-251/source-prep.txt
          + mkdir -p maven ivy
          + [[ svn = \s\v\n ]]
          + [[ -n '' ]]
          + [[ -d apache-svn-trunk-source ]]
          + [[ ! -d apache-svn-trunk-source/.svn ]]
          + [[ ! -d apache-svn-trunk-source ]]
          + cd apache-svn-trunk-source
          + svn revert -R .
          Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java'
          Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java'
          Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java'
          ++ egrep -v '^X|^Performing status on external'
          ++ awk '{print $2}'
          ++ svn status --no-ignore
          + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen
          + svn update
          
          Fetching external item into 'hcatalog/src/test/e2e/harness'
          External at revision 1508705.
          
          At revision 1508705.
          + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
          + patchFilePath=/data/hive-ptest/working/scratch/build.patch
          + [[ -f /data/hive-ptest/working/scratch/build.patch ]]
          + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
          + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
          The patch does not appear to apply with p0 to p2
          + exit 1
          '
          

          This message is automatically generated.

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Followup - forgot to rerun one query after changing.

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36615&id=36621#toc

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          ql/src/test/queries/clientpositive/alter_partition_coltype.q
          ql/src/test/queries/clientpositive/load_dyn_part3.q
          ql/src/test/queries/clientpositive/load_dyn_part4.q
          ql/src/test/queries/clientpositive/load_dyn_part9.q
          ql/src/test/queries/clientpositive/ppr_pushdown2.q
          ql/src/test/queries/clientpositive/stats4.q
          ql/src/test/results/clientpositive/alter_partition_coltype.q.out
          ql/src/test/results/clientpositive/load_dyn_part3.q.out
          ql/src/test/results/clientpositive/load_dyn_part4.q.out
          ql/src/test/results/clientpositive/load_dyn_part9.q.out
          ql/src/test/results/clientpositive/ppr_pushdown2.q.out
          ql/src/test/results/clientpositive/stats4.q.out

          To: JIRA, sershe
          Cc: brock

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Addressed Phabricator comments; fixed minor bugs (e.g. null checks, setTableName called instead of setDbName); added column schemas that ended up being needed after all; cleaned up the code a bit; added some short-circuiting to the queries and object fetching; and added explicit ordering to tests that had undefined order and so depended on the order in which partitions are returned (ORM code returns them by name, SQL by ID).
          I compared the reflection-based dump of SQL- and ORM- based objects from some tests (code not included) and they are the same.

          The existing tests seem to adequately cover this code. The only concern is that if it fails it's impossible to see as it falls back to ORM...
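
One way to make a silent fallback visible is to count it. The following is a minimal sketch of that idea, not code from the patch: the class `FallbackSketch`, the method `withFallback`, and the counter `fallbackCount` are all hypothetical names, and the two suppliers merely stand in for the direct-SQL and ORM paths.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names come from the Hive patch. */
class FallbackSketch {
    /** Counts how often the fast path failed, so a test can assert on it. */
    static final AtomicInteger fallbackCount = new AtomicInteger();

    /** Try the direct-SQL supplier first; on failure, record it and use the ORM supplier. */
    static <T> T withFallback(Supplier<T> directSql, Supplier<T> orm) {
        try {
            return directSql.get();
        } catch (RuntimeException e) {
            // The failure is recorded instead of being swallowed silently.
            fallbackCount.incrementAndGet();
            return orm.get();
        }
    }

    public static void main(String[] args) {
        // Fast path succeeds: the ORM supplier is never invoked.
        List<String> viaSql = withFallback(
            () -> Arrays.asList("ds=1", "ds=2"),
            () -> Arrays.asList("unused"));
        // Fast path throws: the result comes from the ORM supplier.
        List<String> viaOrm = withFallback(
            () -> { throw new IllegalStateException("simulated SQL failure"); },
            () -> Arrays.asList("ds=1", "ds=2"));
        System.out.println(viaSql + " " + viaOrm + " fallbacks=" + fallbackCount.get());
    }
}
```

A test that expects the fast path to work could then assert that the counter is still zero after the call, which would turn an invisible fallback into a test failure.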

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36357&id=36615#toc

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
          ql/src/test/queries/clientpositive/alter_partition_coltype.q
          ql/src/test/queries/clientpositive/load_dyn_part3.q
          ql/src/test/queries/clientpositive/load_dyn_part4.q
          ql/src/test/queries/clientpositive/load_dyn_part9.q
          ql/src/test/queries/clientpositive/ppr_pushdown2.q
          ql/src/test/queries/clientpositive/stats4.q
          ql/src/test/results/clientpositive/load_dyn_part3.q.out
          ql/src/test/results/clientpositive/load_dyn_part4.q.out
          ql/src/test/results/clientpositive/load_dyn_part9.q.out
          ql/src/test/results/clientpositive/ppr_pushdown2.q.out
          ql/src/test/results/clientpositive/stats4.q.out

          To: JIRA, sershe
          Cc: brock

          Laurent Chouinard added a comment -

          Hi,

          I will be on vacation from July 29th to August 6th inclusively. For any question or emergency, please contact the group MTL-IT-Production-Tools@ubisoft.com

          Thanks.

          Laurent Chouinard
          IT Production - Tools Programmer

          Sergey Shelukhin added a comment -

          I've fixed most of the queries. There are a couple of bugs, and sorting is undefined in some tests (and has changed, as we no longer sort by partition name). A couple of stubborn ones remain; hopefully I will update today.
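
For tests whose expected output depends on retrieval order, one common remedy is to sort before comparing. This is only an illustrative sketch under that assumption; `PartitionOrderSketch` and `canonical` are hypothetical names, and the partition names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; not taken from the Hive test suite. */
class PartitionOrderSketch {
    /** Returns a sorted copy so callers do not depend on retrieval order. */
    static List<String> canonical(List<String> partitionNames) {
        List<String> copy = new ArrayList<>(partitionNames);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // The same partitions, as they might come back ordered by ID versus by name.
        List<String> byId = Arrays.asList("ds=2013-02-01", "ds=2013-01-01");
        List<String> byName = Arrays.asList("ds=2013-01-01", "ds=2013-02-01");
        // After canonicalisation the two result sets compare equal.
        System.out.println(canonical(byId).equals(canonical(byName)));
    }
}
```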

          Phabricator added a comment -

          brock has commented on the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Comments about handling of error.

          INLINE COMMENTS
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:2074 In general I am supportive of this patch, but we should not be catching Throwable here. We should at most catch Exception.

          Additionally, log4j now supports passing exceptions directly, so stringifyException is no longer needed; you can pass the exception as the second argument of the log method.
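
A minimal sketch of both review points follows. It is not the code at ObjectStore.java:2074: `CatchSketch` and `attempt` are hypothetical names, and the sketch uses java.util.logging instead of log4j so that it stays self-contained. The point is the same in either library: catch Exception rather than Throwable, and hand the exception object to the logger instead of stringifying it first.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; not the code under review. */
class CatchSketch {
    private static final Logger LOG = Logger.getLogger(CatchSketch.class.getName());

    /** Runs the fast path; reports which path produced the result. */
    static String attempt(Runnable directSql) {
        try {
            directSql.run();
            return "direct-sql";
        } catch (Exception e) {
            // Catching Exception (not Throwable) lets Errors such as
            // OutOfMemoryError propagate instead of being masked by a fallback.
            // The exception is passed as an argument, so the logger prints
            // the full stack trace without manual string conversion.
            LOG.log(Level.WARNING, "Direct SQL failed, falling back to ORM", e);
            return "orm";
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(() -> { }));
        System.out.println(attempt(() -> {
            throw new IllegalStateException("simulated failure");
        }));
    }
}
```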

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          To: JIRA, sershe
          Cc: brock

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12594511/HIVE-4051.D11805.2.patch

          ERROR: -1 due to 59 failed/errored test(s), 2653 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_or
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_and
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat11
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_special_char
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2_hadoop20
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_decode_name
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part10
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_concatenate_inherit_table_location
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auth
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input24
          org.apache.hcatalog.api.TestHCatClient.testGetPartitionsWithPartialSpec
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
          org.apache.hcatalog.api.TestHCatClient.testPartitionsHCatClientImpl
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_not
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_updateAccessTime
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_where
          org.apache.hcatalog.api.TestHCatClient.testDropPartitionsWithPartialSpec
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/206/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/206/console

          Messages:

          Executing org.apache.hive.ptest.execution.CleanupPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 59 tests failed
          

          This message is automatically generated.

          Phabricator added a comment -

          sershe updated the revision "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Fixed mishandling of transactions, lack of support for limit, and boolean fields failing on Derby. Some random tests I ran pass, will run all tests overnight.

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D11805?vs=36135&id=36357#toc

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, sershe

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12594063/HIVE-4051.D11805.1.patch

          ERROR: -1 due to 230 failed/errored test(s), 2651 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_multi_partitions
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_6
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quote1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
          org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterSinglePartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert4
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_part
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_lateralview
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_hook_context_cs
          org.apache.hcatalog.listener.TestNotificationListener.testAMQListener
          org.apache.hcatalog.pig.TestHCatStorer.testPartColsInData
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part6
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_distinct
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats10
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_5
          org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part0
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi2
          org.apache.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10
          org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter
          org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterSinglePartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_ignore_protection
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union26
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_serde_format
          org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_disallow_incompatible_type_change_on2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec5
          org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterLastPartition
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi1
          org.apache.hcatalog.pig.TestOrcHCatLoader.testReadPartitionedBasic
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_offline
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterLastPartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert2_overwrite_partitions
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_global_limit
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
          org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterSinglePartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterLastPartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl7
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge4
          org.apache.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_21
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input28
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insertexternal1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_allchildsarenull
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_sa_fail_hook3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8
          org.apache.hcatalog.mapreduce.TestHCatHiveThriftCompatibility.testDynamicCols
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join14_hadoop20
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
          org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter
          org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterLastPartition
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterSinglePartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_protectmode
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock4
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_union22
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part
          org.apache.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
          org.apache.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_external_partition_location
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partitions_json
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec1
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterSinglePartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_table
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl6
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_nodrop
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi6
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3
          org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_disallow_incompatible_type_change_on1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input40
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
          org.apache.hadoop.hive.ql.security.TestAuthorizationPreEventListener.testListener
          org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testAddDropPartition
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part2
          org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14_hadoop20
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part7
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert1_overwrite_partitions
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_concatenate_inherit_table_location
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_disallow_incompatible_type_change_off
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_loadpart1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part_no_drop
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi5
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union25
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input24
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part11
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auth
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg4
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union22
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_touch
          org.apache.hcatalog.api.TestHCatClient.testGetPartitionsWithPartialSpec
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3
          org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_bucket_mapjoin_mismatch1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_7
          org.apache.hcatalog.api.TestHCatClient.testPartitionsHCatClientImpl
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39_hadoop20
          org.apache.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable
          org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_updateAccessTime
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_filter
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat4
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl8
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg3
          org.apache.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl
          org.apache.hcatalog.mapreduce.TestHCatHiveCompatibility.testPartedRead
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_view_cast
          org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterLastPartition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7
          org.apache.hcatalog.api.TestHCatClient.testDropPartitionsWithPartialSpec
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_9
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_ppd
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/182/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/182/console

          Messages:

          Executing org.apache.hive.ptest.execution.CleanupPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests failed with: TestsFailedException: 230 tests failed
          

          This message is automatically generated.

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594063/HIVE-4051.D11805.1.patch ERROR: -1 due to 230 failed/errored test(s), 2651 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_multi_partitions org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_6 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quote1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2 org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterSinglePartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_lateralview org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_hook_context_cs org.apache.hcatalog.listener.TestNotificationListener.testAMQListener org.apache.hcatalog.pig.TestHCatStorer.testPartColsInData org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_5 org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi2 org.apache.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10 org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterSinglePartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_ignore_protection org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part5 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_serde_format org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_disallow_incompatible_type_change_on2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec5 org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterLastPartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi1 org.apache.hcatalog.pig.TestOrcHCatLoader.testReadPartitionedBasic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2 org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_offline 
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterLastPartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert2_overwrite_partitions org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_global_limit org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterSinglePartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4 org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterLastPartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl7 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge4 org.apache.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_21 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input28 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insertexternal1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_allchildsarenull org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_sa_fail_hook3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8 org.apache.hcatalog.mapreduce.TestHCatHiveThriftCompatibility.testDynamicCols org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join14_hadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2 org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterLastPartition org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterSinglePartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_protectmode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_union22 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part org.apache.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_external_partition_location org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partitions_json 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec1 org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterSinglePartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_nodrop org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3 org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_disallow_incompatible_type_change_on1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input40 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby org.apache.hadoop.hive.ql.security.TestAuthorizationPreEventListener.testListener org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testAddDropPartition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part2 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14_hadoop20 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part7 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_partspec4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert1_overwrite_partitions org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_concatenate_inherit_table_location org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_disallow_incompatible_type_change_off org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_loadpart1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join9 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part_no_drop org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auth org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_touch org.apache.hcatalog.api.TestHCatClient.testGetPartitionsWithPartialSpec org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3 org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_bucket_mapjoin_mismatch1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_7 org.apache.hcatalog.api.TestHCatClient.testPartitionsHCatClientImpl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39_hadoop20 org.apache.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testDropPartitionFail1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_updateAccessTime org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_insert3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_tbl8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_lockneg3 org.apache.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl org.apache.hcatalog.mapreduce.TestHCatHiveCompatibility.testPartedRead org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_view_cast org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterLastPartition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_protectmode_part1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7 org.apache.hcatalog.api.TestHCatClient.testDropPartitionsWithPartialSpec org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_ppd Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/182/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/182/console Messages: Executing 
org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 230 tests failed This message is automatically generated.
          Phabricator added a comment -

          sershe requested code review of "HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions & is slow".

          Reviewers: JIRA

          Preliminary patch for HIVE-4051 (ready for review).
          This introduces "direct SQL" optimization for metastore.
          When the metastore gets partitions by filter or by names, it does so using ANSI SQL92 queries, if enabled (the default).
          This will not work on non-RDBMS datastores and may not work on non-ANSI-compliant ones (I tested on MySQL in default mode).
          If disabled, or if any error happens, it will use the default DataNucleus code.
          When it does work, which covers most practical cases, with a local MySQL it produces up to a 20x speedup for getting a large number of partitions.
          SQL queries can be further optimized to achieve even more, presumably - see some comments in the code. That will be done as separate JIRAs.

          So far the patch has no tests, because it is peculiar to the storage engine, so I need to check whether a Derby test makes sense.
          Another question is whether it makes sense to put the SQL antics into a separate class.
          Let me run Hive QA (some tests have run locally). After discussion I will update the patch, but the core is ready for review.
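
          The fallback behaviour described above (use direct SQL when enabled, otherwise or on any error use the DataNucleus path) can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class `DirectSqlFallback`, the method `getWithFallback`, and the two `Supplier` arguments are hypothetical names chosen for the example.

```java
import java.util.function.Supplier;

// Hypothetical helper (not from the patch): try the fast direct-SQL path
// first and fall back to the ORM path if it is disabled or fails.
final class DirectSqlFallback {
    static <T> T getWithFallback(boolean directSqlEnabled,
                                 Supplier<T> directSql,
                                 Supplier<T> orm) {
        if (directSqlEnabled) {
            try {
                return directSql.get();
            } catch (RuntimeException e) {
                // Any error on the fast path is reported and then ignored;
                // the caller still gets a result from the ORM path below.
                System.err.println("Direct SQL failed, falling back to ORM: " + e.getMessage());
            }
        }
        return orm.get();
    }
}
```

          One consequence of this shape is that a broken or unsupported SQL dialect costs one failed attempt per call but never changes the result, which matches the "if any error happens, it will use the default DataNucleus code" description.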

          Hive's query client takes a long time to initialize & start planning queries because of delays in creating all the MTable/MPartition objects.

          For a hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database.

          Several of those queries fetch exactly one row to create a single object on the client.

          The following 12 queries were repeated for each partition, generating a storm of SQL queries

          4 Query SELECT `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` = `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
          4 Query SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` =4871
          4 Query SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546 AND THIS.`INTEGER_IDX`>=0
          4 Query SELECT `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
          4 Query SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID` FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` = `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` =4871
          4 Query SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND THIS.`INTEGER_IDX`>=0
          4 Query SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` =4871 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
          4 Query SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`>=0
          4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID` =4871 AND `A0`.`INTEGER_IDX` >= 0 ORDER BY NUCORDER0
          4 Query SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID` =4871 AND `STRING_LIST_ID_KID` IS NOT NULL
          4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` = `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` =4871
          4 Query SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` =4871 AND NOT (`A0`.`STRING_LIST_ID_KID` IS NULL)

          This data is not detached or cached, so this operation is performed during every query plan for the partitions, even in the same hive client.

          The queries are automatically generated by JDO/DataNucleus which makes it nearly impossible to rewrite it into a single denormalized join operation & process it locally.

          Attempts to optimize this with JDO fetch-groups did not bear fruit in improving the query count.
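
          To illustrate what a batched alternative to the per-partition queries might look like, here is a minimal sketch that builds one parameterized query for many partitions at once. The table and column names are taken from the logged queries above; the class `BatchedPartitionQuery` and its method are hypothetical, and the sketch only builds the SQL string rather than executing it.

```java
import java.util.Collections;

// Hypothetical illustration: one query for N partitions instead of N queries.
final class BatchedPartitionQuery {
    // Returns a parameterized SELECT with one "?" placeholder per partition id.
    static String storageDescriptorsFor(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        String placeholders = String.join(",", Collections.nCopies(partitionCount, "?"));
        return "SELECT p.PART_ID, s.SD_ID, s.INPUT_FORMAT, s.OUTPUT_FORMAT, s.LOCATION"
             + " FROM PARTITIONS p LEFT OUTER JOIN SDS s ON p.SD_ID = s.SD_ID"
             + " WHERE p.PART_ID IN (" + placeholders + ")";
    }
}
```

          Databases and drivers usually limit the number of parameters or the statement length, so a real implementation would probably split very large partition lists into several batches.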

          TEST PLAN
          EMPTY

          REVISION DETAIL
          https://reviews.facebook.net/D11805

          AFFECTED FILES
          build.xml
          common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
          metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
          ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

          To: JIRA, sershe

          Sergey Shelukhin added a comment -

          I am actually getting 16 queries... 3 by PART_ID (itself, values, params), 10 per its SD_ID, 2 for SD_ID_OID (might also be SD_ID?), and 1 for SERDE.
          Some normalization doesn't make any sense, e.g.

          mysql> select name, slib, count(1) from SERDES;
          +------+----------------------------------------------------+----------+
          | name | slib                                               | count(1) |
          +------+----------------------------------------------------+----------+
          | NULL | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |    32769 |
          +------+----------------------------------------------------+----------+
          

          If it is stored one-to-one for each row anyway, why not just store it in the row?

          I am looking in the direction of doing a full-string query returning a bag of fields, rather than MPartition, so that per-object calls are not made. Then SDs can be fetched with a join, etc.
          One thing that is not quite clear is how to fetch from separate tables for which no DN object exists, e.g. PARTITION_KEY_VALS. Ideally we would fetch all of these for all necessary partitions in one query and put them into objects.

          If all else fails we can use SQL http://www.datanucleus.org/products/datanucleus/jdo/sql.html, but it will halfway defeat the purpose of having DN in the first place, since it becomes RDBMS-dependent.

          In fact, to avoid warring with DataNucleus, perhaps we could do SQL as a storage-specific optimization, configurable for the metastore, and fall back to ORM if disabled.
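
          The idea of fetching the values for all necessary partitions in one query and then putting them into objects can be sketched as a simple grouping step. This is an assumption-laden illustration: the `Row` type, the class `PartitionValueGrouper`, and the premise that the batched query returns (PART_ID, value) pairs already ordered by partition and position are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: regroup the flat rows of one batched query
// into per-partition value lists, instead of issuing one query per partition.
final class PartitionValueGrouper {
    // One row of the assumed batched result: (PART_ID, value).
    static final class Row {
        final long partId;
        final String value;
        Row(long partId, String value) {
            this.partId = partId;
            this.value = value;
        }
    }

    // Rows are assumed to arrive ordered by PART_ID and then by value index,
    // so appending in encounter order preserves each partition's value order.
    static Map<Long, List<String>> groupByPartition(List<Row> rows) {
        Map<Long, List<String>> byPartition = new LinkedHashMap<>();
        for (Row row : rows) {
            byPartition.computeIfAbsent(row.partId, id -> new ArrayList<>()).add(row.value);
        }
        return byPartition;
    }
}
```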


            People

            • Assignee:
              Sergey Shelukhin
              Reporter:
              Gopal V
            • Votes:
              3
              Watchers:
              13


                Development