Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      Added support for metadata only queries.

      Description

      Queries like:

      select max(ds) from T

      where ds is a partitioning column should be optimized.
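      The idea behind this request can be illustrated with a minimal sketch. It is not Hive's implementation; it only assumes that partition names are available from the metastore as strings of the form "ds=2011-10-01/hr=11". The helper names and the sample partition list are hypothetical. Under that assumption, max(ds) can be answered from partition names alone, without scanning any data files.

```python
from typing import Dict, Iterable, Optional


def parse_partition_name(name: str) -> Dict[str, str]:
    """Split a partition name such as 'ds=2011-10-01/hr=11' into a column -> value map."""
    return dict(part.split("=", 1) for part in name.split("/"))


def max_partition_value(partition_names: Iterable[str], column: str) -> Optional[str]:
    """Return the largest value of a partitioning column using only partition names.

    Values are compared as strings. Returns None when there are no partitions,
    mirroring how SQL max() yields NULL over an empty input.
    """
    values = [parse_partition_name(name)[column] for name in partition_names]
    return max(values) if values else None


# Hypothetical partition names for table T, as a metastore might list them.
partitions = ["ds=2011-10-01/hr=11", "ds=2011-10-02/hr=00", "ds=2011-09-30/hr=23"]
print(max_partition_value(partitions, "ds"))
```

      The sketch compares values as strings, which orders ISO-style dates correctly; a real optimizer would also need to respect the column's declared type.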

      1. ASF.LICENSE.NOT.GRANTED--D105.1.patch
        0.4 kB
        Phabricator
      2. ASF.LICENSE.NOT.GRANTED--D105.2.patch
        75 kB
        Phabricator
      3. ASF.LICENSE.NOT.GRANTED--HIVE-1003.D105.3.patch
        34 kB
        Phabricator
      4. hive.1003.2.patch
        40 kB
        Namit Jain
      5. hive.1003.3.patch
        72 kB
        Namit Jain
      6. hive.1003.4.patch
        96 kB
        Namit Jain
      7. HIVE-1003.1.patch
        40 kB
        Marcin Kurczych

          Activity

          Marcin Kurczych added a comment -

          https://reviews.apache.org/r/1962/

          Namit Jain added a comment -

          cleaning up the patch

          Marcin Kurczych added a comment -

          I have finished my internship at Facebook.
          If you want to contact me you can reach me at Marcin Kurczych <marcin.kurczych@gmail.com>.

          Thanks!

          He Yongqiang added a comment -

          will take a look.

          Namit Jain added a comment -

          will add more tests

          Namit Jain added a comment -

          added more tests

          Ashutosh Chauhan added a comment -

          @Namit,
          Will the second query on HIVE-2119 also get optimized with this?

          Namit Jain added a comment -

          @Ashutosh, it should; I haven't tried it yet.
          Let me confirm tomorrow.

          Namit Jain added a comment -

          @Ashutosh, that query is not optimized.
          Let us keep HIVE-2119 open and work on that as a follow-up.

          He Yongqiang added a comment -

          1) move the optimizer to physical optimizer.
          2) after 1), changes in GenMapRedUtils also need to be moved.
          3) add a testcase which uses CombineHiveInputFormat.
          4) also test virtual_column.q
          nitpicks:
          1) getCategory() in NullStructSerDe's getObjectInspector should return Struct
          2) remove import of "HiveNullValueSequenceFileOutputFormat" and "OneNullRowInputFormat" from SemanticAnalyzer.java

          Namit Jain added a comment -

          @Ashutosh, moving to physical optimizer should also fix the query in HIVE-2119

          Phabricator added a comment -

          njain requested code review of "HIVE-1003 [jira] optimize metadata only queries".
          Reviewers: JIRA

          testing

          Queries like:

          select max(ds) from T

          where ds is a partitioning column should be optimized.

          TEST PLAN
          EMPTY

          REVISION DETAIL
          https://reviews.facebook.net/D105

          AFFECTED FILES
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java

          Phabricator added a comment -

          njain has commented on the revision "HIVE-1003 [jira] optimize metadata only queries".

          testing integration

          REVISION DETAIL
          https://reviews.facebook.net/D105

          Namit Jain added a comment -

          addressed comments

          He Yongqiang added a comment -

          looks good to me, running tests

          He Yongqiang added a comment -

          committed, thanks Marcin and Namit!

          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #1048 (See https://builds.apache.org/job/Hive-trunk-h0.21/1048/)
          HIVE-1003: optimize metadata only queries (Marcin Kurczych, Namit Jain via He Yongqiang)

          heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1195577
          Files :

          • /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
          • /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveDatabaseMetaData.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/OneNullRowInputFormat.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
          • /hive/trunk/ql/src/test/queries/clientpositive/metadataonly1.q
          • /hive/trunk/ql/src/test/results/clientpositive/metadataonly1.q.out
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/NullStructSerDe.java

          Phabricator added a comment -

          njain updated the revision "HIVE-1003 [jira] optimize metadata only queries".
          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D105

          AFFECTED FILES
          ql/src/test/results/clientpositive/join0.q.out
          ql/src/test/queries/clientpositive/groupby2.q
          ql/src/test/queries/clientpositive/join0.q
          ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java
          ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
          ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java

          Phabricator added a comment -

          njain updated the revision "HIVE-1003 [jira] optimize metadata only queries".
          Reviewers: JIRA

          HIVE-2618

          REVISION DETAIL
          https://reviews.facebook.net/D105

          AFFECTED FILES
          ql/src/test/results/clientpositive/alter_table_serde.q.out
          ql/src/test/results/clientpositive/exim_04_evolved_parts.q.out
          ql/src/test/results/clientpositive/partition_schema1.q.out
          ql/src/test/queries/clientpositive/partition_schema1.q
          ql/src/java/org/apache/hadoop/hive/ql/metadata/MetaDataFormatUtils.java
          ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java

          Phabricator added a comment -

          heyongqiang has accepted the revision "HIVE-1003 [jira] optimize metadata only queries".

          tests passed

          REVISION DETAIL
          https://reviews.facebook.net/D105

          Phabricator added a comment -

          njain has committed the revision "HIVE-1003 [jira] optimize metadata only queries".

          REVISION DETAIL
          https://reviews.facebook.net/D105

          COMMIT
          https://reviews.facebook.net/rHIVE1211767

          Hudson added a comment -

          Integrated in Hive-trunk-h0.23.0 #7 (See https://builds.apache.org/job/Hive-trunk-h0.23.0/7/)
          HIVE-1003 [jira] optimize metadata only queries
          (Namit Jain via Yongqiang He)

          Summary:
          testing

          Queries like:

          select max(ds) from T

          where ds is a partitioning column should be optimized.

          Test Plan: EMPTY

          Reviewers: JIRA, heyongqiang

          Reviewed By: heyongqiang

          CC: njain, heyongqiang

          Differential Revision: 105

          heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1211767
          Files :

          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/MetaDataFormatUtils.java
          • /hive/trunk/ql/src/test/queries/clientpositive/partition_schema1.q
          • /hive/trunk/ql/src/test/results/clientpositive/alter_table_serde.q.out
          • /hive/trunk/ql/src/test/results/clientpositive/exim_04_evolved_parts.q.out
          • /hive/trunk/ql/src/test/results/clientpositive/partition_schema1.q.out

          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #1131 (See https://builds.apache.org/job/Hive-trunk-h0.21/1131/)
          HIVE-1003 [jira] optimize metadata only queries
          (Namit Jain via Yongqiang He)

          Summary:
          testing

          Queries like:

          select max(ds) from T

          where ds is a partitioning column should be optimized.

          Test Plan: EMPTY

          Reviewers: JIRA, heyongqiang

          Reviewed By: heyongqiang

          CC: njain, heyongqiang

          Differential Revision: 105

          heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1211767
          Files :

          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/MetaDataFormatUtils.java
          • /hive/trunk/ql/src/test/queries/clientpositive/partition_schema1.q
          • /hive/trunk/ql/src/test/results/clientpositive/alter_table_serde.q.out
          • /hive/trunk/ql/src/test/results/clientpositive/exim_04_evolved_parts.q.out
          • /hive/trunk/ql/src/test/results/clientpositive/partition_schema1.q.out

            People

            • Assignee:
              Marcin Kurczych
              Reporter:
              Namit Jain
            • Votes:
              0
              Watchers:
              5
