Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Tags:
      Hive, TPC-H, SQL, benchmark

      Description

      The goal is to run all TPC-H (http://www.tpc.org/tpch/) benchmark queries on Hive, for two reasons. First, through those queries, we would like to identify the new features that Hive needs in order to support common SQL queries. Second, we would like to measure Hive's performance to find out where Hive falls short. We can then improve Hive based on that information.

      For queries that Hive does not currently support, I will try to rewrite them as one or more Hive-supported queries.

      1. TPC-H_on_Hive_2009-08-11.pdf
        609 kB
        Yuntao Jia
      2. TPC-H_on_Hive_2009-08-11.tar.gz
        13 kB
        Yuntao Jia
      3. TPC-H_on_Hive_2009-08-14.tar.gz
        13 kB
        Yuntao Jia

        Issue Links

          Activity

          Yuntao Jia added a comment -

          Attached the report of running the TPC-H benchmark on Hive, together with the package needed to reproduce the benchmark.

          Please note that, due to limited time, we only considered the twenty-two queries in the TPC-H benchmark, not the two refresh functions. Those will be part of future work.

          Basically, Hive supports all the TPC-H queries, though some of them required rewriting. We also set up Hive on an eleven-node cluster and ran the benchmark. In this particular configuration, the Price/Performance metric of the Hive system is 84.34.

          Please see the official TPC-H benchmark specification for the details of the benchmark and metric.

          Aaron Kimball added a comment -

          Interesting results. Can you please post the hadoop-site.xml file used to run the test? I'm curious what Hadoop performance-tuning settings you used.

          Yuntao Jia added a comment -

          The hadoop-site.xml is included in the attached package TPC-H_on_Hive_2009-08-11.tar.gz; you can download it and check it out. Since it has more than 300 lines, I'd better not post it here.

          Aaron Kimball added a comment -

          Yuntao,

          Thanks. I took a look through this file and have some questions:

          1) mapred.reduce.tasks isn't set in hadoop-site.xml, nor do any of the scripts explicitly set it. This means it's left at the default value of '1'. A single reducer is necessary for anything with an ORDER BY clause, but it slows down everything else (you could set this to 40 on your cluster for any query that doesn't need total ordering). Could some of these queries get refactored to make use of multiple reducers in the middle?

          2) Your writeup says that you've got 4 HDDs per machine, but dfs.data.dir and mapred.local.dir both reference just a single path each. Are you doing something unusual in your filesystem to spread this across all 4 disks? Or could three of them be going unused?

          Thank you

          • Aaron
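For any query without an ORDER BY, the override Aaron describes can be applied per session from the Hive CLI (a sketch; the value 40 is his example figure for this particular cluster, not a general recommendation):

```sql
-- Per-session override of the reducer count; 40 is Aaron's suggested
-- figure for this 11-node cluster, not a universal setting.
set mapred.reduce.tasks=40;
```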
          Yuntao Jia added a comment -

          To the 1st question: the reducer number is set by Hive. In particular, hive-default.xml contains this property:

          <property>
          <name>mapred.reduce.tasks</name>
          <value>-1</value>
          <description>The default number of reduce tasks per job. Typically set
          to a prime close to the number of available hosts. Ignored when
          mapred.job.tracker is "local". Hadoop set this to 1 by default, whereas hive uses -1 as its default value.
          By setting this property to -1, Hive will automatically figure out what should be the number of reducers.
          </description>
          </property>

          To the 2nd question: in the actual Hadoop configuration, we did use four paths. However, for security reasons, we anonymized the configuration file and left a single path instead.

          Hope that answers your questions.
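For readers reproducing this setup: a multi-disk layout is normally expressed as comma-separated paths in hadoop-site.xml. The /diskN paths below are illustrative placeholders, not the ones from the anonymized file:

```xml
<!-- Spread HDFS block storage and MapReduce spill files across
     four disks; the /diskN paths are hypothetical placeholders. -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/dfs/data,/disk2/dfs/data,/disk3/dfs/data,/disk4/dfs/data</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local,/disk4/mapred/local</value>
</property>
```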

          Aaron Kimball added a comment -

          Sounds good to me

          Yuntao Jia added a comment -

          Updated the package with the latest changes to the queries. In particular, those changes optimize Hive's performance in the TPC-H benchmark.

          Kamil Bajda-Pawlikowski added a comment -

          Hi Yuntao,

          I have attempted to run TPC-H on Hive. Thanks for the really well-prepared scripts!

          During the first query, I realized that things were not going well. It seems that Aaron's concern about the number of reducers was a valid one.
          However, the problem is that Hive schedules too many reducers! By default, Hive tries to determine the number of reduce tasks automatically from the "hive.exec.reducers.bytes.per.reducer" property (the default is one reduce task per 1GB of input data). When the data is huge, this is inefficient. The number needs to be capped!

          For example, in my case there are 50GB of data per node but only 2 reduce task slots, so I'm getting 25 reduce task waves. Q1 ran for 1h49min. In contrast, when I set the "hive.exec.reducers.max" property to the number of reduce slots in my Hadoop installation, the query running time dropped to about 23min. Of note, the default value for "hive.exec.reducers.max" is 999.

          The above issue was not too bad for the data size you used: a TPC-H dataset with SF=100 translates into at most 100 reducers per job, and with 40 reduce slots in total, each job had at most 2.5 reduce task waves. Still, your numbers could be somewhat better by capping "hive.exec.reducers.max" at 40, per Tom White's tip #9 from http://www.cloudera.com/blog/2009/05/10-mapreduce-tips.

          Could you please confirm whether my understanding is correct?

          Thank you,
          Kamil
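Kamil's wave arithmetic can be checked with a quick calculation (all figures are taken from the comment above; the node count drops out because both the data and the slots are counted per node):

```python
GB = 1024 ** 3

# Hive's default: one reduce task per 1 GB of input
# (hive.exec.reducers.bytes.per.reducer)
bytes_per_reducer = 1 * GB

# Kamil's cluster: 50 GB of input and 2 reduce slots per node
reducers_per_node = (50 * GB) // bytes_per_reducer  # 50 reducers' worth of data
waves_kamil = reducers_per_node / 2                 # 2 slots -> 25 waves

# The original benchmark: SF=100 -> at most ~100 reducers per job,
# with 40 reduce slots in total
waves_benchmark = 100 / 40                          # 2.5 waves

print(waves_kamil, waves_benchmark)
```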

          Joseph Salisbury added a comment -

          Hello Yuntao,

          Is this project still active? I would be very interested in participating.

          @Kamil, were you able to resolve your issue with the number of reducers?

          Thanks,

          Joe

          Jie Li added a comment -

          Hi all, we ran the TPC-H benchmark on Pig as well and compared it with Hive. Overall Hive is very efficient, but we found that some of Hive's query plans are suboptimal, especially in the order of joins; e.g. it's better to do the small joins first. That's probably why some of Hive's queries were either very slow or failed (e.g. Q9 failed in our comparison, and was extremely slow in Hadapt's comparison).

          Our results are available at https://issues.apache.org/jira/browse/PIG-2397. Hope they're helpful to Hive as well.
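To illustrate the join-order point, here is a hypothetical sketch against the TPC-H schema (not one of the actual benchmark rewrites). Hive of this era executed joins in the order written, left to right, so listing the small tables first was effectively a manual join-order optimization:

```sql
-- Hypothetical illustration: join the tiny nation table (25 rows) to
-- supplier first, so the huge lineitem table is joined last, against an
-- already-reduced input.
SELECT s.s_name, SUM(l.l_extendedprice) AS revenue
FROM nation n
JOIN supplier s ON s.s_nationkey = n.n_nationkey
JOIN lineitem l ON l.l_suppkey = s.s_suppkey
WHERE n.n_name = 'GERMANY'
GROUP BY s.s_name;
```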

          alex gemini added a comment -

          I guess Q9 failed because Hive sets mapred.min.split.size=536870912 while the default JVM heap for a launched task is 200M, so the task runs out of memory. Removing this setting, or giving the task JVMs a larger heap, should help.
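If that diagnosis is right, either fix would look something like the following in hadoop-site.xml (the values are illustrative, not taken from the benchmark package):

```xml
<!-- Option 1: drop the large minimum split size back to the default. -->
<property>
  <name>mapred.min.split.size</name>
  <value>0</value>
</property>
<!-- Option 2: give task JVMs enough heap to handle 512 MB splits. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```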

          xuanjinlee added a comment -

          Hi all:
          Which database should I choose to generate the data? SQL Server, DB2, or Oracle? Or should I change the default settings in the config?
          The README of HIVE-600 does not say which database I should use.


            People

            • Assignee:
              Yuntao Jia
              Reporter:
              Yuntao Jia
            • Votes:
              0
              Watchers:
              21

              Dates

              • Created:
                Updated:

                Development