  1. Hive
  2. HIVE-9134 Uber JIRA to track HOS performance work
  3. HIVE-9124

Performance of query 28 from tpc-ds [Spark Branch]

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: spark-branch
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None

    Description

      As you can see from the attached screenshot, one stage was submitted at 2014/12/16 12:06:30 and took 6 minutes (ending around 12:12). However, the next stage was not submitted until 2014/12/16 12:18:42. We should understand:

      • What is going on in the meantime
      • Why it is taking so long

      select  *
      from (select avg(ss_list_price) B1_LP
                  ,count(ss_list_price) B1_CNT
                  ,count(distinct ss_list_price) B1_CNTD
            from store_sales
            where ss_quantity between 0 and 5
              and (ss_list_price between 11 and 11+10 
                   or ss_coupon_amt between 460 and 460+1000
                   or ss_wholesale_cost between 14 and 14+20)) B1,
           (select avg(ss_list_price) B2_LP
                  ,count(ss_list_price) B2_CNT
                  ,count(distinct ss_list_price) B2_CNTD
            from store_sales
            where ss_quantity between 6 and 10
              and (ss_list_price between 91 and 91+10
                or ss_coupon_amt between 1430 and 1430+1000
                or ss_wholesale_cost between 32 and 32+20)) B2,
           (select avg(ss_list_price) B3_LP
                  ,count(ss_list_price) B3_CNT
                  ,count(distinct ss_list_price) B3_CNTD
            from store_sales
            where ss_quantity between 11 and 15
              and (ss_list_price between 66 and 66+10
                or ss_coupon_amt between 920 and 920+1000
                or ss_wholesale_cost between 4 and 4+20)) B3,
           (select avg(ss_list_price) B4_LP
                  ,count(ss_list_price) B4_CNT
                  ,count(distinct ss_list_price) B4_CNTD
            from store_sales
            where ss_quantity between 16 and 20
              and (ss_list_price between 142 and 142+10
                or ss_coupon_amt between 3054 and 3054+1000
                or ss_wholesale_cost between 80 and 80+20)) B4,
           (select avg(ss_list_price) B5_LP
                  ,count(ss_list_price) B5_CNT
                  ,count(distinct ss_list_price) B5_CNTD
            from store_sales
            where ss_quantity between 21 and 25
              and (ss_list_price between 135 and 135+10
                or ss_coupon_amt between 14180 and 14180+1000
                or ss_wholesale_cost between 38 and 38+20)) B5,
           (select avg(ss_list_price) B6_LP
                  ,count(ss_list_price) B6_CNT
                  ,count(distinct ss_list_price) B6_CNTD
            from store_sales
            where ss_quantity between 26 and 30
              and (ss_list_price between 28 and 28+10
                or ss_coupon_amt between 2513 and 2513+1000
                or ss_wholesale_cost between 42 and 42+20)) B6
      limit 100
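
To put a number on the delay described above, the gap between the end of the first stage and the submission of the next one can be worked out from the two timestamps in the description. This is only a rough sketch: it assumes the first stage ran for exactly 6 minutes (the report says it ended "around 12:12"), and the variable names are illustrative.

```python
from datetime import datetime, timedelta

FMT = "%Y/%m/%d %H:%M:%S"

# Submission times quoted in the description.
first_submitted = datetime.strptime("2014/12/16 12:06:30", FMT)
next_submitted = datetime.strptime("2014/12/16 12:18:42", FMT)

# Assumption: "took 6 minutes" is treated as exactly 6 minutes.
first_duration = timedelta(minutes=6)
first_finished = first_submitted + first_duration

# Time during which no stage appears to be running.
idle_gap = next_submitted - first_finished

print(f"first stage finished at about {first_finished:%H:%M:%S}")
print(f"gap before next stage: about {idle_gap}")
```

Under that assumption the gap comes out at a little over 6 minutes, roughly as long as the first stage itself.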
      

      Attachments

        1. Screen Shot 2014-12-16 at 9.30.41 AM.png
          120 kB
          Brock Noland
        2. query28-explain.txt
          21 kB
          Brock Noland

            People

              Assignee: Unassigned
              Reporter: Brock Noland
              Votes: 0
              Watchers: 2