Hive / HIVE-9228

Problem with subquery using windowing functions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 0.13.1, 1.0.0
    • Fix Version/s: 1.0.2
    • Component/s: PTF-Windowing
    • Labels:
      None

      Description

      The following query, which selects from a subquery that uses window functions, fails. The inner query runs correctly on its own.

      select col1, col2, col3
      from (
        select col1, col2, col3,
               count(case when col4=1 then 1 end) over (partition by col1, col2) as col5,
               row_number() over (partition by col1, col2 order by col4) as col6
        from tab1
      ) t;
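
      To make clear what the query is expected to return, the two window expressions can be emulated in plain Python. This is only a sketch of the semantics, not of how Hive executes the query: the sample rows are hypothetical (they are not taken from tab1.csv), and the helper names inner_query and outer_query are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows (col1, col2, col3, col4); NOT the contents of tab1.csv.
rows = [
    ("a", "x", 10, 1),
    ("a", "x", 20, 2),
    ("a", "x", 30, 1),
    ("b", "y", 40, 3),
]

def inner_query(rows):
    """Emulate the subquery: append col5 (conditional count) and col6 (row number)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row[0], row[1])].append(row)  # partition by col1, col2

    out = []
    for part in partitions.values():
        # count(case when col4=1 then 1 end) over (partition by col1, col2)
        col5 = sum(1 for r in part if r[3] == 1)
        # row_number() over (partition by col1, col2 order by col4)
        # Python's sort is stable; in SQL the order among ties is not guaranteed.
        ordered = sorted(part, key=lambda r: r[3])
        for col6, r in enumerate(ordered, start=1):
            out.append(r + (col5, col6))
    return out

def outer_query(rows):
    """Emulate the outer select, which keeps only col1, col2, col3."""
    return [r[:3] for r in inner_query(rows)]
```

      Because the outer query discards col5 and col6, its result should simply be the (col1, col2, col3) values of every row in tab1.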

      Hive generates an execution plan with two jobs:
      1. The first job computes the window function for col5.
      2. The second job computes the window function for col6 and writes the output.

      According to the plan, the first job writes the columns (col1, col2, col3, col4) to a temp file, since only these columns are used in the later stage. However, the PTF operator in the first job actually outputs (_wcol0, col1, col2, col3, col4), where _wcol0 is the result of the window function, even though that column is not used.

      In the second job, the map operator still follows the plan and reads four columns (col1, col2, col3, col4) from the temp file, although each row was written with five. This mismatch causes the exception.
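
      The mismatch described above can be illustrated with a toy model in Python. It does not use Hive's serialization code: write_row and read_row are hypothetical helpers, rows are modeled as plain lists bound to column names by position, and ValueError stands in for the actual exception, whose type and message the report does not give.

```python
# Schema the plan records for the temp file (what job 2 expects to read).
PLANNED_COLUMNS = ["col1", "col2", "col3", "col4"]
# Columns the PTF operator in job 1 actually emits, per the description.
WRITTEN_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]

def write_row(values_by_name):
    """Job 1 (toy): serialize a row positionally in the PTF output order."""
    return [values_by_name[name] for name in WRITTEN_COLUMNS]

def read_row(fields, expected=PLANNED_COLUMNS):
    """Job 2 (toy): bind fields to the planned column names by position."""
    if len(fields) != len(expected):
        # Stand-in for the real failure; the report does not show the exception.
        raise ValueError(
            f"expected {len(expected)} columns {expected}, got {len(fields)}"
        )
    return dict(zip(expected, fields))

# One row as job 1 would emit it, including the unused window result.
fields = write_row(
    {"_wcol0": 2, "col1": "a", "col2": "x", "col3": 10, "col4": 1}
)
```

      Calling read_row(fields) fails because the row carries five values while the plan lists four. Dropping the unused _wcol0 before the write, as in read_row(fields[1:]), makes the two sides agree again in this model; whether the actual patch takes that approach is not stated in the description.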

        Attachments

        1. create_table_tab1.sql
          0.2 kB
          Aihua Xu
        2. HIVE-9228.1.patch.txt
          7 kB
          Navis Ryu
        3. HIVE-9228.2.patch.txt
          18 kB
          Navis Ryu
        4. HIVE-9228.3.patch.txt
          22 kB
          Navis Ryu
        5. tab1.csv
          0.0 kB
          Aihua Xu

              People

              • Assignee:
                navis Navis Ryu
                Reporter:
                aihuaxu Aihua Xu
              • Votes:
                0
                Watchers:
                10

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  96h
                  Remaining:
                  96h
                  Logged:
                  Not Specified