  1. Hive
  2. HIVE-22903

Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

    Details

      Description

      The vectorized row_number() implementation resets the row number at every batch boundary when the partition clause is a constant expression or is omitted.

      Repro Query

      select row_number() over(partition by 1) r1, t from over10k_n8;
      
      Or
      
      select row_number() over() r1, t from over10k_n8;
      

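      where the table over10k_n8 contains more than 1024 records, i.e. more rows than fit in one vectorized row batch at the default size (VectorizedRowBatch.DEFAULT_SIZE is 1024).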

      This happens because VectorPTFOperator currently resets the evaluators whenever only a partition clause is present:

          // If we are only processing a PARTITION BY, reset our evaluators.
          if (!isPartitionOrderBy) {
            groupBatches.resetEvaluators();
          }
      

      To resolve this, we should also check whether the entire partition clause is a constant expression; if it is, we should not call groupBatches.resetEvaluators().
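
      The effect of the reset, and of the proposed guard, can be illustrated with a small toy model. This is only a sketch and not Hive code: RowNumberResetSketch, RowNumberEvaluator, numberRows and the skipResetForConstantPartition flag are hypothetical names introduced for illustration, and the model assumes all rows fall into a single partition that is processed in fixed-size batches.

```java
/** Toy model of batch-wise row numbering. Hypothetical names; not Hive code. */
class RowNumberResetSketch {

    /** Stand-in for a row_number evaluator: a counter that can be reset. */
    static final class RowNumberEvaluator {
        private long next = 1;

        long nextRowNumber() {
            return next++;
        }

        void reset() {
            next = 1;
        }
    }

    /**
     * Numbers totalRows rows that all belong to one partition, batchSize rows at a time.
     *
     * @param isPartitionOrderBy            mirrors the flag checked in the snippet above
     * @param skipResetForConstantPartition models the proposed guard: when true, the
     *                                      evaluator is not reset between batches
     */
    static long[] numberRows(int totalRows, int batchSize,
                             boolean isPartitionOrderBy,
                             boolean skipResetForConstantPartition) {
        RowNumberEvaluator evaluator = new RowNumberEvaluator();
        long[] rowNumbers = new long[totalRows];

        for (int start = 0; start < totalRows; start += batchSize) {
            int end = Math.min(start + batchSize, totalRows);
            for (int i = start; i < end; i++) {
                rowNumbers[i] = evaluator.nextRowNumber();
            }
            // End of batch: the unguarded reset is what restarts the numbering.
            if (!isPartitionOrderBy && !skipResetForConstantPartition) {
                evaluator.reset();
            }
        }
        return rowNumbers;
    }

    public static void main(String[] args) {
        long[] unguarded = numberRows(2500, 1024, false, false);
        long[] guarded = numberRows(2500, 1024, false, true);
        // Index 1024 is the first row of the second batch.
        System.out.println("unguarded: " + unguarded[1024]); // prints "unguarded: 1"
        System.out.println("guarded:   " + guarded[1024]);   // prints "guarded:   1025"
    }
}
```

      In this model the unguarded run restarts at 1 on the first row of every new batch, which matches the symptom described above, while the guarded run keeps counting across batches. How the constant-partition check is actually detected and wired into VectorPTFOperator is left to the attached patches.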

        Attachments

        1. HIVE-22903.04.patch
          51 kB
          Shubham Chaurasia
        2. HIVE-22903.03.patch
          53 kB
          Shubham Chaurasia
        3. HIVE-22903.02.patch
          53 kB
          Shubham Chaurasia
        4. HIVE-22903.patch
          47 kB
          Shubham Chaurasia
        5. HIVE-22903.01.patch
          47 kB
          Shubham Chaurasia

              People

              • Assignee: Shubham Chaurasia
              • Reporter: Shubham Chaurasia
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h