Hive / HIVE-22903

Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause


Details

    Description

      The vectorized row_number() implementation resets the row number after each batch when a constant expression is passed in the partition clause.

      Repro Query

      select row_number() over(partition by 1) r1, t from over10k_n8;
      
      Or
      
      select row_number() over() r1, t from over10k_n8;
      

      Here, table over10k_n8 contains more than 1024 records, i.e. more rows than fit in a single vectorized row batch of the default size.

      This happens because VectorPTFOperator currently resets the evaluators whenever only a partition clause is present:

          // If we are only processing a PARTITION BY, reset our evaluators.
          if (!isPartitionOrderBy) {
            groupBatches.resetEvaluators();
          }
      

      To resolve this, we should also check whether the entire partition clause is a constant expression; if it is, we should not call groupBatches.resetEvaluators().
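
The effect of the reset, and of the proposed check, can be illustrated with a small stand-alone model. This is only a sketch under assumptions, not Hive code: RowNumberResetSketch, RowNumberEvaluator, numberRows and the partitionIsConstant / skipResetForConstantPartition flags are hypothetical names introduced here, and the exact point at which VectorPTFOperator performs the reset is simplified to "end of each batch". A batch size of 2 stands in for the 1024-row batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the behaviour described in this issue.
// None of these types are Hive classes; only the flag name isPartitionOrderBy
// echoes the snippet quoted above.
class RowNumberResetSketch {

    /** Minimal stand-in for a row_number() evaluator: counts rows since the last reset. */
    static final class RowNumberEvaluator {
        private int next = 1;

        int evaluate() {
            return next++;
        }

        void reset() {
            next = 1;
        }
    }

    /**
     * Assigns row numbers to totalRows rows, processed in batches of batchSize.
     * When skipResetForConstantPartition is false, this mimics the current
     * behaviour (reset whenever there is no ORDER BY); when true, it mimics the
     * proposed check that keeps evaluator state for a constant partition clause.
     */
    static List<Integer> numberRows(int totalRows, int batchSize,
                                    boolean isPartitionOrderBy,
                                    boolean partitionIsConstant,
                                    boolean skipResetForConstantPartition) {
        RowNumberEvaluator evaluator = new RowNumberEvaluator();
        List<Integer> out = new ArrayList<>(totalRows);
        for (int start = 0; start < totalRows; start += batchSize) {
            int end = Math.min(start + batchSize, totalRows);
            for (int i = start; i < end; i++) {
                out.add(evaluator.evaluate());
            }
            // End of batch: assumed location of the reset in this simplified model.
            boolean keepState = skipResetForConstantPartition && partitionIsConstant;
            if (!isPartitionOrderBy && !keepState) {
                evaluator.reset();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "partition by 1" (or an empty over()): constant partition, no ORDER BY.
        List<Integer> current = numberRows(5, 2, false, true, false);
        List<Integer> proposed = numberRows(5, 2, false, true, true);
        System.out.println("current:  " + current);  // [1, 2, 1, 2, 1]
        System.out.println("proposed: " + proposed); // [1, 2, 3, 4, 5]
    }
}
```

With the reset applied unconditionally, numbering restarts at every batch boundary; with the constant-partition check, all rows fall into one partition and are numbered continuously, which is what the repro queries expect.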

      Attachments

        1. HIVE-22903.04.patch
          51 kB
          Shubham Chaurasia
        2. HIVE-22903.03.patch
          53 kB
          Shubham Chaurasia
        3. HIVE-22903.02.patch
          53 kB
          Shubham Chaurasia
        4. HIVE-22903.patch
          47 kB
          Shubham Chaurasia
        5. HIVE-22903.01.patch
          47 kB
          Shubham Chaurasia

            People

              ShubhamChaurasia Shubham Chaurasia

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 0.5h