Hive / HIVE-20362

PTF: Invert sorting for CURRENT_ROW to UNBOUNDED FOLLOWING

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: CBO, PTF-Windowing
    • Labels: None

    Description

      COALESCE(current_page, FIRST_VALUE(current_page,TRUE) OVER ( PARTITION BY user_id ORDER BY timestamp ROWS BETWEEN CURRENT ROW and UNBOUNDED FOLLOWING))
      

      This is a spilling PTF: to generate the current row, it needs rows that it has not seen yet at that point in execution.

      The window can be inverted: if the shuffle sorts in the opposite direction, each result can be generated from a previously seen row.

      COALESCE(current_page, LAST_VALUE(current_page,TRUE) OVER ( PARTITION BY user_id ORDER BY timestamp desc ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW))
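
The equivalence of the two forms can be checked outside Hive. The following is only a sketch in Python for a single user_id partition, not Hive's PTF implementation. The function names and the (timestamp, current_page) row layout are made up for illustration, and timestamps are assumed to be unique within the partition. The first function mimics the CURRENT ROW to UNBOUNDED FOLLOWING frame and has to look ahead; the second mimics the inverted frame and carries only one value of state.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical row layout for one user_id partition: (timestamp, current_page).
Row = Tuple[int, Optional[str]]


def fill_forward_frame(rows: Sequence[Row]) -> List[Row]:
    """Mimic FIRST_VALUE(current_page, TRUE) over
    ORDER BY timestamp ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING.

    Each output row needs rows that come later in the sort order,
    so the whole partition must be available before emitting anything.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    out: List[Row] = []
    for i, (ts, page) in enumerate(ordered):
        # Look ahead (current row included) for the first non-null value.
        first_non_null = next((p for _, p in ordered[i:] if p is not None), None)
        out.append((ts, page if page is not None else first_non_null))
    return out


def fill_inverted_streaming(rows: Sequence[Row]) -> List[Row]:
    """Mimic LAST_VALUE(current_page, TRUE) over
    ORDER BY timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

    Only the most recent non-null value is kept as state, so each row
    can be emitted as soon as it is read. Output is in descending order.
    """
    last_non_null: Optional[str] = None
    out: List[Row] = []
    for ts, page in sorted(rows, key=lambda r: r[0], reverse=True):
        if page is not None:
            last_non_null = page
        out.append((ts, page if page is not None else last_non_null))
    return out


if __name__ == "__main__":
    sample: List[Row] = [(1, None), (2, "home"), (3, None), (5, "cart"), (6, None)]
    forward = fill_forward_frame(sample)
    inverted = sorted(fill_inverted_streaming(sample), key=lambda r: r[0])
    print(forward == inverted)
```

With unique timestamps both functions assign the same value to every row; only the emission order and the amount of buffered state differ.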
      

      This query pattern is bad in general, but it seems to exist as a workaround for HIVE-18145.

          People

            Assignee: Unassigned
            Reporter: Gopal Vijayaraghavan (gopalv)