  1. Apache Drill
  2. DRILL-3952

Improve Window Functions performance when not all batches are required to process the current batch


Details

    Description

      Currently, the window operator blocks until all batches of the current partition are available. For some queries this is necessary (e.g. an aggregate with no order-by in the window definition), but in other cases the window operator can process the current batch and pass it downstream sooner.

      Implementing this should help the window operator use less memory and run faster, especially in the presence of a limit operator.
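
      As a rough illustration of why this matters under a limit, the sketch below contrasts a blocking and a streaming row_number over a stream of batches. It is plain Python rather than Drill code: `batches`, `blocking_row_number`, `streaming_row_number` and `run_with_limit` are hypothetical names, and a single partition is assumed.

```python
from itertools import islice


def batches(n_batches, rows_per_batch, consumed):
    """Hypothetical upstream: yields batches and records each batch that is pulled."""
    for b in range(n_batches):
        consumed.append(b)
        yield [f"row-{b}-{i}" for i in range(rows_per_batch)]


def blocking_row_number(upstream):
    """Buffers the whole partition before emitting any row."""
    buffered = list(upstream)  # pulls every batch and keeps all of them in memory
    n = 0
    for batch in buffered:
        for row in batch:
            n += 1
            yield (n, row)


def streaming_row_number(upstream):
    """Numbers and emits the rows of each batch as soon as that batch arrives."""
    n = 0
    for batch in upstream:
        for row in batch:
            n += 1
            yield (n, row)


def run_with_limit(window_fn, limit):
    """Applies a limit downstream and reports how many batches were pulled upstream."""
    consumed = []
    rows = list(islice(window_fn(batches(100, 10, consumed)), limit))
    return rows, len(consumed)
```

      With a limit of 5, both variants return the same rows, but the blocking variant pulls all 100 batches first while the streaming variant stops after the first one.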

      The purpose of this JIRA is to improve the window operator in the following cases:

      • aggregate, when an order-by clause is present in the window definition, can process the current batch as soon as it receives the last peer row
      • lead can process the current batch as soon as it receives one more batch
      • lag can process the current batch immediately
      • first_value can process the current batch immediately
      • last_value, when an order-by clause is present in the window definition, can process the current batch as soon as it receives the last peer row
      • row_number, rank and dense_rank can process the current batch immediately
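
      The cases above can be summarized as a readiness check per window function. The following is a minimal sketch in Python, not the Drill implementation: `WindowState`, its fields and `can_process_current_batch` are hypothetical names. Treating last_value without an order-by like the aggregate case (wait for the whole partition) is an assumption; the description does not state it.

```python
from dataclasses import dataclass

IMMEDIATE = {"lag", "first_value", "row_number", "rank", "dense_rank"}
PEER_BOUND = {"aggregate", "last_value"}


@dataclass
class WindowState:
    """Hypothetical summary of what the operator has received so far."""
    has_order_by: bool          # window definition contains an order-by clause
    batches_after_current: int  # batches received after the current batch
    last_peer_row_seen: bool    # last peer row for the current batch has arrived
    partition_complete: bool    # every batch of the partition has arrived


def can_process_current_batch(function: str, state: WindowState) -> bool:
    """Returns True when the current batch can be processed and passed downstream."""
    if function not in IMMEDIATE | PEER_BOUND | {"lead"}:
        raise ValueError(f"unsupported window function: {function}")
    if state.partition_complete:
        return True  # nothing left to wait for
    if function in IMMEDIATE:
        return True
    if function == "lead":
        return state.batches_after_current >= 1
    # aggregate / last_value: only the order-by case can finish early.
    # Assumption: without order-by, wait for the whole partition.
    return state.has_order_by and state.last_peer_row_seen
```

      For example, `can_process_current_batch("lead", WindowState(True, 0, False, False))` is false until one more batch arrives, while the same state already allows lag or row_number to proceed.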


          People

            adeneche Abdel Hakim Deneche
            adeneche Abdel Hakim Deneche
            Dechang Gu Dechang Gu
            Votes: 0
            Watchers: 4
