  1. Apache Drill
  2. DRILL-5755

TOP_N_SORT operator does not free memory while running

    Details

      Description

      The TOP_N_SORT operator should keep the top N rows while processing its input, and free the memory used to hold all rows below the top N.

      For example, the following query uses a table with 125M rows:

      select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand) from dfs.`/data/tmp` group by row_count order by row_count limit 30;
      

      The query failed with an OOM when each of the 3 TOP_N_SORT operators was holding about 2.44 GB (see attached profile). Holding 30 rows should take far less memory.
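      The expected behavior described above (keep only the top N rows, release everything else as the input is consumed) can be sketched with a bounded heap. This is only an illustration and not Drill's TOP_N_SORT code: the class `BoundedTopN` and the method `topN` are hypothetical names, and rows are reduced to plain `long` sort keys. It assumes ascending order, as in the `order by row_count limit 30` query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class BoundedTopN {
    // Hypothetical helper, not Drill code: returns the n smallest keys in ascending order
    // while never retaining more than n keys at any point.
    static List<Long> topN(Iterator<Long> rows, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap: the largest retained key sits at the head, so it is the first to be evicted.
        PriorityQueue<Long> heap = new PriorityQueue<>(n, Collections.reverseOrder());
        while (rows.hasNext()) {
            long row = rows.next();
            if (heap.size() < n) {
                heap.add(row);
            } else if (row < heap.peek()) {
                // The new key beats the current worst: drop the worst so it can be reclaimed.
                heap.poll();
                heap.add(row);
            }
            // Otherwise the key falls below the top n and is not retained at all.
        }
        List<Long> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        List<Long> input = Arrays.asList(42L, 7L, 19L, 3L, 88L, 1L, 56L);
        System.out.println(topN(input.iterator(), 3)); // prints [1, 3, 7]
    }
}
```

      With this shape, retained memory is proportional to N (30 in the query above) rather than to the 125M input rows. In a columnar engine the retained rows may still pin the batches they came from, so an actual fix would presumably also need to copy the survivors out and release the incoming batches; that part is an assumption and is not shown here.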

              People

              • Assignee:
                timothyfarkas Timothy Farkas
                Reporter:
                ben-zvi Boaz Ben-Zvi
                Reviewer:
                Paul Rogers
