  1. Apache Drill
  2. DRILL-5755

TOP_N_SORT operator does not free memory while running


Details

    Description

      The TOP_N_SORT operator should keep the top N rows while processing its input, and free the memory used to hold all rows below the top N.

      For example, the following query uses a table with 125M rows:

      select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand) from dfs.`/data/tmp` group by row_count order by row_count limit 30;
      

      The query failed with an OOM while each of the 3 TOP_N_SORT operators was holding about 2.44 GB (see attached profile). Holding 30 rows should take far less memory.
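
      To illustrate the expected behavior, here is a minimal sketch of a bounded top-N selection that retains at most N rows at any time. It is not Drill's implementation: the function name `top_n_smallest` is hypothetical, rows are modeled as plain integers rather than record batches, and the ascending order is assumed from the `order by row_count limit 30` clause in the query above.

```python
import heapq
from typing import Iterable, List


def top_n_smallest(rows: Iterable[int], n: int) -> List[int]:
    """Return the n smallest values in ascending order, retaining at most n rows."""
    if n <= 0:
        return []
    # Max-heap of the current top n, emulated by storing negated values
    # (heapq is a min-heap). heap[0] is the worst row currently retained.
    heap: List[int] = []
    for value in rows:
        if len(heap) < n:
            heapq.heappush(heap, -value)
        elif value < -heap[0]:
            # The new row beats the worst retained row: swap it in and
            # drop the worst one so its memory can be reclaimed.
            heapq.heapreplace(heap, -value)
        # Otherwise the row is below the top n and is never retained.
    return sorted(-v for v in heap)


if __name__ == "__main__":
    # Streaming many rows still keeps only n of them in the heap.
    print(top_n_smallest(range(1_000_000, 0, -1), 5))
```

      With this approach the retained state is proportional to N (30 in the query above) rather than to the number of input rows, which is the behavior the issue asks for.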

            People

              Timothy Farkas (timothyfarkas)
              Boaz Ben-Zvi (ben-zvi)
              Paul Rogers
              Votes: 0
              Watchers: 4
