Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.8.0
    • Fix Version/s: 1.11.0
    • Component/s: None
    • Labels: None

      Description

      The ExternalSortBatch (ESB) operator sorts records, spilling to disk as needed to limit memory use. The size of each spill file is hard to control: it is half the total size of the accumulated batches, and that total is determined by either the memory budget or the drill.exec.sort.external.group.size parameter. (Even when the parameter is set, the actual file size is still half of the accumulated batches.)

      The proposed solution is to provide an explicit parameter that sets the maximum spill file size: drill.exec.sort.external.spill.size. If the ESB needs to spill more than this amount of data, it should split the spill across multiple files.

      The spill.size value should be expressed in bytes (or MB). (A limit expressed in records would make the file size data-dependent, which would not be helpful.)
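
The splitting behavior proposed above could be sketched roughly as follows. This is an illustration only, not Drill's actual implementation: the class `SpillPlanner`, the method `planSpillFiles`, and the idea of treating each batch as an indivisible unit with a known byte size are all assumptions made for the example. It groups batches, in order, into spill files so that each file stays at or under the configured byte limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Drill: plans how batches are grouped into spill files.
final class SpillPlanner {

    /**
     * Groups batch sizes (in bytes) into spill files of at most maxSpillBytes each.
     * Assumption: a batch is never split, so a single batch larger than the limit
     * still gets a file of its own, and that one file exceeds the limit.
     */
    static List<List<Long>> planSpillFiles(List<Long> batchSizes, long maxSpillBytes) {
        if (maxSpillBytes <= 0) {
            throw new IllegalArgumentException("maxSpillBytes must be positive");
        }
        List<List<Long>> files = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : batchSizes) {
            // Close the current file if adding this batch would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxSpillBytes) {
                files.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            files.add(current);
        }
        return files;
    }

    public static void main(String[] args) {
        // Five batches, spill limit of 100 bytes (tiny numbers for readability).
        List<Long> batches = Arrays.asList(40L, 40L, 40L, 100L, 10L);
        System.out.println(planSpillFiles(batches, 100L));
    }
}
```

Because the limit is expressed in bytes, the number of files depends only on the volume of data spilled, not on how wide the records are, which is the reason given above for preferring bytes over a record count.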

            People

            • Assignee: Paul Rogers (paul-rogers)
            • Reporter: Paul Rogers (paul-rogers)