Apache Drill / DRILL-5478

Spill file size parameter is not honored by the managed external sort


    Description

      git.commit.id.abbrev=1e0a14c

      Query (managed external sort enabled, single fragment):

      ALTER SESSION SET `exec.sort.disable_managed` = false;
      alter session set `planner.width.max_per_node` = 1;
      alter session set `planner.disable_exchanges` = true;
      alter session set `planner.width.max_per_query` = 1;
      alter session set `planner.memory.max_query_memory_per_node` = 1052428800;
      alter session set `planner.enable_decimal_data_type` = true;
      select count(*) from (
        select * from dfs.`/drill/testdata/resource-manager/all_types_large` d1
        order by d1.map.missing
      ) d;
      

      Boot options (the spill file size is set to 256 MB):

      0: jdbc:drill:zk=10.10.100.190:5181> select * from sys.boot where name like '%spill%';
      +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
      |                       name                       |  kind   | type  | status  | num_val  |                     string_val                     | bool_val  | float_val  |
      +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
      | drill.exec.sort.external.spill.directories       | STRING  | BOOT  | BOOT    | null     | ["/tmp/test"]  # drill-override.conf: 26           | null      | null       |
      | drill.exec.sort.external.spill.file_size         | STRING  | BOOT  | BOOT    | null     | "256M"                                             | null      | null       |
      | drill.exec.sort.external.spill.fs                | STRING  | BOOT  | BOOT    | null     | "maprfs:///"                                       | null      | null       |
      | drill.exec.sort.external.spill.group.size        | LONG    | BOOT  | BOOT    | 40000    | null                                               | null      | null       |
      | drill.exec.sort.external.spill.merge_batch_size  | STRING  | BOOT  | BOOT    | null     | "16M"                                              | null      | null       |
      | drill.exec.sort.external.spill.spill_batch_size  | STRING  | BOOT  | BOOT    | null     | "8M"                                               | null      | null       |
      | drill.exec.sort.external.spill.threshold         | LONG    | BOOT  | BOOT    | 40000    | null                                               | null      | null       |
      +--------------------------------------------------+---------+-------+---------+----------+----------------------------------------------------+-----------+------------+
      

      The spill files below were listed while the query was still executing. Each completed spill file is 34,957,815 bytes (about 35 MB), far smaller than the configured 256 MB.

      -rwxr-xr-x   3 root root   34957815 2017-05-05 11:26 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run1
      -rwxr-xr-x   3 root root   34957815 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run2
      -rwxr-xr-x   3 root root          0 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run3
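
      To make the gap concrete, the following is a minimal Python sketch that compares the observed spill file sizes with the configured file_size. It is not part of Drill; the helper names parse_size and spill_sizes are hypothetical, and it assumes that the "M" suffix in "256M" denotes a power-of-two unit (256 * 1024 * 1024 bytes).

```python
import re

# Assumption: a bare K/M/G suffix is a power-of-two unit.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as "256M" into a byte count."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def spill_sizes(listing):
    """Extract (run name, size in bytes) pairs from ls-style lines."""
    result = []
    for line in listing.strip().splitlines():
        # Fields: perms, replication, owner, group, size, date, time, path
        fields = line.split()
        run_name = fields[7].rsplit("/", 1)[-1]
        result.append((run_name, int(fields[4])))
    return result


# Listing copied from the report above.
LISTING = """
-rwxr-xr-x   3 root root   34957815 2017-05-05 11:26 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run1
-rwxr-xr-x   3 root root   34957815 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run2
-rwxr-xr-x   3 root root          0 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run3
"""

configured = parse_size("256M")  # value of drill.exec.sort.external.spill.file_size
for name, size in spill_sizes(LISTING):
    print(f"{name}: {size} bytes, {size / configured:.1%} of the configured size")
```

      Under that assumption, each completed run is well under a fifth of the configured spill file size.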
      

      The data set is too large to attach here. Reach out to me if you need anything.

            People

              paul-rogers Paul Rogers
              rkins Rahul Kumar Challapalli