  1. Apache Drill
  2. DRILL-4296

Query hangs in CANCELLATION_REQUESTED when cancelled after it starts returning results

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: Execution - Flow
    • Labels: None
    • Environment: commit.id=c9dbfbd
      2 nodes with 32 cores and 32GB of max direct memory for Drill

    Description

      I ran the following queries (the same reproduction as in DRILL-2274):

      set planner.memory.max_query_memory_per_node=8589934592;
      select sub1.uid from `all2274.json` sub1 inner join `all2274.json` sub2 on sub1.uid = sub2.uid order by sub1.uid;
      

      After the query started returning results, I cancelled it from sqlline. This caused the query to hang in the CANCELLATION_REQUESTED state.

      According to the attached jstack output, the root fragment is blocked waiting for an Ack from the client.

      The foreman node (which also runs Zookeeper) runs out of disk space once the query finishes spilling, which seems to contribute to the issue. Once I changed the spill directory to NFS, I no longer saw the issue.
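
      For reference, a minimal sketch of how the spill directory could be moved off the local disk in drill-override.conf. The mount point /mnt/nfs/drill/spill is hypothetical, and the option names assume the drill.exec.sort.external.spill settings used by the external sort in this Drill version; check them against the drill-module.conf shipped with your build.

      drill.exec: {
        sort.external.spill: {
          # hypothetical NFS mount, visible on every drillbit
          directories: [ "/mnt/nfs/drill/spill" ],
          # keep the local file system scheme; the NFS mount appears as a local path
          fs: "file:///"
        }
      }

      The drillbits need a restart after this change, since it is a boot-time option rather than a session option.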

      Attachments

        1. 295eefc3-8d15-d63b-a721-3fde365b639c.sys.drill
          17 kB
          Abdel Hakim Deneche
        2. data.tar.gz
          1.71 MB
          Abdel Hakim Deneche
        3. node1_jstack.txt
          67 kB
          Abdel Hakim Deneche
        4. node2_jstack.txt
          61 kB
          Abdel Hakim Deneche

            People

              Unassigned
              Abdel Hakim Deneche (adeneche)
              Khurram Faraaz
              Votes: 0
              Watchers: 3
