Apache Drill / DRILL-6879

Indicate a warning in the WebUI when a query makes little to no progress for a while

Details

    Description

      When running a very large query on a cluster with limited resources, we noticed that a VM thread on one of the nodes freezes the fragment threads while it does some work (GC, perhaps?). This is a clear indication that the query is stuck in a state from which it might not recover.
      Under such circumstances, it makes sense to cancel the query, or at least to warn the user on the query's WebUI page, once the time without progress exceeds a certain threshold.
      To detect this, the user can check the Last Progress column in the Fragments Overview section, which will show large times.
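
      The threshold check described above could look roughly like the sketch below. It is illustrative only and is not Drill code: the `FragmentProgress` record, its `last_progress_ms` field, and the 5-minute default threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FragmentProgress:
    """Hypothetical record for one fragment; not a Drill class."""
    fragment_id: str
    last_progress_ms: int  # epoch millis of the last reported progress (assumed field)


def stalled_fragments(fragments: List[FragmentProgress],
                      now_ms: int,
                      threshold_ms: int = 300_000) -> List[str]:
    """Return ids of fragments that have reported no progress for longer than threshold_ms."""
    return [
        f.fragment_id
        for f in fragments
        if now_ms - f.last_progress_ms > threshold_ms
    ]
```

      A page could show a warning whenever this list is non-empty; the right threshold would need to be tuned, or made configurable.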

      In addition, there are instances where a query has buffered operators spilling to disk, which also hurts performance (and, consequently, leads to longer run times). Calling this out can be very useful.

       

      Or there might be cases where a single fragment takes much longer than the average (indicated by an extreme skew in the Gantt chart).
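
      A skew check of this kind could be sketched as follows. This is only an illustration under stated assumptions: the input is a hypothetical mapping of fragment id to run time in milliseconds, and "much longer than the average" is interpreted as exceeding the mean by an arbitrary factor of 3.

```python
from statistics import mean
from typing import Dict, List


def skewed_fragments(runtimes_ms: Dict[str, float], factor: float = 3.0) -> List[str]:
    """Return ids of fragments whose run time exceeds factor times the mean run time.

    runtimes_ms is a hypothetical mapping of fragment id to run time in ms.
    """
    if len(runtimes_ms) < 2:
        # Skew is not meaningful with fewer than two fragments.
        return []
    average = mean(runtimes_ms.values())
    return [fid for fid, t in runtimes_ms.items() if t > factor * average]
```

      Because the outlier itself inflates the mean, a median-based comparison may be a more robust choice when there are only a few fragments.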

       

      Attachments

        1. image-2018-12-04-11-54-54-247.png
          92 kB
          Kunal Khatua
        2. image-2018-12-06-11-19-00-339.png
          47 kB
          Kunal Khatua
        3. image-2018-12-06-11-27-14-719.png
          572 kB
          Kunal Khatua

            People

              kkhatua Kunal Khatua
              kkhatua Kunal Khatua
              Arina Ielchiieva Arina Ielchiieva