Livy / LIVY-533

Spark jobs submitted via programmatic API cannot always be canceled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: RSC

      Description

      Running stages of Spark jobs submitted via Livy's programmatic API cannot always be cancelled successfully.

      The current implementation of JobWrapper.cancel() only interrupts the worker thread on the Spark driver (via Future.cancel(true)):

      https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84

      This does not always cancel all activity in Spark; for example, long-running stages may keep running unaffected.
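
      To illustrate the failure mode without a Spark cluster, here is a minimal, self-contained Java sketch. It is not Livy code: the class name, the two thread pools, and the latches are all hypothetical stand-ins (one pool plays the driver's worker thread, the other plays the executors running a stage). It shows that Future.cancel(true) only interrupts the thread waiting on the work, while the work itself, running elsewhere, still completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; it only simulates the driver/executor split.
class InterruptOnlyCancelDemo {

    /** Returns true if the job was cancelled but the simulated stage still ran to completion. */
    static boolean stageSurvivesCancel() throws Exception {
        ExecutorService driverWorker = Executors.newSingleThreadExecutor(); // plays the driver's worker thread
        ExecutorService executors = Executors.newSingleThreadExecutor();    // plays the cluster running a stage
        CountDownLatch stageStarted = new CountDownLatch(1);
        CountDownLatch stageMayFinish = new CountDownLatch(1);
        AtomicBoolean stageCompleted = new AtomicBoolean(false);
        try {
            Future<?> job = driverWorker.submit(() -> {
                // The "stage" runs on a different thread than the one that gets interrupted.
                Future<?> stage = executors.submit(() -> {
                    stageStarted.countDown();
                    stageMayFinish.await();
                    stageCompleted.set(true);
                    return null;
                });
                return stage.get(); // the worker thread blocks here, like a blocking Spark action
            });

            stageStarted.await();
            job.cancel(true);           // interrupts only the worker thread
            stageMayFinish.countDown(); // nothing told the stage to stop, so it carries on
            executors.shutdown();
            executors.awaitTermination(5, TimeUnit.SECONDS);

            return job.isCancelled() && stageCompleted.get();
        } finally {
            driverWorker.shutdownNow();
            executors.shutdownNow();
        }
    }
}
```

      In this sketch the job reports itself as cancelled, yet the simulated stage finishes anyway, which mirrors the behaviour described above.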

      The Spark way of cancelling jobs seems to be SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session also uses:

      https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164

      I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to interrupting the worker thread running on the driver:

      https://github.com/apache/incubator-livy/pull/128
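
      For readers who do not want to open the PR, the general idea can be sketched roughly as follows. This is not the code from the PR and not Livy's actual JobWrapper: JobGroupApi is a hypothetical stand-in interface that mirrors only the two SparkContext methods named above, GroupCancellingJob is a made-up class, and using a random UUID as the group ID as well as the order of the two cancel steps are assumptions.

```java
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical stand-in mirroring SparkContext.setJobGroup()/cancelJobGroup(). */
interface JobGroupApi {
    void setJobGroup(String groupId, String description, boolean interruptOnCancel);
    void cancelJobGroup(String groupId);
}

/** Hypothetical wrapper; a sketch of the approach, not Livy's JobWrapper. */
class GroupCancellingJob<T> {
    private final JobGroupApi spark;
    private final Callable<T> body;
    // Assumption: one unique group ID per submitted job.
    private final String groupId = UUID.randomUUID().toString();
    private volatile Future<T> future;

    GroupCancellingJob(JobGroupApi spark, Callable<T> body) {
        this.spark = spark;
        this.body = body;
    }

    String groupId() {
        return groupId;
    }

    void submit(ExecutorService worker) {
        future = worker.submit(() -> {
            // Spark assigns the job group per thread, so it is set on the
            // worker thread right before the job body starts Spark jobs.
            spark.setJobGroup(groupId, "job " + groupId, true);
            return body.call();
        });
    }

    boolean cancel() {
        // Ask Spark to cancel everything started under this group ...
        spark.cancelJobGroup(groupId);
        // ... and additionally interrupt the worker thread, as before.
        Future<T> f = future;
        return f != null && f.cancel(true);
    }
}
```

      With a real SparkContext, the JobGroupApi calls would simply delegate to the context's methods of the same name.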

       

      It would be great if the fix could make it into the 0.6 release.

       

            People

            • Assignee: Björn Lohrmann (bjoern.lohrmann)
            • Reporter: Björn Lohrmann (bjoern.lohrmann)
            • Votes: 0
            • Watchers: 2
