Hive / HIVE-16484

Investigate SparkLauncher for HoS as alternative to bin/spark-submit

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark
    • Labels: None

    Description

      SparkClientImpl#startDriver currently looks for the SPARK_HOME directory and invokes the bin/spark-submit script, which spawns a separate process to run the Spark application.
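
      For illustration, the script-based approach looks roughly like the sketch below, which uses only the JDK. The class name, method names, and arguments are hypothetical and are not taken from SparkClientImpl; the real method assembles a much longer argument list.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a script-based launch; not the actual Hive code.
final class SubmitScriptSketch {

    // Builds the argv for bin/spark-submit under the given Spark home.
    static List<String> buildCommand(String sparkHome, String mainClass,
                                     String appJar, List<String> appArgs) {
        if (sparkHome == null || sparkHome.isEmpty()) {
            throw new IllegalStateException("SPARK_HOME is not set");
        }
        List<String> argv = new ArrayList<>();
        argv.add(new File(new File(sparkHome, "bin"), "spark-submit").getPath());
        argv.add("--class");
        argv.add(mainClass);
        argv.add(appJar);
        argv.addAll(appArgs);
        return argv;
    }

    // Spawns the script as a child process; stderr is merged into stdout
    // so the caller has a single stream to drain.
    static Process start(List<String> argv) throws IOException {
        return new ProcessBuilder(argv).redirectErrorStream(true).start();
    }
}
```

      With this approach the caller only gets a java.lang.Process back, so it has to read the script's output and exit code to learn what happened to the application.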

      SparkLauncher was added in SPARK-4924 and is a programmatic way to launch Spark applications.

      I see a few advantages:

      • No need to spawn a separate process to launch a HoS application --> lower startup time
      • Simplifies the code in SparkClientImpl --> easier to debug
      • SparkLauncher#startApplication returns a SparkAppHandle which contains some useful utilities for querying the state of the Spark job
        • It also allows the launcher to specify a list of job listeners
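
      A minimal sketch of the SparkLauncher route, assuming the spark-launcher artifact is on the classpath. The class name, the method name, and the master and deploy-mode values are illustrative assumptions, not what the attached patches do. One thing worth checking when measuring startup time: as far as I know, SparkLauncher#startApplication itself still runs spark-submit as a child process under the hood.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Hypothetical sketch; not the code from the HIVE-16484 patches.
final class LauncherSketch {

    // Launches the application and blocks until it reaches a final state.
    static SparkAppHandle.State runToCompletion(String sparkHome, String appJar,
                                                String mainClass)
            throws IOException, InterruptedException {
        CountDownLatch done = new CountDownLatch(1);

        SparkAppHandle handle = new SparkLauncher()
            .setSparkHome(sparkHome)
            .setAppResource(appJar)
            .setMainClass(mainClass)
            .setMaster("yarn")          // assumption: YARN cluster
            .setDeployMode("cluster")   // assumption: cluster deploy mode
            .startApplication(new SparkAppHandle.Listener() {
                @Override
                public void stateChanged(SparkAppHandle h) {
                    System.out.println("state=" + h.getState()
                        + " appId=" + h.getAppId());
                    if (h.getState().isFinal()) {
                        done.countDown();
                    }
                }

                @Override
                public void infoChanged(SparkAppHandle h) {
                    // Invoked when application info other than the state changes.
                }
            });

        done.await();
        return handle.getState();
    }
}
```

      The returned SparkAppHandle also exposes stop() and kill(), which could replace manual handling of the child process.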

      Attachments

        1. HIVE-16484.1.patch
          32 kB
          Sahil Takiar
        2. HIVE-16484.10.patch
          51 kB
          Sahil Takiar
        3. HIVE-16484.2.patch
          32 kB
          Sahil Takiar
        4. HIVE-16484.3.patch
          33 kB
          Sahil Takiar
        5. HIVE-16484.4.patch
          49 kB
          Sahil Takiar
        6. HIVE-16484.5.patch
          49 kB
          Sahil Takiar
        7. HIVE-16484.6.patch
          49 kB
          Sahil Takiar
        8. HIVE-16484.7.patch
          52 kB
          Sahil Takiar
        9. HIVE-16484.8.patch
          48 kB
          Sahil Takiar
        10. HIVE-16484.9.patch
          49 kB
          Sahil Takiar

            People

              Assignee: Sahil Takiar (stakiar)
              Reporter: Sahil Takiar (stakiar)
              Votes: 0
              Watchers: 5
