Apache Hudi / HUDI-260

Hudi Spark Bundle does not work when passed in extraClassPath option

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Integration
    • Labels: None

    Description

      On EMR's side we have the same findings. a + b + c + d work in the following cases:

      • The bundle jar (with databricks-avro shaded) is specified using the --jars or spark.jars option
      • The bundle jar (with databricks-avro shaded) is placed in the Spark home jars folder, i.e. /usr/lib/spark/jars

      However, it does not work if the jar is specified using the spark.driver.extraClassPath and spark.executor.extraClassPath options, which is what EMR uses to configure external dependencies. We could drop the jar in the /usr/lib/spark/jars folder, but I am not sure that is recommended, because that folder is supposed to contain the jars that ship with Spark. Extra dependencies supplied by users would be better specified through the extraClassPath options.
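
      To make the two setups easier to compare, here is a minimal sketch that builds the spark-submit argument lists for each. The jar path, the application script name, and the helper function names are hypothetical placeholders, not taken from the issue; only the Spark property names come from the description.

```python
# Sketch only: builds spark-submit argument lists, it does not launch Spark.
# BUNDLE_JAR and the application script name are hypothetical placeholders.

BUNDLE_JAR = "/path/to/hudi-spark-bundle.jar"  # assumed location of the bundle jar


def submit_args_with_jars(app, jar=BUNDLE_JAR):
    """Setup reported to work: the bundle jar is passed through spark.jars."""
    return ["spark-submit", "--conf", f"spark.jars={jar}", app]


def submit_args_with_extra_classpath(app, jar=BUNDLE_JAR):
    """Setup reported NOT to work: the bundle jar is put on extraClassPath."""
    return [
        "spark-submit",
        "--conf", f"spark.driver.extraClassPath={jar}",
        "--conf", f"spark.executor.extraClassPath={jar}",
        app,
    ]


if __name__ == "__main__":
    # Print both command lines so they can be compared side by side.
    print(" ".join(submit_args_with_jars("my_job.py")))
    print(" ".join(submit_args_with_extra_classpath("my_job.py")))
```

      The first command line corresponds to the working case listed above (the bundle jar given via spark.jars); the second corresponds to the extraClassPath configuration that is reported to fail.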

            People

              uditme Udit Mehrotra
              vinoth Vinoth Chandar
