Apache Airflow / AIRFLOW-6086

SparkSubmitOperator - Unable to override spark_binary

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.10.6
    • Fix Version/s: None
    • Component/s: contrib, core
    • Labels:
      None

      Description

      Hello,

      I have a connection "spark2_default" :

      Conn Id: 'spark2_default'
      Conn Type: 'spark2'
      Host: 'yarn-cluster'
      Port: None
      Is Encrypted: False
      Is Extra Encrypted: False
      Extra: {"master":"yarn-cluster","deploy-mode":"cluster","spark-binary":"spark2-submit"}

      The Extra field contains a 'spark-binary' key, which Airflow 1.10.2 used to choose the spark-submit binary. In version 1.10.6 this setting is ignored.

      I think the cause is that the __init__ function of the SparkSubmitOperator class gives the spark_binary parameter a default value of "spark-submit":

       spark_binary="spark-submit",
      

      Therefore, when the SparkSubmitHook class checks whether spark_binary is empty, it never is, so the 'spark-binary' value from the connection's Extra is never used:

      conn_data['spark_binary'] = self._spark_binary or  \
                      extra.get('spark-binary', "spark-submit")
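
The behaviour described above can be reproduced without Airflow. The sketch below is a simplified stand-in, not Airflow's actual code: `resolve_binary` is a hypothetical helper that copies only the `or` expression quoted from the hook, and `EXTRA` is the Extra field from the connection in this report. Using `None` as the operator default is an assumption about one possible fix, not a confirmed patch.

```python
import json

# Extra field of the 'spark2_default' connection, as given in this report.
EXTRA = json.loads(
    '{"master":"yarn-cluster","deploy-mode":"cluster","spark-binary":"spark2-submit"}'
)


def resolve_binary(spark_binary, extra):
    """Hypothetical helper mirroring the hook expression quoted above."""
    return spark_binary or extra.get("spark-binary", "spark-submit")


# Reported behaviour: the operator always passes its default "spark-submit",
# which is truthy, so the right-hand side of `or` is never evaluated.
reported = resolve_binary("spark-submit", EXTRA)

# Assumed fix: if the operator default were None, the connection's
# 'spark-binary' value would be picked up.
with_none_default = resolve_binary(None, EXTRA)

# An explicit value passed to the operator would still take precedence.
explicit = resolve_binary("custom-submit", EXTRA)

# With neither an operator value nor an Extra key, the fallback applies.
fallback = resolve_binary(None, {})

print(reported, with_none_default, explicit, fallback)
```

With a `None` default, the operator value, the connection Extra, and the built-in fallback are consulted in that order, which appears to match the behaviour the reporter saw in 1.10.2.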
      

              People

              • Assignee:
                Unassigned
                Reporter:
                flof076 Florian FERREIRA
              • Votes:
                0
                Watchers:
                4
