  1. Spark
  2. SPARK-17172

pyspark HiveContext cannot create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.2
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None
    • Environment:
      spark version: 1.6.2
      python version: 3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22)
      [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]

    Description

      from pyspark.sql import HiveContext
      sqlContext = HiveContext(sc)

      1. Define udf

        from pyspark.sql.functions import udf
        from pyspark.sql.types import StringType  # return type passed to udf() below

        def scoreToCategory(score):
            if score >= 80: return 'A'
            elif score >= 60: return 'B'
            elif score >= 35: return 'C'
            else: return 'D'

        udfScoreToCategory = udf(scoreToCategory, StringType())

      This throws the following exception:

      Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
      : java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
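
The exception names the HiveContext constructor and the Hive metastore client, not the Python function. As a rough sanity check, the function body from the description can be run without Spark at all. The sketch below is a plain-Python copy of it; the sample scores are made up for illustration and are not taken from the attached notebook.

```python
# Plain-Python copy of the UDF body from the description; no Spark needed.
def scoreToCategory(score):
    if score >= 80:
        return 'A'
    elif score >= 60:
        return 'B'
    elif score >= 35:
        return 'C'
    else:
        return 'D'

# Hypothetical boundary values, chosen only to exercise every branch.
samples = [95, 80, 79, 60, 35, 34]
categories = [scoreToCategory(s) for s in samples]
print(list(zip(samples, categories)))
```

If this runs cleanly, the function logic is probably not what triggers the Py4JJavaError, which is consistent with the failure being raised while the Hive context is instantiated.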

      Attachments

        1. hiveUDFBug.html
          249 kB
          Andrew Davidson
        2. hiveUDFBug.ipynb
          107 kB
          Andrew Davidson


            People

              Assignee: Unassigned
              Reporter: Andrew Davidson (aedwip)
              Votes: 0
              Watchers: 1
