Zeppelin / ZEPPELIN-1347 Release 0.6.2 / ZEPPELIN-1411

UDF with pyspark not working - object has no attribute 'parseDataType'

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.1
    • Fix Version/s: 0.6.2, 0.7.0
    • Component/s: Interpreters
    • Labels: None

      Description

      The following PySpark UDF example fails in Zeppelin 0.6.1.

      from pyspark.sql.types import StringType
      from pyspark.sql.functions import udf
      
      maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", StringType())  # the error is raised here
      
      df = sqlContext.createDataFrame([{'name': 'Alice', 'age': 1}])
      df.withColumn("maturity", maturity_udf(df.age))
      

      The error is raised by this line:

      maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", StringType())
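
The traceback in this report ends inside `_create_judf`, that is, while the UDF object is being constructed and before the lambda is ever invoked. The lambda's own logic can therefore be checked without Spark. A minimal sketch, where `maturity` and `labels` are names introduced here for illustration and are not part of the report:

```python
def maturity(age):
    """Plain-Python equivalent of the lambda passed to udf() in the report."""
    return "adult" if age >= 18 else "child"


# Exercise the boundary at 18 without involving Spark or py4j.
labels = [maturity(age) for age in (1, 17, 18, 40)]
```

If this behaves as expected, the failure is more likely in how the UDF is registered with the JVM than in the function body.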
      

      I tried several other UDF examples, and they all produce the same stack trace.

      Stack trace:

      Traceback (most recent call last):
        File "/tmp/zeppelin_pyspark-64075962331083004.py", line 266, in <module>
          raise Exception(traceback.format_exc())
      Exception: Traceback (most recent call last):
        File "/tmp/zeppelin_pyspark-64075962331083004.py", line 259, in <module>
          exec(code)
        File "<stdin>", line 3, in <module>
        File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1789, in udf
          return UserDefinedFunction(f, returnType)
        File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1751, in __init__
          self._judf = self._create_judf(name)
        File "/home/sjames/zeppelin/zeppelin-0.6.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/sql/functions.py", line 1758, in _create_judf
          jdt = ctx._ssql_ctx.parseDataType(self.returnType.json())
      AttributeError: 'JavaMember' object has no attribute 'parseDataType'
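
The last frame shows that `ctx._ssql_ctx` was a py4j `JavaMember` (a handle to a not-yet-called Java member) rather than a Java object. In py4j, attribute access on a Java object proxy returns such a handle and resolves it only when it is called, so a handle that ends up stored where an object is expected fails in this way. The sketch below mimics that behavior in plain Python. `FakeJavaObject`, `FakeJavaMember`, and `describe_failure` are hypothetical stand-ins, not py4j classes, and the sketch illustrates the error mechanism only; it does not claim to identify the root cause in Zeppelin.

```python
class FakeJavaMember(object):
    """Hypothetical stand-in for a py4j member handle (not the real class)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, *args):
        # The real handle would invoke the Java method here.
        return "called %s" % self.name


class FakeJavaObject(object):
    """Hypothetical stand-in for a py4j Java object proxy."""

    def __getattr__(self, name):
        # Any unknown attribute becomes a member handle; nothing is
        # validated until the handle is actually called.
        return FakeJavaMember(name)


def describe_failure():
    java_obj = FakeJavaObject()
    handle = java_obj.sqlContext  # no "()", so this is a handle, not an object
    try:
        handle.parseDataType("{}")
    except AttributeError as exc:
        return str(exc)
    return None


message = describe_failure()
```

Here `message` carries the same "has no attribute 'parseDataType'" wording as the reported error, with the stand-in class name in place of `JavaMember`.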
      

      A similar error is also reported at https://forums.aws.amazon.com/thread.jspa?messageID=739815&tstart=0

              People

              • Assignee: Jeff Zhang (zjffdu)
              • Reporter: Sojan James (sojan.james)
              • Votes: 1
              • Watchers: 6