SPARK-40820

Creating StructType from Json


Details

    Description

      To create a StructType from a Python dictionary, you use StructType.fromJson; in Scala, you use DataType.fromJson.

      A schema can be built as in the code below, but the JSON must explicitly include nullable and metadata. This is inconsistent, because these values already have defaults within the DataType class.

      schema = {
           "name": "name", "type": "string" 
      }
      
      StructField.fromJson(schema)
      

      Python Error:

      from pyspark.sql.types import StructField
      
      schema = {
           "name": "c1", "type": "string" 
      }
      
      StructField.fromJson(schema)
      
      >>
      Traceback (most recent call last):
      File "code.py", line 90, in runcode
      exec(code, self.locals)
      File "<input>", line 1, in <module>
      File "pyspark/sql/types.py", line 583, in fromJson
      json["nullable"],
      KeyError: 'nullable' 
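
The traceback shows that fromJson indexes json["nullable"] directly, so a field dictionary with only name and type fails. A minimal sketch of a caller-side workaround follows. The helper name with_field_defaults is hypothetical (it is not part of PySpark), and the defaults it fills in (nullable True, empty metadata) are an assumption chosen to mirror the StructField constructor defaults.

```python
# Hypothetical helper, not part of PySpark: it adds the keys that
# StructField.fromJson reads unconditionally.
def with_field_defaults(field_json):
    """Return a copy of a field dict with 'nullable' and 'metadata' present."""
    completed = dict(field_json)            # shallow copy; the input is left untouched
    completed.setdefault("nullable", True)  # assumed default, mirrors StructField(nullable=True)
    completed.setdefault("metadata", {})    # assumed default: empty metadata
    return completed


schema = {"name": "c1", "type": "string"}
print(with_field_defaults(schema))
```

The completed dictionary could then be passed on, for example as StructField.fromJson(with_field_defaults(schema)). Values that are already present, such as an explicit "nullable": False, are kept as they are.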
      

      Scala Error:

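      import org.apache.spark.sql.types.DataType

      // The field omits "nullable" and "metadata", matching the JSON in the error below.
      val schema = """
              |{
              |    "type": "struct",
              |    "fields": [
              |        {
              |            "name": "c1",
              |            "type": "string"
              |        }
              |    ]
              |}
              |""".stripMargin
      
      DataType.fromJson(schema)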
      
      >>
      Failed to convert the JSON string '{"name":"c1","type":"string"}' to a field.
      java.lang.IllegalArgumentException: Failed to convert the JSON string '{"name":"c1","type":"string"}' to a field.
      at org.apache.spark.sql.types.DataType$.parseStructField(DataType.scala:268)
      at org.apache.spark.sql.types.DataType$.$anonfun$parseDataType$1(DataType.scala:225)
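
The same gap applies to a whole struct schema passed as a JSON string, where every entry under "fields" needs the missing keys. The sketch below pre-processes such a string before it is handed to a parser. The function names fill_schema_defaults and fill_field are hypothetical, the defaults (nullable true, empty metadata) are assumptions, and only field-level keys are filled in; array and map nodes are walked solely to reach nested structs.

```python
import json


def fill_field(field):
    """Return a copy of one struct field with 'nullable' and 'metadata' present."""
    field = dict(field)
    field["type"] = fill_schema_defaults(field["type"])  # the type may itself be nested
    field.setdefault("nullable", True)  # assumed default
    field.setdefault("metadata", {})    # assumed default
    return field


def fill_schema_defaults(node):
    """Recursively complete every struct field in a parsed schema JSON value."""
    if not isinstance(node, dict):
        return node  # primitive type name such as "string"
    node = dict(node)
    kind = node.get("type")
    if kind == "struct":
        node["fields"] = [fill_field(f) for f in node.get("fields", [])]
    elif kind == "array":
        node["elementType"] = fill_schema_defaults(node["elementType"])
    elif kind == "map":
        node["keyType"] = fill_schema_defaults(node["keyType"])
        node["valueType"] = fill_schema_defaults(node["valueType"])
    return node


raw = '{"type": "struct", "fields": [{"name": "c1", "type": "string"}]}'
completed = json.dumps(fill_schema_defaults(json.loads(raw)))
print(completed)
```

The resulting string carries explicit nullable and metadata entries for each field, so it is intended to be usable as input to DataType.fromJson in Scala or, after json.loads, to StructType.fromJson in Python.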
      


          People

            awainerc Anthony Wainer Cachay Guivin