Spark / SPARK-40646

Fix returning partial results in JSON data source and JSON functions

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      I recently found an issue when parsing the following JSON file:

      {"a": {"x": 1, "y": true}, "b": {"x": 1}}
      {"a": {"x": 2}, "b": {"x": 2}}

      Reading this file with a fixed schema in which y is declared as a struct column rather than a boolean:

      val df = spark.read
        .schema("a struct<x: int, y: struct<x: int>>, b struct<x: int>")
        .json("path") 

      results in the following answer:

      a                   b
      null                null
      {"x":2,"y":null}    {"x":2}

      Column b is valid in both records and should still be parsed in the first record, even though a.y there has the wrong type (a boolean where the schema expects a struct).
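
      The report above can be reproduced without a file on disk. The following is a minimal sketch, assuming a local SparkSession; the object name PartialJsonRepro and the application name are illustrative and not part of the original report. It feeds the same two records through spark.read.json as an in-memory Dataset[String] and prints the result, which depends on the Spark version in use.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this illustration.
object PartialJsonRepro {
  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded session is enough for the repro.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-40646-repro")
      .getOrCreate()
    import spark.implicits._

    // The two records from the description, held in memory instead of a file.
    val records = Seq(
      """{"a": {"x": 1, "y": true}, "b": {"x": 1}}""",
      """{"a": {"x": 2}, "b": {"x": 2}}"""
    ).toDS()

    // Same fixed schema as in the description: a.y is declared as a struct,
    // but the first record carries a boolean there.
    val df = spark.read
      .schema("a struct<x: int, y: struct<x: int>>, b struct<x: int>")
      .json(records)

    // Print full cell contents; inspect column b of the first row.
    df.show(false)

    spark.stop()
  }
}
```

      On a version that shows the problem, the first row prints null for both columns. The expectation stated in this issue is that b is still parsed for that row.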

       

      This could be considered a follow-up to https://issues.apache.org/jira/browse/SPARK-33134.

            People

              Assignee: Ivan Sadikov
              Reporter: Ivan Sadikov
              Votes: 0
              Watchers: 4
