SPARK-40646

Fix returning partial results in JSON data source and JSON functions


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      I recently found an issue when parsing the following JSON file:

      {"a": {"x": 1, "y": true}, "b": {"x": 1}}
      {"a": {"x": 2}, "b": {"x": 2}}

      Reading this file with a fixed schema in which a.y is a struct column rather than a boolean:

      val df = spark.read
        .schema("a struct<x: int, y: struct<x: int>>, b struct<x: int>")
        .json("path") 

      results in the following answer:

      a                   b
      null                null
      {"x":2,"y":null}    {"x":2}

      Column b is valid and should still be parsed even though column a contains a value of the wrong type.

       

      This could be considered a follow-up to https://issues.apache.org/jira/browse/SPARK-33134.
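
The report above can be reproduced with a small local program. The following is a minimal sketch, not the test attached to the fix: the object name `PartialJsonRepro`, the helper `readSample`, and the temporary-file handling are hypothetical and exist only for illustration. It assumes Spark SQL is on the classpath and uses the default (permissive) parse mode, as in the description.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical reproducer for the behaviour described in this issue.
object PartialJsonRepro {

  // Writes the two records from the description to a temporary file and
  // reads them back with the fixed schema in which a.y is a struct.
  def readSample(spark: SparkSession): DataFrame = {
    val path = Files.createTempFile("spark-40646-", ".json")
    val records = Seq(
      """{"a": {"x": 1, "y": true}, "b": {"x": 1}}""",
      """{"a": {"x": 2}, "b": {"x": 2}}"""
    )
    Files.write(path, records.mkString("\n").getBytes(StandardCharsets.UTF_8))

    spark.read
      .schema("a struct<x: int, y: struct<x: int>>, b struct<x: int>")
      .json(path.toUri.toString)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-40646-repro")
      .getOrCreate()
    try {
      // Per the description, before the fix the first row shows null for
      // both a and b, although only a.y has a mismatched type.
      readSample(spark).show(truncate = false)
    } finally {
      spark.stop()
    }
  }
}
```

Comparing the first row of the printed table across Spark versions shows whether column b is preserved when only a.y fails to parse.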

          People

            Assignee: Ivan Sadikov
            Reporter: Ivan Sadikov
            Votes: 0
            Watchers: 4
