SPARK-9302

Handle complex JSON types in collect()/head()

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 1.4.1
    • Fix Version/s: None
    • Component/s: SparkR
    • Labels: None

    Description

      Reported on the mailing list by Exie <tfindlay@prodevelop.com.au>:

      A sample record in raw JSON looks like this:
      {"version": 1,"event": "view","timestamp": 1427846422377,"system":
      "DCDS","asset": "6404476","assetType": "myType","assetCategory":
      "myCategory","extras": [{"name": "videoSource","value": "mySource"},{"name":
      "playerType","value": "Article"},{"name": "duration","value":
      "202088"}],"trackingId": "155629a0-d802-11e4-13ee-6884e43d6000","ipAddress":
      "165.69.2.4","title": "myTitle"}
      
      > head(mydf)
      Error in as.data.frame.default(x[[i]], optional = TRUE) : 
        cannot coerce class '"jobj"' to a data.frame
      >
      > show(mydf)
      DataFrame[localEventDtTm:timestamp, asset:string, assetCategory:string, assetType:string, event:string, extras:array<struct<name:string,value:string>>, ipAddress:string, memberId:string, system:string, timestamp:bigint, title:string, trackingId:string, version:bigint]
      >
      
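      To make the shape of the problem concrete, the sketch below parses the sample record with plain Python (not SparkR, and not the fix applied for this issue) and lists which fields are nested. It is only an illustration, under the assumption that the `extras` field is what yields the `array<struct<name:string,value:string>>` column shown above. The variable names are hypothetical.

```python
import json

# The sample record from the report, reproduced verbatim.
raw = """
{"version": 1,"event": "view","timestamp": 1427846422377,"system":
"DCDS","asset": "6404476","assetType": "myType","assetCategory":
"myCategory","extras": [{"name": "videoSource","value": "mySource"},{"name":
"playerType","value": "Article"},{"name": "duration","value":
"202088"}],"trackingId": "155629a0-d802-11e4-13ee-6884e43d6000","ipAddress":
"165.69.2.4","title": "myTitle"}
"""

record = json.loads(raw)

# Fields whose values are not scalars: these cannot map onto a plain
# data.frame column without extra handling.
nested_fields = [k for k, v in record.items() if isinstance(v, (list, dict))]

# One possible (hypothetical) flattening: turn the list of name/value
# structs into a single mapping.
extras_flat = {item["name"]: item["value"] for item in record["extras"]}

print(nested_fields)  # ['extras']
print(extras_flat)
```

      Every other field in the record is a scalar, so `extras` is the only field that needs special treatment when the rows are brought back to the driver.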

            People

              Assignee: Unassigned
              Reporter: Sun Rui (sunrui)
              Votes: 0
              Watchers: 3