Spark / SPARK-9032

scala.MatchError in DataFrameReader.json(String path)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.1
    • Component/s: Java API, SQL
    • Labels: None
    • Environment: Ubuntu 15.04

    Description

      Calling read().json() on a SQLContext (i.e. DataFrameReader.json(String path)) raises a scala.MatchError while reading JSON data. The stack trace is as follows:

      15/07/14 11:25:26 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
      15/07/14 11:25:26 INFO DAGScheduler: Job 0 finished: json at Example.java:23, took 6.981330 s
      Exception in thread "main" scala.MatchError: StringType (of class org.apache.spark.sql.types.StringType$)
      	at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
      	at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
      	at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
      	at scala.Option.getOrElse(Option.scala:120)
      	at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
      	at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
      	at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
      	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
      	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
      	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:213)
      	at com.hp.sparkdemo.Example.main(Example.java:23)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
      	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
      	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      15/07/14 11:25:26 INFO SparkContext: Invoking stop() from shutdown hook
      15/07/14 11:25:26 INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
      15/07/14 11:25:26 INFO DAGScheduler: Stopping DAGScheduler
      15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Shutting down all executors
      15/07/14 11:25:26 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
      15/07/14 11:25:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
      

      Offending code snippet (around line 23):

              JavaSparkContext sctx = new JavaSparkContext(sparkConf);
              SQLContext ctx = new SQLContext(sctx);
              DataFrame frame = ctx.read().json(facebookJSON);
              frame.printSchema();
      

      The exception is reproducible using the following JSON:

      {
         "data": [
            {
               "id": "X999_Y999",
               "from": {
                  "name": "Tom Brady", "id": "X12"
               },
               "message": "Looking forward to 2010!",
               "actions": [
                  {
                     "name": "Comment",
                     "link": "http://www.facebook.com/X999/posts/Y999"
                  },
                  {
                     "name": "Like",
                     "link": "http://www.facebook.com/X999/posts/Y999"
                  }
               ],
               "type": "status",
               "created_time": "2010-08-02T21:27:44+0000",
               "updated_time": "2010-08-02T21:27:44+0000"
            },
            {
               "id": "X998_Y998",
               "from": {
                  "name": "Peyton Manning", "id": "X18"
               },
               "message": "Where's my contract?",
               "actions": [
                  {
                     "name": "Comment",
                     "link": "http://www.facebook.com/X998/posts/Y998"
                  },
                  {
                     "name": "Like",
                     "link": "http://www.facebook.com/X998/posts/Y998"
                  }
               ],
               "type": "status",
               "created_time": "2010-08-02T21:27:44+0000",
               "updated_time": "2010-08-02T21:27:44+0000"
            }
         ]
      }
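
      A possible workaround, not confirmed anywhere in this report: the Spark SQL documentation for JSON files states that each line must contain a separate, self-contained JSON object, whereas the sample above is pretty-printed across many lines. The sketch below, which uses only the Java standard library, collapses a pretty-printed document onto a single line so that it can be written out as a one-record file before calling read().json(). The class name JsonLineCollapser and both file paths are hypothetical, and the sketch assumes the input is valid JSON.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Spark: rewrites a pretty-printed JSON
// document as a single line by dropping whitespace outside string literals.
final class JsonLineCollapser {

    static String collapse(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous character was a backslash in a literal
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Characters inside string literals are copied unchanged.
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (!Character.isWhitespace(c)) {
                // Whitespace between JSON tokens is insignificant and is dropped.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0]: pretty-printed input file; args[1]: single-line output file.
        String pretty = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        byte[] oneLine = (collapse(pretty) + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), oneLine);
    }
}
```

      The output file could then be passed to ctx.read().json(...) in place of the original path. Whether this avoids the MatchError on 1.4.0 is an assumption that has not been verified here.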
      

          People

            Assignee: Josh Rosen (joshrosen)
            Reporter: Philipp Poetter (ppoetter)