Apache Drill / DRILL-3202

Count(*) fails on JSON wrapped up in single array - JSON parsing error


Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.0
    • Fix Version/s: Future
    • Component/s: Storage - JSON
    • Labels: None

    Description

      I have a JSON document as follows.

      [

      { "Category": "1,2", "Comments": "Total sites: 20, RV sites: 20, Elec sites: 20, Water at site, RV Dump, Showers, Flush Toilets, RV Fee: $14, Tent Fee: $14, Elev: 545', Tel: 256-577-9619, Nearest town: Muscle Shoals", "Latitude": "34.800446", "Longitude": "-87.498242", "Name": "Alloys Co Park", "State": "AL", "Type": "cp", "URL": "http://www.campingroadtrip.com/campgrounds/campground/campground/23478/alabama/colbert-county-alloys-park-campground" }

      ]

      Drill is able to unwrap the array (without the user specifying it) and perform some SQL operations on it. However, count(*) specifically fails on these documents.

      0: jdbc:drill:zk=local> select * from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
      +----------+----------+----------+-----------+------+-------+------+-----+
      | Category | Comments | Latitude | Longitude | Name | State | Type | URL |
      +----------+----------+----------+-----------+------+-------+------+-----+
      | 1,2 | Total sites: 20, RV sites: 20, Elec sites: 20, Water at site, RV Dump, Showers, Flush Toilets, RV Fee: $14, Tent Fee: $14, Elev: 545', Tel: 256-577-9619, Nearest town: Muscle Shoals | 34.800446 | -87.498242 | Alloys Co Park | AL | cp | http://www.campingroadtrip.com/campgrounds/campground/campground/23478/alabama/colbert-county-alloys-park-campground |
      +----------+----------+----------+-----------+------+-------+------+-----+
      1 row selected (0.197 seconds)
      0: jdbc:drill:zk=local> select distinct type from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
      +-------+
      | type  |
      +-------+
      | cp    |
      +-------+
      1 row selected (0.193 seconds)
      0: jdbc:drill:zk=local>
      0: jdbc:drill:zk=local> select count(*) from dfs.`default`.`/Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json` limit 10;
      Error: DATA_READ ERROR: Error parsing JSON - Cannot read from the middle of a record. Current token was START_ARRAY

      File /Users/nrentachintala/Downloads/yelp/uspointsofinterestshort.json
      Record 1
      Fragment 0:0

      [Error Id: 4742f738-1d43-4fef-af48-110065c9dd83 on 172.16.1.82:31010] (state=,code=0)
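
Until the reader handles this case, one possible workaround is to pre-process the file so that it no longer starts with an array. The sketch below is not part of the attached patch and makes two assumptions: the file holds exactly one top-level JSON array, and it is small enough to load into memory. The function name `array_to_records` and the file paths are hypothetical. Whether count(*) then succeeds on the rewritten file has not been verified in this issue.

```python
import json


def array_to_records(src_path, dst_path):
    """Rewrite a JSON file holding one top-level array as one object per line.

    Returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)

    # The workaround only applies to the layout described in this issue.
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in data:
            # json.dumps emits each record on a single line by default.
            dst.write(json.dumps(record) + "\n")

    return len(data)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    written = array_to_records("uspointsofinterestshort.json",
                               "uspointsofinterestshort_records.json")
    print(f"wrote {written} record(s)")
```

The returned record count can also serve as a cross-check against whatever Drill reports once count(*) runs on the rewritten file.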

      Attachments

        1. DRILL-3202.patch
          8 kB
          Steven Phillips


          People

            Assignee: Steven Phillips (sphillips)
            Reporter: Neeraja