Apache Drill / DRILL-4479

JsonReader should pick a less restrictive type when creating the default column

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.7.0
    • Component/s: Storage - JSON
    • Labels: None

    Description

      This JIRA is related to DRILL-3806 but has a narrower scope, so I decided to create a separate one.

      The JsonReader has a method, ensureAtLeastOneField() (see https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java#L91), which ensures that an empty column is created when no columns are found. It chooses to create a nullable int column. One consequence is that queries of the following type fail:

      select c1 from dfs.`mostlynulls.json`;
      ...
      ...
      | null  |
      | null  |
      Error: DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar type when you are using a ValueWriter of type NullableIntWriterImpl.
      
      File  /Users/asinha/data/mostlynulls.json
      Record  4097
      

      In this file the first 4096 rows have NULL values for c1 followed by rows that have a valid string.
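
A file with the shape described above can be approximated with a short script. This is only a sketch for reproducing the failure: the 4096 leading nulls come from the report, while the number of trailing rows, the string values, and the helper names are assumptions, since the contents of the attached mostlynulls.json are not shown here.

```python
import json

NULL_ROWS = 4096    # from the report: the first 4096 rows have c1 = null
STRING_ROWS = 10    # assumption: the length of the real file's tail is not stated


def mostly_null_lines(null_rows=NULL_ROWS, string_rows=STRING_ROWS):
    """Yield one JSON object per line: null c1 values first, then strings."""
    for _ in range(null_rows):
        yield json.dumps({"c1": None})
    for i in range(string_rows):
        # Hypothetical string values; any valid string should do.
        yield json.dumps({"c1": f"value-{i}"})


def write_mostly_nulls(path):
    """Write the generated records to path and return the record count."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for line in mostly_null_lines():
            out.write(line + "\n")
            count += 1
    return count
```

Calling write_mostly_nulls("mostlynulls.json") and pointing the query above at the resulting file should give an input whose first string value sits at record 4097.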

      It would be useful for the JsonReader to choose a less restrictive type, such as varchar, for the default column so that more queries of this kind can run.
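
The effect of the default type can be illustrated with a toy model. This is not Drill's code: the function, the exception, and the type names are hypothetical, and the batch size of 4096 is an assumption inferred from the failure at record 4097. The model fixes a column's type at the first non-null value, and falls back to a default type when a whole batch contained only nulls.

```python
BATCH_SIZE = 4096  # assumption: inferred from the failure at record 4097


class TypeConflict(Exception):
    """Raised when a value does not match the column's established type."""


def read_column(values, default_type, batch_size=BATCH_SIZE):
    """Toy model of per-batch type selection for a single column.

    The column type is set by the first non-null value. If a batch ends
    with no type established, default_type is used (a rough analogue of
    creating a default column when none was found).
    """
    column_type = None
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        for offset, value in enumerate(batch):
            if value is None:
                continue  # nulls carry no type information
            value_type = "VARCHAR" if isinstance(value, str) else "INT"
            if column_type is None:
                column_type = value_type
            elif column_type != value_type:
                record = start + offset + 1  # 1-based record number
                raise TypeConflict(
                    f"record {record}: tried to write {value_type} "
                    f"into a {column_type} column"
                )
        if column_type is None:
            # Nothing but nulls in this batch: materialize the default.
            column_type = default_type
    return column_type
```

With 4096 nulls followed by a string, read_column(values, "INT") raises TypeConflict at record 4097, while read_column(values, "VARCHAR") succeeds. In this model a varchar default would still conflict with a later integer value, so it helps the string case reported here rather than every case.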

      Attachments

        1. mostlynulls.json
          128 kB
          Aman Sinha

            People

              amansinha100 Aman Sinha
              amansinha100 Aman Sinha
              Chun Chang