  1. Apache Drill
  2. DRILL-5283

Support "is not present" as subtype of "is null" for JSON data



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None


      JSON files consist of a series of "objects", each of which has name/value pairs. Values can be in one of three states:

      • Not present (the value does not appear)
      • Null (the name appears and the value is null)
      • Non-null (the field is one of the JSON data types)
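The three states can be told apart in any representation that separates "is the name there?" from "what is its value?". A minimal sketch, assuming a parsed JSON object is modelled as a plain `java.util.Map` (the class `JsonFieldStates` and the enum `FieldState` are hypothetical names used only for illustration; this is not Drill code):

```java
import java.util.HashMap;
import java.util.Map;

public class JsonFieldStates {
    // The three states a field can be in, as listed above.
    enum FieldState { NOT_PRESENT, NULL, NON_NULL }

    // Classify one field of a parsed JSON object modelled as a Map.
    static FieldState classify(Map<String, Object> object, String name) {
        if (!object.containsKey(name)) {
            return FieldState.NOT_PRESENT;   // the name never appears
        }
        return object.get(name) == null
                ? FieldState.NULL            // the name appears with value null
                : FieldState.NON_NULL;       // the name appears with a real value
    }

    public static void main(String[] args) {
        // Models the JSON object {"a": 10, "b": null}; "c" is never mentioned.
        Map<String, Object> row = new HashMap<>();
        row.put("a", 10);
        row.put("b", null);

        System.out.println(classify(row, "a")); // NON_NULL
        System.out.println(classify(row, "b")); // NULL
        System.out.println(classify(row, "c")); // NOT_PRESENT
    }
}
```

Calling `get` alone would return `null` for both "b" and "c"; the `containsKey` check is what keeps the two states distinct.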

      Drill, however, has only a single null state, so it collapses "not present" and "null" into that one state.

      The not-present and present-but-null states work identically for calculations inside Drill. However, in a CTAS (CREATE TABLE AS SELECT) from JSON to JSON, the collapsed state means that the user does not get out of Drill what was put in: every null value is either written as an explicit null or omitted entirely, depending on the Drill version.

      This ticket proposes repurposing the "bit" fields in nullable vectors: rename the vector to "nullState", then use these values:

      • 0: value is set
      • 1: value is null
      • 3: value is not present

      The column is null if the null state is non-zero. The column is not null if the null state is 0.
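The encoding above can be sketched as follows. This is an illustration under stated assumptions, not Drill's vector code: the class `NullStateSketch`, its constants, and the `toJson` helper are hypothetical names, and a single integer column stands in for a real nullable vector. Only the values 0, 1, and 3 and the "non-zero means null" rule come from the ticket.

```java
public class NullStateSketch {
    // State values proposed in the ticket.
    static final byte VALUE_SET = 0;
    static final byte VALUE_NULL = 1;
    static final byte VALUE_NOT_PRESENT = 3;

    // The column is null whenever the null state is non-zero.
    static boolean isNull(byte state) {
        return state != VALUE_SET;
    }

    // Only a JSON writer needs to tell the two null flavors apart.
    static boolean isPresent(byte state) {
        return state != VALUE_NOT_PRESENT;
    }

    // Render one integer column of one row as a JSON object (hypothetical helper).
    static String toJson(String name, int value, byte state) {
        if (!isPresent(state)) {
            return "{}";                              // omit the field entirely
        }
        if (isNull(state)) {
            return "{\"" + name + "\": null}";        // explicit null
        }
        return "{\"" + name + "\": " + value + "}";   // ordinary value
    }

    public static void main(String[] args) {
        System.out.println(toJson("a", 10, VALUE_SET));         // {"a": 10}
        System.out.println(toJson("a", 0, VALUE_NULL));         // {"a": null}
        System.out.println(toJson("a", 0, VALUE_NOT_PRESENT));  // {}
    }
}
```

Calculations would only ever consult `isNull`, so both null flavors behave identically there; the distinction surfaces only when the data is written back out as JSON.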

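      This requires reversing the "polarity" of the bit field (today a value of 1 means the value is set and 0 means it is null, whereas under this proposal 0 means set), so it is a major change.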




            Assignee: Unassigned
            Reporter: Paul Rogers (paul-rogers)


