Apache Drill / DRILL-3831

Allow null values in lists

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: Execution - Data Types
    • Labels: None

      Description

      Drill currently fails to read a JSON file in which a list contains a null value. We have a workaround with all_text_mode for this case, but we need to enhance Drill to support null list members in the core ValueVector data structure used to represent records.

      As part of this change, I am considering removing the concept of a list that requires all of its members to be non-null, which is effectively the only type of list we have today. Data that can be read today would simply be read into a list whose members are nullable but all happen to be non-null. This would simplify the code by removing the need to handle the null and non-null cases separately.
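      To illustrate the idea, here is a minimal sketch of a single list type whose members may be null. It is not Drill's ValueVector API: the class name NullableIntList and all of its methods are hypothetical, and the layout (a value array plus a bit per member marking whether it is set) is only an assumption about how such a list could be represented. A list read from data with no nulls, such as [1, 2, 3], uses the same type as [1, null, 3]; it just has every bit set.

```java
import java.util.Arrays;
import java.util.BitSet;

/**
 * Hypothetical sketch only; not part of Drill.
 * One list type for both cases: members are always nullable,
 * and fully non-null data simply has every "is set" bit turned on.
 */
public class NullableIntList {
    private int[] values = new int[4];          // slot per member; unused for null members
    private final BitSet isSet = new BitSet();  // bit i is set when member i is non-null
    private int size = 0;

    /** Appends a member; pass null to record a null member. */
    public void add(Integer value) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2);
        }
        if (value != null) {
            values[size] = value;
            isSet.set(size);
        }
        size++;
    }

    /** Returns true when the member at index is null. */
    public boolean isNull(int index) {
        checkIndex(index);
        return !isSet.get(index);
    }

    /** Returns the member at index, or null when it is not set. */
    public Integer get(int index) {
        checkIndex(index);
        return isSet.get(index) ? values[index] : null;
    }

    public int size() {
        return size;
    }

    /** True for data that the current non-null-only list could also hold. */
    public boolean hasNoNulls() {
        return isSet.cardinality() == size;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

      In this sketch a reader never has to choose between two list implementations up front. The cost is the extra bit per member and the check on each read, which is the kind of minor overhead discussed in the next paragraph.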

      Initially this could carry a minor performance hit, but overall our approach to complex data has not been heavily performance tested. Keeping the code simple for now will at least allow more thorough testing of a smaller number of cases, and should make the code easier to reason about and improve as we evaluate the performance of Drill with complex data more thoroughly.

              People

              • Assignee: Unassigned
              • Reporter: Jason Altekruse (jaltekruse)
              • Votes: 0
              • Watchers: 2
