IMPALA-2272

Parquet scanner always materializes NULL for empty collections

Details

    Description

      Currently the Parquet scanner always materializes a NULL slot for an empty collection, rather than an empty ArrayValue/CollectionValue. No query can expose this bug today, because no query can distinguish an empty collection from a NULL one. That will change once we add expressions that take collections as input (e.g. "select array_column is null from tbl").

      The bug exists because the Parquet scanner only looks at the repeated field of an array, not at the containing group field. To fix it, the scanner will have to consider the definition and repetition (def/rep) levels of both.
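
      To illustrate why both fields matter, here is a minimal Python sketch of how definition levels can separate a NULL collection from an empty one. It is not Impala's scanner code. The schema, the constant names, and the `assemble` function are all hypothetical, and the sketch assumes the standard three-level Parquet LIST layout with an optional outer group.

```python
from typing import List, Optional

# Assumed (hypothetical) schema, standard 3-level LIST encoding:
#   optional group arr (LIST) {     # def level >= 1: arr itself is non-NULL
#     repeated group list {         # def level >= 2: at least one entry exists
#       optional int32 element;     # def level == 3: the element is non-NULL
#     }
#   }
COLLECTION_DEF_LEVEL = 1  # contributed by the containing group field
ENTRY_DEF_LEVEL = 2       # contributed by the repeated field
ELEMENT_DEF_LEVEL = 3     # contributed by the leaf element


def assemble(
    def_levels: List[int], rep_levels: List[int], values: List[int]
) -> List[Optional[List[Optional[int]]]]:
    """Rebuild one row per record: None (NULL), [] (empty), or a list."""
    rows: List[Optional[List[Optional[int]]]] = []
    value_iter = iter(values)  # only non-NULL elements are stored as values
    for d, r in zip(def_levels, rep_levels):
        if r == 0:  # repetition level 0 starts a new row
            if d < COLLECTION_DEF_LEVEL:
                rows.append(None)  # the containing group is NULL
                continue
            rows.append([])  # the group is defined, so the collection exists
            if d < ENTRY_DEF_LEVEL:
                continue  # defined but no entries: an empty collection
        # An entry exists; it is NULL unless the leaf is fully defined.
        rows[-1].append(next(value_iter) if d == ELEMENT_DEF_LEVEL else None)
    return rows


# Four rows: NULL, [], [1, NULL], [2]
result = assemble([0, 1, 3, 2, 3], [0, 0, 0, 1, 0], [1, 2])
```

      In this sketch, merging the two checks into a single `d < ENTRY_DEF_LEVEL` test would mirror the behavior described above: the empty collection would come out as NULL because only the repeated field is consulted.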

            People

              Assignee: Unassigned
              Reporter: Skye Wanderman-Milne