IMPALA-4994

Push conversion and validation into dictionary construction

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      Certain data types require conversion and/or validation when read from a Parquet file. For example, timestamps can require conversion to account for different storage offsets, and char/varchar fields can require conversion to handle lengths and space padding. Timestamps also require validation, because not all bit combinations are valid timestamps.

      Right now, this is done for every element as it is read. For dictionary-encoded columns, it would save processing to do the conversion/validation once per dictionary entry at dictionary construction, rather than repeating it for every value that references that entry.
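
      To make the proposal concrete, here is a minimal C++ sketch of validating timestamp dictionary entries once at construction. It is not Impala's code: `RawTimestamp`, `IsValid`, `ValidatedDictionary`, and `CountValid` are hypothetical names, and the validity rule (time of day must fall within one day) is a simplifying assumption standing in for the real checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Simplified timestamp layout (assumption): nanoseconds within the day
// plus a day number. Real Parquet/Impala layouts differ in detail.
struct RawTimestamp {
  std::int64_t nanos_of_day;
  std::int32_t julian_day;
};

constexpr std::int64_t kNanosPerDay = 86400LL * 1000 * 1000 * 1000;

// Assumed validity rule for this sketch: time of day must lie in [0, one day).
bool IsValid(const RawTimestamp& ts) {
  return ts.nanos_of_day >= 0 && ts.nanos_of_day < kNanosPerDay;
}

// Hypothetical dictionary that validates each entry exactly once,
// when the dictionary is built. Invalid entries are stored as nullopt.
class ValidatedDictionary {
 public:
  explicit ValidatedDictionary(const std::vector<RawTimestamp>& raw_entries) {
    entries_.reserve(raw_entries.size());
    for (const RawTimestamp& ts : raw_entries) {
      ++validations_;
      if (IsValid(ts)) {
        entries_.push_back(ts);
      } else {
        entries_.push_back(std::nullopt);
      }
    }
  }

  // Per-value read path: a plain lookup with no validation work.
  const std::optional<RawTimestamp>& Get(std::size_t index) const {
    assert(index < entries_.size());
    return entries_[index];
  }

  // Number of validation calls made so far (for illustration only).
  std::size_t validations() const { return validations_; }

 private:
  std::vector<std::optional<RawTimestamp>> entries_;
  std::size_t validations_ = 0;
};

// Decodes a run of dictionary indices and counts the valid values.
std::size_t CountValid(const ValidatedDictionary& dict,
                       const std::vector<std::uint32_t>& indices) {
  std::size_t valid = 0;
  for (std::uint32_t idx : indices) {
    if (dict.Get(idx).has_value()) ++valid;
  }
  return valid;
}
```

      With this shape, the validation cost scales with the number of distinct dictionary entries rather than the number of rows. The same pattern would apply to conversions such as offset adjustment or char padding, by storing the converted value in the dictionary instead of the raw one.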

            People

              Assignee: Csaba Ringhofer (csringhofer)
              Reporter: Joe McDonnell (joemcdonnell)
              Votes: 0
              Watchers: 3