IMPALA-5982

String columns saved to Parquet files should be annotated with the UTF8 logical type

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • Backend

    Description

      When writing Parquet files, Impala does not add the logical type that corresponds to the SQL STRING type. String columns should be annotated with the UTF8 logical type.

      Without the UTF8 annotation, the file metadata does not distinguish a string column from raw binary data. This makes the data harder to consume with other tools, and even with Impala itself when the files are manually moved around in the filesystem and a new table has to be created from their metadata.
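
      To illustrate why the annotation matters, here is a minimal sketch of a reader that infers SQL column types from file metadata alone. It uses a simplified, hypothetical model of a Parquet schema entry (`ColumnSchema` and `infer_sql_type` are illustrative names, not part of Impala or any Parquet library), and the type mapping is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnSchema:
    """Hypothetical, simplified stand-in for one Parquet column's schema entry."""
    name: str
    physical_type: str                  # e.g. "BYTE_ARRAY", "INT32"
    logical_type: Optional[str] = None  # e.g. "UTF8"; None when unannotated


def infer_sql_type(col: ColumnSchema) -> str:
    """Guess a SQL type from metadata only (illustrative mapping, not Impala's)."""
    if col.physical_type == "BYTE_ARRAY":
        # Strings and raw bytes share the BYTE_ARRAY physical type;
        # only the logical type annotation tells them apart.
        return "STRING" if col.logical_type == "UTF8" else "BINARY"
    simple = {"INT32": "INT", "INT64": "BIGINT", "DOUBLE": "DOUBLE"}
    return simple.get(col.physical_type, "UNKNOWN")


annotated = ColumnSchema("city", "BYTE_ARRAY", "UTF8")
unannotated = ColumnSchema("city", "BYTE_ARRAY")  # missing UTF8 annotation

print(infer_sql_type(annotated))    # STRING
print(infer_sql_type(unannotated))  # BINARY
```

      In this sketch the unannotated column is read back as binary, so a tool that builds a table from the file's metadata would need out-of-band knowledge to treat it as text.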

          People

            stigahuang Quanlong Huang
            zi Zoltan Ivanfi
