IMPALA-9074

Add support for zstd in ORC

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: Backend

    Description

      The ORC library already supports reading and writing zstd-compressed ORC files. However, a quick test in Impala failed:

      hive> create table orc_zstd (id int, name string) stored as orc;
      $ hdfs dfs -put id_name_zstd.orc hdfs://localhost:20500/test-warehouse/orc_zstd
      impala-shell> invalidate metadata orc_zstd;
      impala-shell> select * from orc_zstd;
      ERROR: Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_zstd/id_name_zstd.orc: Unknown compression codec 5
      

      The ORC file was generated with the csv-import tool (https://github.com/apache/orc/blob/rel/release-1.6.0/tools/src/CSVFileImport.cc) after manually changing the compression in its source from ZLIB to ZSTD.
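
For readers decoding the error message: the ORC file tail stores the compression kind as a small integer, and in the ORC protobuf definition the value 5 is ZSTD. The following is a minimal sketch of that mapping, not Impala's actual code. The helper name `describe_codec` is hypothetical and used only for illustration.

```python
# Compression kind IDs as numbered in the ORC format's CompressionKind enum.
ORC_COMPRESSION_KINDS = {
    0: "NONE",
    1: "ZLIB",
    2: "SNAPPY",
    3: "LZO",
    4: "LZ4",
    5: "ZSTD",
}


def describe_codec(codec_id):
    """Hypothetical helper: map a numeric codec ID to a readable name."""
    # IDs outside the table are reported rather than raising an error.
    return ORC_COMPRESSION_KINDS.get(codec_id, "UNKNOWN({})".format(codec_id))


if __name__ == "__main__":
    # The codec ID reported in the error message above.
    print(describe_codec(5))
```

Read this way, "Unknown compression codec 5" indicates that the reader did not recognize ZSTD, not that the file itself was corrupt.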

      Attachments

        1. id_name_zstd.orc
          0.4 kB
          Quanlong Huang

          People

            Assignee: Quanlong Huang (stigahuang)
            Reporter: Quanlong Huang (stigahuang)