IMPALA / IMPALA-8617

Add support for lz4 in parquet

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 3.3.0
    • Component/s: Backend
    • Labels:

      Description

      Hadoop uses its own block format for LZ4 (the same one the parquet-mr API uses), which is incompatible with the standard LZ4 block format.

      As a result, LZ4-compressed Parquet files can use different block formats depending on which library wrote them.

      The parquet-cpp API (now part of Apache Arrow) uses the LZ4 frame format, which is also incompatible with the LZ4 block format.

      The current decision is to use a format compatible with Hive, Spark and parquet-mr.
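
      To illustrate why these formats do not interoperate, here is a minimal Python sketch that tells the two framings apart without decompressing anything. It rests on assumptions that are not stated in this issue: that a Hadoop-style record is a 4-byte big-endian uncompressed length, a 4-byte big-endian compressed length, and then that many bytes of raw LZ4 block data (one compressed chunk per record), and that an LZ4 frame starts with the magic number 0x184D2204 stored little-endian. The function names are hypothetical and are not Impala, Hadoop, or Arrow APIs.

```python
import struct

# LZ4 frame magic number 0x184D2204, stored little-endian on disk.
LZ4_FRAME_MAGIC = b"\x04\x22\x4d\x18"


def looks_like_lz4_frame(buf: bytes) -> bool:
    """Return True if buf starts with the LZ4 frame magic number."""
    return buf[:4] == LZ4_FRAME_MAGIC


def split_hadoop_lz4(buf: bytes):
    """Split a Hadoop-style buffer into (uncompressed_len, raw_block) pairs.

    Assumed layout per record (an assumption, see the lead-in):
      4 bytes big-endian uncompressed length,
      4 bytes big-endian compressed length,
      then `compressed length` bytes of raw LZ4 block data.
    The raw blocks are returned as-is; nothing is decompressed here.
    """
    records = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            raise ValueError("truncated record header")
        uncompressed_len, compressed_len = struct.unpack_from(">II", buf, pos)
        pos += 8
        if compressed_len > len(buf) - pos:
            raise ValueError("truncated block data")
        records.append((uncompressed_len, buf[pos:pos + compressed_len]))
        pos += compressed_len
    return records
```

      A reader that expects one layout would misinterpret the leading bytes of the other (length prefixes versus a magic number), which is why a single format compatible with Hive, Spark and parquet-mr has to be chosen.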

       

            People

            • Assignee: Abhishek Rawat (arawat)
            • Reporter: Abhishek Rawat (arawat)