IMPALA-759

Consider switching to dictionary encoding for all NULL pages for parquet writer.


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 1.2.1, Impala 1.2.3
    • Fix Version/s: Impala 1.3
    • Component/s: None
    • Labels: None

    Description

      If the page is all NULL, the data encoding doesn't matter since there are no data values (everything is stored in the definition levels).

      Impala defaults to dictionary encoding, so an all-NULL page is written as dictionary encoded. The version of Parquet in CDH4 cannot read such a page.
      We could instead write the page as PLAIN encoded, which would be compatible. There is no compatibility concern on Impala's side.

      More details:
      https://groups.google.com/forum/#!topic/parquet-dev/Q-TPuA2rGk0
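
A minimal sketch of the proposed rule, for illustration only. It is not Impala's writer code: the function names `definition_levels` and `choose_page_encoding` are hypothetical, and a flat optional column (maximum definition level 1) is assumed. The enum values mirror `PLAIN` and `PLAIN_DICTIONARY` from the Parquet format's `Encoding` enum.

```python
from enum import Enum
from typing import List, Optional, Sequence


class Encoding(Enum):
    # Values follow the Parquet format's Encoding enum.
    PLAIN = 0
    PLAIN_DICTIONARY = 2


def definition_levels(values: Sequence[Optional[int]]) -> List[int]:
    """Definition level per slot for a flat optional column (assumption):
    0 means NULL, 1 means a value is present."""
    return [0 if v is None else 1 for v in values]


def choose_page_encoding(
    values: Sequence[Optional[int]],
    default: Encoding = Encoding.PLAIN_DICTIONARY,
) -> Encoding:
    """Pick the encoding to record in the page header (hypothetical helper).

    If every slot is NULL, the page holds no data values, so the label is
    arbitrary for the writer; PLAIN is chosen because older readers accept it.
    """
    levels = definition_levels(values)
    if sum(levels) == 0:  # no non-NULL values (also true for an empty page)
        return Encoding.PLAIN
    return default


if __name__ == "__main__":
    print(choose_page_encoding([None, None, None]).name)
    print(choose_page_encoding([None, 7, None]).name)
```

Only the encoding label in the page header changes in this sketch. The definition levels still carry all of the page's information, which is why the choice does not matter to a reader that already handles dictionary pages.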

          People

            Assignee: Nong Li
            Reporter: Nong Li