PARQUET-548

Add Java metadata for PageEncodingStats

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.0
    • Component/s: parquet-mr
    • Labels: None

      Description

      PARQUET-384 needs to determine whether an entire column chunk is dictionary-encoded, but that is difficult to detect from the set of encodings recorded for the column. For 1.0, it can be done by checking for PLAIN in the set: both dictionary pages and dictionary-encoded data pages use PLAIN_DICTIONARY, and RLE and BIT_PACKED are used only for repetition and definition levels, so a PLAIN entry can only come from a data page that is not dictionary-encoded. For 2.0, however, the dictionary page itself might use PLAIN, so the presence of PLAIN no longer shows whether the column has fallen back.

      PageEncodingStats were added to the format to solve this problem, so we just need to implement them.
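
To illustrate the ambiguity described above, here is a minimal, self-contained Java sketch. It is not the parquet-mr implementation: the `Encoding` enum, the `PageStats` class, and both method names are hypothetical stand-ins, and the only assumption is that per-chunk statistics count dictionary pages and data pages separately for each encoding. It builds two 2.0-style chunks with the same encoding set and shows that only the per-page counts tell them apart.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class DictionaryCheckSketch {

    // Hypothetical stand-in for the format's encoding names; not the parquet-mr enum.
    enum Encoding { PLAIN, PLAIN_DICTIONARY, RLE_DICTIONARY, RLE, BIT_PACKED }

    // 1.0 heuristic from the description: with only the encoding set available,
    // treat the chunk as fully dictionary-encoded when PLAIN_DICTIONARY is
    // present and PLAIN is absent.
    static boolean allDictionaryEncodedV1(Set<Encoding> encodings) {
        return encodings.contains(Encoding.PLAIN_DICTIONARY)
                && !encodings.contains(Encoding.PLAIN);
    }

    // Hypothetical per-chunk statistics: page counts per encoding, with
    // dictionary pages and data pages counted separately.
    static final class PageStats {
        private final Map<Encoding, Integer> dictionaryPages = new EnumMap<>(Encoding.class);
        private final Map<Encoding, Integer> dataPages = new EnumMap<>(Encoding.class);

        PageStats addDictionaryPage(Encoding encoding) {
            dictionaryPages.merge(encoding, 1, Integer::sum);
            return this;
        }

        PageStats addDataPage(Encoding encoding) {
            dataPages.merge(encoding, 1, Integer::sum);
            return this;
        }

        // True only if a dictionary page exists and every data page uses a
        // dictionary encoding, i.e. no data page has fallen back.
        boolean allDataPagesDictionaryEncoded() {
            if (dictionaryPages.isEmpty() || dataPages.isEmpty()) {
                return false;
            }
            for (Encoding encoding : dataPages.keySet()) {
                if (encoding != Encoding.PLAIN_DICTIONARY
                        && encoding != Encoding.RLE_DICTIONARY) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        // Both chunks below produce the same encoding set, so a set-based
        // check cannot distinguish them.
        Set<Encoding> encodings =
                EnumSet.of(Encoding.PLAIN, Encoding.RLE_DICTIONARY, Encoding.RLE);

        // Dictionary page written as PLAIN, all data pages dictionary-encoded.
        PageStats fullyEncoded = new PageStats()
                .addDictionaryPage(Encoding.PLAIN)
                .addDataPage(Encoding.RLE_DICTIONARY)
                .addDataPage(Encoding.RLE_DICTIONARY);

        // Same dictionary page, but the second data page fell back to PLAIN.
        PageStats fellBack = new PageStats()
                .addDictionaryPage(Encoding.PLAIN)
                .addDataPage(Encoding.RLE_DICTIONARY)
                .addDataPage(Encoding.PLAIN);

        System.out.println("set-based check: " + allDictionaryEncodedV1(encodings));
        System.out.println("stats, fully encoded: " + fullyEncoded.allDataPagesDictionaryEncoded());
        System.out.println("stats, fell back: " + fellBack.allDataPagesDictionaryEncoded());
    }
}
```

The set-based check gives one answer for both chunks because it sees the same set, while the stats-based check separates them. How the real metadata is exposed in parquet-mr is left to the implementation merged for this issue.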

          Activity

          rdblue Ryan Blue added a comment:

          Merged #332. Thanks for reviewing, Julien Le Dem!

            People

            • Assignee: rdblue Ryan Blue
            • Reporter: rdblue Ryan Blue