PARQUET-895

Reading of nested columns is broken

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component: parquet-cpp

    Description

      The problem occurs when reading a nested column with repeated values, especially when that column contains many more levels than there are rows overall.

      Citing @peshopetrov, who filed a GitHub pull request identifying the problem and proposing a fix:

      Nested repeated columns' count is incorrectly read from row group's metadata. That's correct in cases where there aren't any nested repeated fields but is generally not correct. Instead the num_values from the column's metadata should be used.
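
      The mismatch described above can be illustrated with a small, self-contained sketch. It does not use the parquet-cpp classes: `RowGroupMeta`, `ColumnChunkMeta`, `DescribeRowGroup`, and the two `LevelsToRead...` helpers are hypothetical names introduced only for this illustration. The only assumption carried over from Parquet is that a repetition level of 0 marks the start of a new row, so a repeated column can hold more level entries than the row group has rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for Parquet metadata (not the parquet-cpp API).
struct ColumnChunkMeta {
  int64_t num_values;  // level/value entries stored for this column chunk
};

struct RowGroupMeta {
  int64_t num_rows;  // top-level rows in the row group
  std::vector<ColumnChunkMeta> columns;
};

// Describe a row group holding a single repeated column, given that
// column's repetition levels. A level of 0 starts a new row; any
// higher level continues the current row.
RowGroupMeta DescribeRowGroup(const std::vector<int16_t>& rep_levels) {
  RowGroupMeta rg;
  rg.num_rows = 0;
  for (int16_t level : rep_levels) {
    if (level == 0) ++rg.num_rows;
  }
  rg.columns.push_back({static_cast<int64_t>(rep_levels.size())});
  return rg;
}

// The behavior reported in the issue: the row count of the row group is
// used as the number of entries to read from the column.
int64_t LevelsToReadFromRowCount(const RowGroupMeta& rg) {
  return rg.num_rows;
}

// The proposed behavior: use the column chunk's own num_values.
int64_t LevelsToReadFromColumn(const RowGroupMeta& rg, std::size_t column) {
  assert(column < rg.columns.size());
  return rg.columns[column].num_values;
}
```

      For the three rows [[1, 2, 3], [4], [5, 6]] the repetition levels are 0, 1, 1, 0, 0, 1. The row-count variant would read 3 entries, while the column holds 6, so half of the data would be skipped. For a flat column with no repeated fields the two counts coincide, which is why the problem only shows up with nested repeated columns.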

            People

              Assignee: Marc Vertes (mvertes)
              Reporter: Marc Vertes (mvertes)
              Votes: 0
              Watchers: 3
