Parquet / PARQUET-739

RLE decoding uses a static buffer that is shared across threads


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: cpp-1.0.0
    • Component/s: parquet-cpp
    • Labels: None

    Description

      Reading two Parquet files in parallel led to memory corruption that caused a crash. The columns are RLE dictionary-encoded strings in an uncompressed page, created with parquet-mr.

      Initial debugging showed that the dictionary indices returned by the RLE decoder are garbage, so the data page got corrupted in memory. Reading the files in a single thread works.

      I have a ColumnReader for each column and read one element from each column to get a complete row.

      The indices are decoded into one global static buffer. So multiple threads all use the same buffer and overwrite each other's indices.
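
      To illustrate the hazard, here is a minimal sketch of one way to avoid it: each reader owns its scratch buffer instead of staging indices in a function-level or global `static` array. `DictReader`, `ReadAll` and `ReadInParallel` are hypothetical names made up for this example and are not the parquet-cpp API; the "decoding" step is a plain copy standing in for real RLE decoding.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dictionary-encoded column reader.
class DictReader {
 public:
  DictReader(std::vector<std::string> dict, std::vector<int32_t> indices)
      : dict_(std::move(dict)), indices_(std::move(indices)) {}

  // Resolves all indices to dictionary values, one small batch at a time.
  std::vector<std::string> ReadAll() {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos < indices_.size()) {
      const std::size_t n = std::min(scratch_.size(), indices_.size() - pos);
      // Stage the batch in the per-instance buffer. If scratch_ were a
      // shared `static` array, another thread could overwrite it here.
      std::copy(indices_.begin() + pos, indices_.begin() + pos + n,
                scratch_.begin());
      for (std::size_t i = 0; i < n; ++i) {
        assert(scratch_[i] >= 0 &&
               static_cast<std::size_t>(scratch_[i]) < dict_.size());
        out.push_back(dict_[static_cast<std::size_t>(scratch_[i])]);
      }
      pos += n;
    }
    return out;
  }

 private:
  std::vector<std::string> dict_;
  std::vector<int32_t> indices_;
  std::array<int32_t, 4> scratch_{};  // owned by this reader, never shared
};

// Runs every reader on its own thread; each thread writes only to its own
// slot of `results`, so no additional locking is needed.
std::vector<std::vector<std::string>> ReadInParallel(
    std::vector<DictReader>& readers) {
  std::vector<std::vector<std::string>> results(readers.size());
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < readers.size(); ++i) {
    threads.emplace_back(
        [&readers, &results, i] { results[i] = readers[i].ReadAll(); });
  }
  for (auto& t : threads) t.join();
  return results;
}
```

      Declaring the buffer `thread_local` would be another option under the same assumptions; a per-instance buffer also stays correct if one thread interleaves reads from several readers.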


          People

            mdeepak Deepak Majeti
            florian.scheibner Florian Scheibner