Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-6383

Memory from previous row groups can accumulate in Parquet scanner

    Details

      Description

      I ran across this bug when working on porting scanners to the new buffer pool. Before that the only symptom of the failures was excessive memory consumption, but with the reservations they become easy-to-detect hard failures.

      The problem is in HdfsParquetScanner::NextRowGroup(), which calls InitColumns() on column readers, which starts scans, which allocate memory. The problem is that, if the row group is skipped because of dictionary predicates or some other error, the scans aren't cancelled and the I/O buffers aren't releated.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                tarmstrong Tim Armstrong
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: