  1. IMPALA
  2. IMPALA-3528

Memory of scratch batch should be transferred when closing a Parquet scanner thread.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.6.0
    • Component/s: Backend
    • Labels:
      None

      Description

      The lifetime of a scanner thread is decoupled from that of the row batches it produces. All resources backing those row batches must therefore be transferred to the batches themselves, so that nothing a batch references remains owned by the scanner thread.

      The bug is that HdfsParquetScanner::Close() does not transfer ownership of the scratch batch's memory to the final row batch it returns.

      Relevant snippet:

      void HdfsParquetScanner::Close() {
        ...
        if (batch_ != NULL) {
          // Only dictionary_pool_ is attached before the final row batch is added;
          // the memory pool of scratch_batch_ is not transferred.
          AttachPool(dictionary_pool_.get(), false);
          AddFinalRowBatch();
        }
        // Verify all resources (if any) have been transferred.
        DCHECK_EQ(dictionary_pool_.get()->total_allocated_bytes(), 0);
        DCHECK_EQ(scratch_batch_->mem_pool()->total_allocated_bytes(), 0);
        DCHECK_EQ(context_->num_completed_io_buffers(), 0);
        ...
      }
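
The ownership rule described above can be illustrated with a small, self-contained toy model. This is a sketch and not Impala code: `ToyMemPool`, `ToyRowBatch`, `ToyScanner`, and `LeftoverBytesAfterClose` are hypothetical names introduced only for illustration, and the pool sizes are arbitrary. It assumes that "transferring" memory means moving a pool's chunks into the pool owned by the outgoing batch, so the batch stays valid after the scanner is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool: owns a list of heap chunks.
class ToyMemPool {
 public:
  // Allocates a chunk of `bytes` bytes that is owned by this pool.
  char* Allocate(std::size_t bytes) {
    chunks_.push_back(std::make_unique<char[]>(bytes));
    total_allocated_bytes_ += bytes;
    return chunks_.back().get();
  }

  // Moves every chunk from `src` into this pool, leaving `src` empty.
  // `src` must be a different pool than `this`.
  void AcquireData(ToyMemPool* src) {
    for (auto& chunk : src->chunks_) chunks_.push_back(std::move(chunk));
    total_allocated_bytes_ += src->total_allocated_bytes_;
    src->chunks_.clear();
    src->total_allocated_bytes_ = 0;
  }

  std::size_t total_allocated_bytes() const { return total_allocated_bytes_; }

 private:
  std::vector<std::unique_ptr<char[]>> chunks_;
  std::size_t total_allocated_bytes_ = 0;
};

// Hypothetical row batch: it may outlive the scanner that produced it.
struct ToyRowBatch {
  ToyMemPool pool;
};

// Hypothetical scanner with two pools, loosely mirroring the snippet above.
class ToyScanner {
 public:
  ToyScanner() {
    dictionary_pool_.Allocate(64);   // arbitrary size for illustration
    scratch_pool_.Allocate(128);     // arbitrary size for illustration
  }

  // Builds the final batch. With transfer_scratch == false, the scratch pool
  // is left behind, which models the behaviour reported in this issue.
  std::unique_ptr<ToyRowBatch> Close(bool transfer_scratch) {
    auto batch = std::make_unique<ToyRowBatch>();
    batch->pool.AcquireData(&dictionary_pool_);
    if (transfer_scratch) batch->pool.AcquireData(&scratch_pool_);
    return batch;
  }

  // Bytes still owned by the scanner; should be 0 after a correct Close().
  std::size_t untransferred_bytes() const {
    return dictionary_pool_.total_allocated_bytes() +
           scratch_pool_.total_allocated_bytes();
  }

 private:
  ToyMemPool dictionary_pool_;
  ToyMemPool scratch_pool_;
};

// Closes a fresh scanner and reports how many bytes it still owns.
std::size_t LeftoverBytesAfterClose(bool transfer_scratch) {
  ToyScanner scanner;
  std::unique_ptr<ToyRowBatch> final_batch = scanner.Close(transfer_scratch);
  return scanner.untransferred_bytes();
}
```

In this model, `LeftoverBytesAfterClose(false)` leaves the scratch pool's bytes with the scanner, which is the condition the `DCHECK_EQ` on the scratch batch's pool in the snippet is meant to catch. `LeftoverBytesAfterClose(true)` leaves nothing behind.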
      

      I noticed this bug while investigating IMPALA-3519, but unfortunately, Tim and I could not see a direct connection to IMPALA-3519 so this is probably a separate problem.

      As far as I know, we have not seen any problems or crashes caused by this bug, but it is definitely a bug.

              People

              • Assignee:
                alex.behm Alexander Behm
                Reporter:
                alex.behm Alexander Behm
              • Votes:
                0
                Watchers:
                1