
Replace BufferedBlockMgr with new buffer pool

    Details

      Description

      We want to replace BufferedBlockMgr, a query-wide buffer pool, with a new BufferPool that is shared between all queries. The goals are:

      • Support for guaranteed reservations: if a reservation is granted, the buffer pool will fulfill it (unless the OS is unable to satisfy the buffer pool's memory requirements).
      • Simplified interaction between reservations and pins: if you have a reservation, you can pin; if you don't, you can't.
      • Support for increasing reservations (up to a planner-specified limit).
      • Support for smaller buffer sizes with similar performance, so we can reduce the minimum memory required to execute spill-to-disk algorithms.
      • Support for larger buffers to support wide rows
      • Reduced reliance on TCMalloc, which isn't suited to management of large buffers (e.g. see IMPALA-2800)
      • Better transfer model for buffer pool pages, so we can implement transfer-of-ownership consistently for row batches (instead of mixing transfer with MarkNeedsToReturn()).
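
      The reservation and pin rules in the goals above can be illustrated with a minimal sketch. This is a toy model for illustration only, not Impala's BufferPool or ReservationTracker API: the class ToyReservation and all of its method names are hypothetical, and it ignores threading, spilling and buffer allocation entirely. It shows only two rules: a reservation can grow up to a fixed limit, and a pin succeeds exactly when unused reservation covers it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model; NOT Impala's actual BufferPool/ReservationTracker API.
class ToyReservation {
 public:
  // limit_bytes stands in for the planner-specified reservation limit.
  explicit ToyReservation(int64_t limit_bytes) : limit_(limit_bytes) {}

  // Grow the reservation; fails if it would exceed the limit.
  bool IncreaseReservation(int64_t bytes) {
    if (bytes < 0 || bytes > limit_ - reserved_) return false;
    reserved_ += bytes;
    return true;
  }

  // Pin succeeds only if unused reservation covers the request:
  // "if you have a reservation, you can pin; if you don't, you can't".
  bool Pin(int64_t bytes) {
    if (bytes < 0 || bytes > reserved_ - pinned_) return false;
    pinned_ += bytes;
    return true;
  }

  // Unpinning returns the bytes to unused reservation.
  void Unpin(int64_t bytes) {
    assert(bytes >= 0 && bytes <= pinned_);
    pinned_ -= bytes;
  }

  int64_t unused_reservation() const { return reserved_ - pinned_; }

 private:
  const int64_t limit_;
  int64_t reserved_ = 0;
  int64_t pinned_ = 0;
};
```

      In this sketch a client with a 100-byte limit cannot pin anything until it has reserved memory, can reserve 64 bytes but not a further 64, and can then pin up to exactly those 64 bytes.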

          Issue Links

          1. Implement basic in-memory buffer pool (Sub-task, Resolved, Tim Armstrong)
          2. Add spilling support to new buffer pool (Sub-task, Resolved, Tim Armstrong)
          3. Integrate buffer pool reservations with memtrackers (Sub-task, Resolved, Tim Armstrong)
          4. Set up query mem tracker in QueryState (Sub-task, Resolved, Tim Armstrong)
          5. Clients can violate BufferPool invariants by calling ReservationTracker methods directly. (Sub-task, Resolved, Tim Armstrong)
          6. Buffer pool unpinned invariant does not take into account multiply-pinned bytes (Sub-task, Resolved, Tim Armstrong)
          7. MemTracker::EnableReservationReporting() is not thread-safe (Sub-task, Resolved, Tim Armstrong)
          8. Compute memory reservation in planner and claim atomically in Prepare() (Sub-task, Resolved, Tim Armstrong)
          9. Port spilling ExecNodes to new buffer pool (Sub-task, Resolved, Tim Armstrong)
          10. Implement scalable buffer recycling in buffer pool (Sub-task, Resolved, Tim Armstrong)
          11. Backend support for large rows (Sub-task, Resolved, Tim Armstrong)
          12. Backend support for large rows in BufferedTupleStream (Sub-task, Resolved, Tim Armstrong)
          13. Fix BufferPool handling of scratch read errors (Sub-task, Resolved, Tim Armstrong)
          14. Port relevant BufferedBlockMgr unit tests for BufferPool (Sub-task, Resolved, Tim Armstrong)
          15. Add reservation stress option for test coverage (Sub-task, Resolved, Tim Armstrong)
          16. Validate and fix spilling performance and memory usage of new buffer pool (Sub-task, Resolved, Tim Armstrong)
          17. Queries with a large number of small joins regress in terms of memory usage due to memory reservation (Sub-task, Resolved, Tim Armstrong)
          18. TPC-DS Q78 with MEM_LIMIT=10GB fails with "Repartitioning did not reduce the size of a spilled partition" on BufferPool dev branch (Sub-task, Resolved, Tim Armstrong)
          19. Account for difference between process memory consumption and memory used by queries (Sub-task, Resolved, Tim Armstrong)
          20. Clean up BufferPool profile counters (Sub-task, Resolved, Tim Armstrong)
          21. Consider reducing partition fanout in Hash Join (Sub-task, Resolved, Tim Armstrong)
          22. Partitioned aggregation node repartitions when spilled partition could fit in memory (Sub-task, Resolved, Tim Armstrong)
          23. Ensure that NAAJ works with spilling enabled and disabled. (Sub-task, Resolved, Tim Armstrong)
          24. Performance regresses on buffer pool dev branch for high-ndv aggregations (Sub-task, Resolved, Tim Armstrong)
          25. list::size() in BufferedTupleStreamV2::AdvanceWritePage() is expensive (Sub-task, Resolved, Tim Armstrong)
          26. Consider allowing configuration of buffer pool size (Sub-task, Resolved, Tim Armstrong)
          27. Track exchange node buffers memory as part of memory reservation (Sub-task, Resolved, Tim Armstrong)
          28. BufferedTupleStreamV2::CheckConsistency() is too slow for large streams with small pages in Debug build (Sub-task, Resolved, Tim Armstrong)
          29. Consider ways to reduce the accumulation of clean pages when executing large spilling queries (Sub-task, Resolved, Tim Armstrong)
          30. Eagerly release reservation in blocking nodes (Sub-task, Resolved, Tim Armstrong)
          31. Consider always reserving memory for grouping pre-aggregations (Sub-task, Resolved, Tim Armstrong)
          32. Ensure test coverage for spilling disabled for all spilling operators (Sub-task, Resolved, Tim Armstrong)
          33. Consider reducing RESERVATION_MIN_MEM_REMAINING (Sub-task, Resolved, Tim Armstrong)
          34. SET_DENY_RESERVATION_PROBABILITY debug action is not always effective (Sub-task, Resolved, Tim Armstrong)

              People

              • Assignee: Tim Armstrong (tarmstrong)
              • Reporter: Tim Armstrong (tarmstrong)