Accumulo / ACCUMULO-665

large values, complex iterator stacks, and RFile readers can consume a surprising amount of memory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.1
    • Component/s: tserver
    • Labels:
      None
    • Environment:

      large cluster

      Description

      On a production cluster, with a complex iterator tree, a large value (~350M) caused a 4G tserver to fail with an out-of-memory error.

      There were several factors contributing to the problem:

      1. a bug: the query should not have been reading the large value in the first place
      2. complex iterator tree, causing many copies of the data to be held at the same time
      3. RFile doubles the buffer it uses to load values, and continues to use that large buffer for future values

      This ticket is for the last point. If we know we're not even going to look at the value, we can read past it without storing it in memory. It is surprising that skipping past a large value would cause the server to run out of memory, especially since it should fit into memory enough times to be returned to the caller.
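      The difference between loading a value and reading past it can be illustrated with a minimal sketch. This is not the RFile reader: the class `ValueSkipSketch`, its methods, and the length-prefixed value layout are all assumptions made for illustration. `readValue` copies each value into a reusable buffer that grows and is never shrunk, while `skipValue` advances the stream without allocating anything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch, not the RFile reader: each value is a 4-byte length followed by its payload.
final class ValueSkipSketch {
    static final int INITIAL_CAPACITY = 16;

    // Reusable buffer that grows to fit the largest value read so far and is never shrunk.
    private byte[] buffer = new byte[INITIAL_CAPACITY];

    // Copies the next value into the reusable buffer, at least doubling the buffer when it is too small.
    int readValue(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len > buffer.length) {
            buffer = new byte[Math.max(len, buffer.length * 2)];
        }
        in.readFully(buffer, 0, len);
        return len;
    }

    // Advances past the next value without copying it anywhere.
    static int skipValue(DataInputStream in) throws IOException {
        int len = in.readInt();
        int remaining = len;
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped <= 0) {
                throw new EOFException("stream ended inside a value");
            }
            remaining -= skipped;
        }
        return len;
    }

    // Encodes zero-filled values of the given lengths in the assumed layout.
    static byte[] encode(int... lengths) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int len : lengths) {
                out.writeInt(len);
                out.write(new byte[len]);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Skips the first skipCount values, reads the rest, and returns the final buffer capacity.
    static int capacityAfter(byte[] data, int valueCount, int skipCount) {
        ValueSkipSketch reader = new ValueSkipSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < valueCount; i++) {
                if (i < skipCount) {
                    skipValue(in);
                } else {
                    reader.readValue(in);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reader.buffer.length;
    }

    public static void main(String[] args) {
        byte[] data = encode(1000, 10);
        System.out.println("capacity when every value is read: " + capacityAfter(data, 2, 0));
        System.out.println("capacity when the large value is skipped: " + capacityAfter(data, 2, 1));
    }
}
```

      In this sketch the buffer stays at the size of the largest value once that value has been read, even while later small values are processed; skipping the large value leaves the buffer at its initial capacity.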

      The provided iterators inside core/org/apache/accumulo/iterators should be revisited to ensure that they properly set the seekColumnFamilies where necessary, specifically the IntersectingIterator.
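      To see why the seek column families matter, here is a toy model, not Accumulo code. `SeekFamiliesSketch`, `Entry`, and `bytesLoaded` are hypothetical names; the only part taken from Accumulo is the inclusive/exclusive meaning of the column family argument to `SortedKeyValueIterator.seek(Range, Collection<ByteSequence>, boolean)`, where an empty collection with `inclusive = false` selects every family. The model assumes the source can skip the values of unselected families entirely.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Accumulo code: counts the value bytes a source would load for a seek.
final class SeekFamiliesSketch {
    static final class Entry {
        final String row;
        final String family;
        final long valueSize;

        Entry(String row, String family, long valueSize) {
            this.row = row;
            this.family = family;
            this.valueSize = valueSize;
        }
    }

    // Sample data: one large value in the "blob" family next to small "meta" values.
    static List<Entry> sample() {
        return Arrays.asList(
            new Entry("r1", "blob", 350_000_000L),
            new Entry("r1", "meta", 10L),
            new Entry("r2", "meta", 20L));
    }

    // inclusive = true selects only the listed families; inclusive = false selects all others.
    // Values of unselected families are assumed to be skipped without being loaded.
    static long bytesLoaded(List<Entry> entries, Set<String> families, boolean inclusive) {
        long loaded = 0;
        for (Entry e : entries) {
            boolean selected = families.contains(e.family) == inclusive;
            if (selected) {
                loaded += e.valueSize;
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<Entry> entries = sample();
        // No families given: every value is loaded, including the large one.
        System.out.println(bytesLoaded(entries, Collections.<String>emptySet(), false));
        // Only "meta" requested: the large "blob" value is never loaded.
        System.out.println(bytesLoaded(entries, Collections.singleton("meta"), true));
    }
}
```

      An iterator that passes an empty family set down to its source gives the source no way to avoid the large value, which is the situation the ticket asks the provided iterators to be checked for.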

        Activity

        Eric Newton created issue -
        Josh Elser made changes -
        Field Original Value New Value
        Description: appended the final paragraph ("The provided iterators inside core/org/apache/accumulo/iterators should be revisited to ensure that they properly set the seekColumnFamilies where necessary, specifically the IntersectingIterator."); the rest of the description is unchanged.
        Josh Elser made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Fix Version/s: added 1.4.1 [ 12319882 ]
        Josh Elser made changes -
        Attachment ACCUMULO-665.patch [ 12534496 ]
        Eric Newton made changes -
        Fix Version/s: added 1.4.2 [ 12321843 ]
        Fix Version/s: removed 1.4.1 [ 12319882 ]
        Josh Elser made changes -
        Fix Version/s: added 1.4.1 [ 12319882 ]
        Fix Version/s: removed 1.4.2 [ 12321843 ]
        Josh Elser made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Resolution: Fixed [ 1 ]
        Christopher Tubbs made changes -
        Affects Version/s: removed 1.5.0 [ 12318645 ]

          People

          • Assignee:
            Eric Newton
            Reporter:
            Eric Newton
          • Votes:
            0
            Watchers:
            3
