  1. IMPALA
  2. IMPALA-7406

Flatbuffer wrappers use almost as much memory as underlying data


Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • Impala 3.1.0
    • Catalog
    • None

    Description

      Currently, the file descriptors that catalogd keeps in memory for each partition use a FlatBuffer to reduce the number of separate objects on the Java heap. However, each FlatBuffer object internally stores a ByteBuffer and an int position, so the object takes 32 bytes on its own. The ByteBuffer takes another 56 bytes, since it stores various references, endianness, limit, mark, position, etc. Together that is about 88 bytes of overhead on top of the underlying flatbuf byte array, which is typically around 100 bytes for a single-block file. So we have roughly a 1:1 ratio of memory overhead and a 2:1 ratio of object-count overhead for each partition.

      If we simply stored the byte[] and constructed wrappers on demand, we'd save 88 bytes and 2 objects per partition. The downside is that we'd need short-lived ByteBuffer allocations at access time, and based on some benchmarking I did, those allocations don't get eliminated by escape analysis. So it's not a clear-cut win, but still worth considering.
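
      A minimal sketch of the "store only the byte[]" idea, using just java.nio. The class names (FileDescView, CompactFileDesc), the ofLength helper, and the toy layout (a single 8-byte little-endian length field) are hypothetical stand-ins, not Impala's or FlatBuffers' actual classes; a real generated FlatBuffer accessor would take the place of FileDescView.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a FlatBuffer-generated accessor: it holds a
// ByteBuffer plus an int position, like the wrappers described above.
final class FileDescView {
  private final ByteBuffer bb;
  private final int pos;

  FileDescView(ByteBuffer bb, int pos) {
    this.bb = bb;
    this.pos = pos;
  }

  // Assumed toy layout: an 8-byte little-endian file length at offset 0.
  long fileLength() {
    return bb.getLong(pos);
  }
}

// Long-lived object kept on the heap: only the raw bytes, no wrapper objects.
final class CompactFileDesc {
  private final byte[] data;

  CompactFileDesc(byte[] data) {
    this.data = data;
  }

  // Builds the short-lived ByteBuffer and view at access time. These two
  // allocations are the cost the description mentions.
  FileDescView view() {
    ByteBuffer bb = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    return new FileDescView(bb, 0);
  }

  // Hypothetical helper that encodes the toy layout, for demonstration only.
  static CompactFileDesc ofLength(long len) {
    byte[] bytes =
        ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(len).array();
    return new CompactFileDesc(bytes);
  }
}
```

      With this shape, each retained descriptor holds a single byte[] reference, and callers write something like `CompactFileDesc.ofLength(4096L).view().fileLength()`. Whether the per-access allocations are acceptable depends on how hot the access path is.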


          People

            tlipcon Todd Lipcon
            tlipcon Todd Lipcon