  1. HBase
  2. HBASE-28805

Implement chunked persistence of backing map for persistent bucket cache.

Details

      Earlier, all entries of the backing map were serialised into a single large protobuf message, BucketCacheEntry. This serialisation could hit the 2GB limit on protobuf message size and fail.

      This change introduces a new serialisation format for the contents of the backing map in the bucket cache. The backing map entries are now serialised in chunks, with a default chunk size of 10M. The chunk size is configurable via the configuration parameter "hbase.bucketcache.persistence.chunksize". As a result, each protobuf message used to serialise a chunk stays within the 2GB limit, which avoids the serialisation error. The backing map is reconstructed by reading these chunks from the persistence file during server restart.
      The bucket cache can still read the old persistence format (a single protobuf message), which maintains backward compatibility with older versions of the persistence file.
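
      The chunked write and read-back described in this note can be sketched roughly as follows. This is only an illustration of the idea, not the HBase implementation or its on-disk format: the class name ChunkedPersistenceSketch, the String-to-Long map, and the plain java.io framing (an entry count per chunk, with a zero count as terminator) are all assumptions made for this example, standing in for the protobuf messages the bucket cache actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: NOT the HBase persistence format. Plain java.io framing
// stands in for the per-chunk protobuf messages of the real bucket cache.
final class ChunkedPersistenceSketch {

  // Writes the map as a sequence of chunks, each holding at most chunkSize entries.
  static byte[] writeChunked(Map<String, Long> backingMap, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      Iterator<Map.Entry<String, Long>> it = backingMap.entrySet().iterator();
      int remaining = backingMap.size();
      while (remaining > 0) {
        int inThisChunk = Math.min(chunkSize, remaining);
        out.writeInt(inThisChunk); // chunk header: number of entries that follow
        for (int i = 0; i < inThisChunk; i++) {
          Map.Entry<String, Long> entry = it.next();
          out.writeUTF(entry.getKey());
          out.writeLong(entry.getValue());
        }
        remaining -= inThisChunk;
      }
      out.writeInt(0); // a zero entry count marks the end of the chunks
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Rebuilds the map by reading one chunk after another until the terminator.
  static Map<String, Long> readChunked(byte[] persisted) {
    Map<String, Long> rebuilt = new LinkedHashMap<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(persisted))) {
      int count;
      while ((count = in.readInt()) > 0) {
        for (int i = 0; i < count; i++) {
          String key = in.readUTF();
          long value = in.readLong();
          rebuilt.put(key, value);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return rebuilt;
  }

  public static void main(String[] args) {
    Map<String, Long> backingMap = new LinkedHashMap<>();
    for (int i = 0; i < 25; i++) {
      backingMap.put("block-" + i, (long) i * 4096); // hypothetical key and offset
    }
    byte[] persisted = writeChunked(backingMap, 10); // 25 entries in chunks of up to 10
    System.out.println(readChunked(persisted).equals(backingMap));
  }
}
```

      Because each chunk is bounded by an entry count, no single serialised unit grows with the total size of the map; only the number of chunks does.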

    Description

      The persistent bucket cache feature relies on persisting the backing map to a file. The protobuf APIs are used to serialise the backing map and its related structures into this file. An asynchronous thread periodically flushes the contents of the backing map to the persistence file.

      The protobuf library limits the size of a protobuf message to 2GB. If the serialised backing map grows beyond 2GB, an unexpected exception is thrown in the asynchronous persister thread, which then stops. This causes the persistence file to go out of sync with the actual bucket cache. As a result, the bucket cache shrinks to a smaller size after a cache restart. Checksum errors are also reported.

      This Jira tracks the implementation of chunked persistence of the backing map, such that every protobuf message is smaller than 2GB.
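
      As a rough worked example of the sizing involved, the sketch below computes how many entries fit under the 2GB message limit and how many chunks a given map needs. The class ChunkSizingSketch and its method names are hypothetical, and the per-entry byte size is an assumed input; the real serialised size of a backing map entry is not stated in this issue.

```java
// Illustrative arithmetic only; names and the per-entry size are assumptions.
final class ChunkSizingSketch {

  // Protobuf message sizes in Java are tracked as int, so the cap is 2^31 - 1 bytes.
  static final long MAX_MESSAGE_BYTES = Integer.MAX_VALUE;

  // Largest number of entries whose combined size stays within one message.
  static long maxEntriesPerChunk(long assumedBytesPerEntry) {
    if (assumedBytesPerEntry <= 0) {
      throw new IllegalArgumentException("assumedBytesPerEntry must be positive");
    }
    return MAX_MESSAGE_BYTES / assumedBytesPerEntry;
  }

  // Number of chunks needed to hold totalEntries (ceiling division).
  static long chunksNeeded(long totalEntries, long entriesPerChunk) {
    if (entriesPerChunk <= 0) {
      throw new IllegalArgumentException("entriesPerChunk must be positive");
    }
    return (totalEntries + entriesPerChunk - 1) / entriesPerChunk;
  }

  public static void main(String[] args) {
    // Assuming 100 bytes per serialised entry (a made-up figure for illustration).
    System.out.println(maxEntriesPerChunk(100));
    // A hypothetical map of 50 million entries split into chunks of 10 million.
    System.out.println(chunksNeeded(50_000_000L, 10_000_000L));
  }
}
```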

      Thanks,
      Janardhan

            People

              janardhan.hungund Janardhan Hungund