Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-26850

Optimize the implementation of LRUCache in LRUDictionary

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Reopened
    • Major
    • Resolution: Unresolved
    • 1.7.1, 3.0.0-alpha-2, 2.4.11
    • None
    • wal
    • None

    Description

      This issue is to unify the behavior of reading and writing links.

       

      During my research on HBASE-26849, I found that there seems to be something wrong with the implementation of LRUDictionary.

      It uses array to save data, and uses hashMap to save array index.

      For the write link:

      CompressedKvEncoder#write
      ->
      LRUDictionary#findEntry
      ->
      LRUDictionary#addEntry
      

      And for the read link:

      CompressedKVDecoder#readIntoArray
      ->
      LRUDirectory#addEntry
      

      Then we could see the logic in findEntry:

        @Override
        public short findEntry(byte[] data, int offset, int length) {
          short ret = backingStore.findIdx(data, offset, length);
          if (ret == NOT_IN_DICTIONARY) {
            addEntry(data, offset, length);
          }
          return ret;
        }
      
          private short findIdx(byte[] array, int offset, int length) {
            Short s;
            final Node comparisonNode = new Node();
            comparisonNode.setContents(array, offset, length);
            if ((s = nodeToIndex.get(comparisonNode)) != null) {
              moveToHead(indexToNode[s]);
              return s;
            } else {
              return -1;
            }
          }
      

      At first it will try to find if there is same node in the cache, If there is, use it directly.

      The problem is, if we have some same nodes, The behavior on the read and write links are inconsistent:
      On the write link we will reuse just one node, but on the read link multiple copies will be stored in the array.
      but there is only one mapping from the node to the array index in the map.
      So, on the read link, if the first 'same' node is evict, we will remove it from the map, then we might met a NPE because when the second 'same' node evict we can not find it in the map.

      Attachments

        1. image-2022-03-16-17-13-00-871.png
          126 kB
          tianhang tang

        Issue Links

          Activity

            People

              tangtianhang tianhang tang
              tangtianhang tianhang tang
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: