HBase / HBASE-22457

Harden the HBase HFile reader reference counting

Details

    • Type: Brainstorming
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented

    Description

      The problem is that any coprocessor hook that replaces a passed scanner without closing it can cause an incorrect reference count.
      This was bad and wrong before, of course, but now it has pretty bad consequences, since an incorrect reference count will prevent HFiles from being archived indefinitely.

      All hooks that are passed a scanner and return a scanner are suspect, since the returned scanner may or may not close the passed scanner:

      • preCompact
      • preCompactScannerOpen
      • preFlush
      • preFlushScannerOpen
      • preScannerOpen
      • preStoreScannerOpen
      • preStoreFileReaderOpen...? (not sure about this one, it could mess with the reader)
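
      To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the HBase API: `Reader`, `Scanner`, `ReaderScanner`, `ReplacementScanner`, `leakyHook` and `safeHook` are all hypothetical stand-ins, and the assumption is simply that a scanner pins its reader on open and unpins it on close.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ScannerHookSketch {

    /** Hypothetical stand-in for an HFile reader; archiving is assumed to require refCount == 0. */
    static final class Reader {
        private final AtomicInteger refCount = new AtomicInteger();
        int refCount() { return refCount.get(); }
        boolean canArchive() { return refCount.get() == 0; }
    }

    /** Hypothetical minimal scanner interface (not an HBase type). */
    interface Scanner extends AutoCloseable {
        @Override void close();
    }

    /** Scanner that pins its reader while open and unpins it exactly once on close. */
    static final class ReaderScanner implements Scanner {
        private final Reader reader;
        private boolean closed;
        ReaderScanner(Reader reader) {
            this.reader = reader;
            reader.refCount.incrementAndGet();
        }
        @Override public void close() {
            if (!closed) {
                closed = true;
                reader.refCount.decrementAndGet();
            }
        }
    }

    /** Scanner returned by a hook; it closes the original only if it was handed one. */
    static final class ReplacementScanner implements Scanner {
        private final Scanner delegate; // null when the hook dropped the original
        ReplacementScanner(Scanner delegate) { this.delegate = delegate; }
        @Override public void close() {
            if (delegate != null) {
                delegate.close();
            }
        }
    }

    /** Hook that replaces the passed scanner and forgets it: the reference is never released. */
    static Scanner leakyHook(Scanner passed) {
        return new ReplacementScanner(null);
    }

    /** Hook that replaces the passed scanner but closes it together with the replacement. */
    static Scanner safeHook(Scanner passed) {
        return new ReplacementScanner(passed);
    }

    /** Runs one open / hook / close cycle and returns the reader's remaining refCount. */
    static int refCountAfter(boolean safe) {
        Reader reader = new Reader();
        Scanner original = new ReaderScanner(reader);
        Scanner returned = safe ? safeHook(original) : leakyHook(original);
        returned.close(); // the caller only ever sees, and closes, the returned scanner
        return reader.refCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky hook refCount: " + refCountAfter(false)); // 1
        System.out.println("safe hook refCount:  " + refCountAfter(true));  // 0
    }
}
```

      In this sketch the caller cannot tell the two hooks apart: it closes whatever scanner it got back, and only the reader's count shows that the original was never released.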

      I sampled the Phoenix and Tephra code and found a few instances where this is happening.
      For those I filed issues: TEPHRA-300, PHOENIX-5291.
      (We're not using Tephra.)

      The Phoenix ones should be rare. In our case we are seeing readers with refCount > 1000.
      Perhaps there are other issues as well, for example a path where not all exceptions are caught and a scanner is left open. (Generally I am not a fan of reference counting in complex systems; it's too easy to miss something. But that's a different discussion.)
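
      The exception-path suspicion can be illustrated the same way. This is only a sketch under assumptions: `PinnedScanner`, `scanUnguarded`, `scanGuarded` and `refCountAfterFailures` are hypothetical names, the failure is simulated, and nothing here claims that this is the actual cause of the high counts we observed.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ExceptionPathSketch {

    /** Hypothetical scanner that increments a shared count on open and decrements it on close. */
    static final class PinnedScanner implements AutoCloseable {
        private final AtomicInteger refCount;
        PinnedScanner(AtomicInteger refCount) {
            this.refCount = refCount;
            refCount.incrementAndGet();
        }
        void next(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated scan failure");
            }
        }
        @Override public void close() {
            refCount.decrementAndGet();
        }
    }

    /** close() is skipped whenever next() throws, so the reference is lost. */
    static void scanUnguarded(AtomicInteger refCount, boolean fail) {
        PinnedScanner scanner = new PinnedScanner(refCount);
        scanner.next(fail);
        scanner.close();
    }

    /** try-with-resources closes the scanner on both the normal and the exceptional path. */
    static void scanGuarded(AtomicInteger refCount, boolean fail) {
        try (PinnedScanner scanner = new PinnedScanner(refCount)) {
            scanner.next(fail);
        }
    }

    /** Runs the given number of failing scans and returns the count left behind. */
    static int refCountAfterFailures(int scans, boolean guarded) {
        AtomicInteger refCount = new AtomicInteger();
        for (int i = 0; i < scans; i++) {
            try {
                if (guarded) {
                    scanGuarded(refCount, true);
                } else {
                    scanUnguarded(refCount, true);
                }
            } catch (IllegalStateException expected) {
                // The caller survives the failure; only the reference may be lost.
            }
        }
        return refCount.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + refCountAfterFailures(1000, false)); // 1000
        System.out.println("guarded:   " + refCountAfterFailures(1000, true));  // 0
    }
}
```

      Each leaked reference here is silent: the request fails and is handled, and the count simply never comes back down.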

      Let's brainstorm some ways in which we can harden this.

      ram_krish, anoop.hbase, apurtell

      Attachments

        1. 22457-random-1.5.txt
          2 kB
          Lars Hofhansl

            People

              Assignee: Unassigned
              Reporter: Lars Hofhansl (larsh)
              Votes: 0
              Watchers: 14