Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-1251

HConnectionManager.getConnection(HBaseConfiguration) returns same HConnection for different HBaseConfigurations

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.20.0
    • Component/s: Client
    • Labels:
      None

      Description

      This occurs when the following happens:

      1. Consider a client that invokes HBaseAdmin.checkHBaseAvailable(config) before doing anything. Although this method copies the HBaseConfiguration object and sets hbase.client.retries.number to 1 (see HBaseAdmin, line 751), it creates an HBaseAdmin object, which invokes HConnectionManager.getConnection(conf). Please notice that this conf is that with hbase.client.retries.number equals to 1.
      2. HConnectionManager.getConnection then creates a HConnection using this conf and puts it into a static map (see HConnectionManager, line 93) indexed by hbase.rootdir.
      3. Then, if the same client now creates a HTable object (using, for instance, a HBaseConfiguration with hbase.client.retries.number equals to 10 but the same hbase.rootdir), it will invoke HConnectionManager.getConnection(conf) again (see HTable, line 109). However, when it checks the static map for a HConnection it finds one - the one previously created by the HBaseAdmin object and using hbase.client.retries.number 1 - and returns it without creating a new one with the correct HBaseConfiguration.

      However, the expected behavior is: HConnectionManager must return different HConnections for different HBaseConfigurations.

      1. HBASE-1251.patch
        9 kB
        Guilherme Mauro Germoglio Barbosa

        Issue Links

          Activity

          Hide
          guiga Guilherme Mauro Germoglio Barbosa added a comment -

          I've edited HConnectionManager in order to use all HBaseConfiguration instead of only HBASE_DIR as key on the HBASE_INSTANCES map. To do so, I've implemented HBaseConfiguration.hashcode() - just like a HashMap.hashCode() is implemented.

          code review is needed.

          Show
          guiga Guilherme Mauro Germoglio Barbosa added a comment - I've edited HConnectionManager in order to use all HBaseConfiguration instead of only HBASE_DIR as key on the HBASE_INSTANCES map. To do so, I've implemented HBaseConfiguration.hashcode() - just like a HashMap.hashCode() is implemented. code review is needed.
          Hide
          stack stack added a comment -

          Patch looks great.

          Only downside I see is that if the HBC is changed – a value added or edited – then its hash will be different and we'll go build up a new TableServer but maybe this is your intent?

          On GSOC, figure what interests you Guilherme and lets make it happy. Write me off-list if you'd like.

          Show
          stack stack added a comment - Patch looks great. Only downside I see is that if the HBC is changed – a value added or edited – then its hash will be different and we'll go build up a new TableServer but maybe this is your intent? On GSOC, figure what interests you Guilherme and lets make it happy. Write me off-list if you'd like.
          Hide
          jimk Jim Kellerman added a comment -

          The biggest downside I see is that since a new TableServers is built up if the configuration changes, the old TableServers will
          be orphaned if the old configuration is never used again. The old TableServers cannot be garbage collected because the
          HBASE_INSTANCES map still holds a reference to it. Could potentially use up a not insignificant amount of memory if this
          is done too often.

          A couple of other comments:

          • Why add hash codes in HBaseConfiguration.hashCode rather than use xor ? I would think using ^= rather than += would produce a better hash value.
          • Some of the indentation is inconsistent. 4 spaces instead of two. That should be fixed.
          Show
          jimk Jim Kellerman added a comment - The biggest downside I see is that since a new TableServers is built up if the configuration changes, the old TableServers will be orphaned if the old configuration is never used again. The old TableServers cannot be garbage collected because the HBASE_INSTANCES map still holds a reference to it. Could potentially use up a not insignificant amount of memory if this is done too often. A couple of other comments: Why add hash codes in HBaseConfiguration.hashCode rather than use xor ? I would think using ^= rather than += would produce a better hash value. Some of the indentation is inconsistent. 4 spaces instead of two. That should be fixed.
          Hide
          guiga Guilherme Mauro Germoglio Barbosa added a comment -

          Stack:
          yes, this is my intent. I think the best scenario would be the hash value be affected only when the values that are used for the creation of connections are modified. However, it would be quite annoying to maintain the list of these values for every release.

          Jim:
          I think the use of a WeakHashMap in HBASE_INSTANCES can solve the probem of old orphaned TableServers. What do you think? This way the old TableServers could be garbage collected.
          Related to the hash code, I've used addition because it is the way java.util.HashMaps compute their hash value (I have no argument against xor).

          Show
          guiga Guilherme Mauro Germoglio Barbosa added a comment - Stack: yes, this is my intent. I think the best scenario would be the hash value be affected only when the values that are used for the creation of connections are modified. However, it would be quite annoying to maintain the list of these values for every release. Jim: I think the use of a WeakHashMap in HBASE_INSTANCES can solve the probem of old orphaned TableServers. What do you think? This way the old TableServers could be garbage collected. Related to the hash code, I've used addition because it is the way java.util.HashMaps compute their hash value (I have no argument against xor).
          Hide
          guiga Guilherme Mauro Germoglio Barbosa added a comment -
          • indentation fixed
          • hash is now computed using xor
          • HBASE_INSTANCES is now a WeakHashMap in order to make orphan TableServers free to be garbage collected.
          Show
          guiga Guilherme Mauro Germoglio Barbosa added a comment - indentation fixed hash is now computed using xor HBASE_INSTANCES is now a WeakHashMap in order to make orphan TableServers free to be garbage collected.
          Hide
          stack stack added a comment -

          Patch looks great. Thanks.

          If you want to figure out google summer of code project, write me off list; lets chat.

          Show
          stack stack added a comment - Patch looks great. Thanks. If you want to figure out google summer of code project, write me off list; lets chat.
          Hide
          stack stack added a comment -

          Applied. Thank you for the patch Guilherme.

          Show
          stack stack added a comment - Applied. Thank you for the patch Guilherme.

            People

            • Assignee:
              Unassigned
              Reporter:
              guiga Guilherme Mauro Germoglio Barbosa
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development