Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-12402

ZKPermissionWatcher race condition in refreshing the cache leaving stale ACLs and causing AccessDenied

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.98.8, 0.99.2
    • None
    • None
    • Reviewed

    Description

      In testing, we have seen an issue where a region in a newly created table will throw AccessDeniedException.

      There seems to be a race condition in the ZKPermissionWatcher when it is just starting up, and a new table is created around the same time.

      The master just created the table, and adds permissions to acl table:

      2014-10-30 19:21:26,494 DEBUG [MASTER_TABLE_OPERATIONS-ip-172-31-32-87:60000-0] access.AccessControlLists: Writing permission with rowKey loadtest_d1 hrt_qa: RWXCA
      

      One of the region servers is just starting:

      Thu Oct 30 19:21:11 UTC 2014 Starting regionserver on ip-172-31-32-90
      2014-10-30 19:21:13,915 INFO  [main] util.VersionInfo: HBase 0.98.4.2.2.0.0-1194-hadoop2
      

      The node creation event is received

      2014-10-30 19:21:26,764 DEBUG [regionserver60020-EventThread] access.ZKPermissionWatcher: Updating permissions cache from node loadtest_d1 with data: PBUF\x0A0\x0A\x06hrt_qa\x12&\x08\x03""\x0A\x16\x0A\x07default\x12\x0Bloadtest_d1 \x00 \x01 \x02 \x03 \x04
      

      which put the write data to the cache, only to be invalidated later shortly:

      ...
      2014-10-30 19:21:26,855 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node tabletwo_copytable_cell_versions_two with data: PBUF\x0AI\x0A\x06hrt_qa\x12?\x08\x03";\x0A/\x0A\x07default\x12$tabletwo_copytable_cell_versions_two \x00 \x01 \x02 \x03 \x04
      2014-10-30 19:21:26,856 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node loadtest_d1 with data: PBUF
      2014-10-30 19:21:26,856 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node tablefour_cell_version_snapshots_copy with data: PBUF
      ...
      

      Notice that the threads are different. The first one is the zk event notification thread, vs the other is the thread from OpenRegionHandler.

      Attachments

        1. hbase-12402_v1.patch
          3 kB
          Enis Soztutar

        Activity

          People

            enis Enis Soztutar
            enis Enis Soztutar
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: