  HBase / HBASE-18282

ReplicationLogCleaner can delete WALs not yet replicated in case of a KeeperException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.1, 1.2.6, 1.1.11, 2.0.0-alpha-1
    • Fix Version/s: 1.3.2, 1.4.2, 2.0.0, 1.2.7
    • Component/s: Replication
    • Labels: None

    Description

      ReplicationStateZKBase#getListOfReplicators catches a KeeperException instead of rethrowing it and returns null. ReplicationLogCleaner interprets that null as "there are no replicators" and allows every WAL to be deleted, including WALs that have not yet been replicated.

      ReplicationStateZKBase:

      public List<String> getListOfReplicators() {
        List<String> result = null;
        try {
          result = ZKUtil.listChildrenNoWatch(this.zookeeper, this.queuesZNode);
        } catch (KeeperException e) {
          // The exception is not rethrown, so result is still null when the method returns.
          this.abortable.abort("Failed to get list of replicators", e);
        }
        return result;
      }
      

      ReplicationLogCleaner:

      private Set<String> loadWALsFromQueues() throws KeeperException {
        for (int retry = 0; ; retry++) {
          int v0 = replicationQueues.getQueuesZNodeCversion();
          List<String> rss = replicationQueues.getListOfReplicators();
          // A null list caused by a failed ZooKeeper read looks the same here as "no replicators".
          if (rss == null) {
            LOG.debug("Didn't find any region server that replicates, won't prevent any deletions.");
            return ImmutableSet.of();
          }
          ...
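
      One possible direction, shown only as a minimal sketch and not as the contents of the attached patches: let the KeeperException propagate out of the queue lookup, and have the cleaner report no files as deletable whenever the replication state cannot be read. The class ReplicationCleanerSketch, the interface QueuedWalSource, and the methods loadQueuedWals and deletableWals are hypothetical names introduced for illustration. KeeperException is redeclared locally as a stand-in so the example compiles without ZooKeeper on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicationCleanerSketch {

    /** Local stand-in for ZooKeeper's KeeperException (assumption: keeps the sketch self-contained). */
    public static class KeeperException extends Exception {
        public KeeperException(String message) {
            super(message);
        }
    }

    /** Hypothetical source of the WAL names still referenced by replication queues. */
    public interface QueuedWalSource {
        // Propagates the failure instead of returning null.
        Set<String> loadQueuedWals() throws KeeperException;
    }

    /**
     * Returns the candidates that are safe to delete. If the queue state
     * cannot be read, no file is reported as deletable.
     */
    public static List<String> deletableWals(List<String> candidates, QueuedWalSource source) {
        final Set<String> queued;
        try {
            queued = source.loadQueuedWals();
        } catch (KeeperException e) {
            // Unknown queue state: keep every WAL rather than risk deleting unreplicated ones.
            return new ArrayList<>();
        }
        List<String> deletable = new ArrayList<>();
        for (String wal : candidates) {
            if (!queued.contains(wal)) {
                deletable.add(wal);
            }
        }
        return deletable;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("wal.1", "wal.2");
        // Successful read: wal.2 is still queued for replication, so only wal.1 is deletable.
        System.out.println(deletableWals(candidates,
            () -> new HashSet<String>(Arrays.asList("wal.2"))));
        // Failed read: the lookup throws, and the cleaner reports nothing as deletable.
        System.out.println(deletableWals(candidates,
            () -> { throw new KeeperException("connection loss"); }));
    }
}
```

      The design choice in this sketch is to fail closed: a read error is treated as "state unknown" and old WALs are retained until a later cleaner run can read the queues again, at the cost of temporarily holding on to files that could have been removed.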
      

      Attachments

        1. HBASE-18282-branch-1.patch (12 kB, Ben Lau)
        2. HBASE-18282-branch-2.patch (13 kB, Ben Lau)
        3. HBASE-18282-branch-1-v2.patch (12 kB, Ben Lau)
        4. HBASE-18282-branch-2-v2.patch (13 kB, Ben Lau)

          People

            Assignee: Ben Lau
            Reporter: Ashu Pachauri
            Votes: 0
            Watchers: 11
