HBASE-8104

HBase consistency and availability after replication

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.94.3
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      HBase consistency and availability after replication
      The scenario is as follows:
      1. There are two HBase clusters, a master cluster and a slave cluster, with replication enabled between them.
      2. If the master cluster has problems, all read and write requests are switched to the slave cluster.
      3. After a period of time, we need to switch back to the master cluster. Part of the data will then be inconsistent, which makes that part of the data unavailable.

      This feature is particularly important for HBase clusters that provide online services.
      So I want to use a write-back program to keep the data consistent and thereby improve HBase availability.
      We will provide a patch for this function.
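To make the write-back idea concrete, here is a minimal sketch of one way such a program could reconcile two clusters. It is not the reporter's patch (none is attached to this issue). It assumes each cluster can be viewed as a mapping from (row, column) to (timestamp, value), and the names `find_missing_or_stale` and `write_back` are hypothetical. Deletes are ignored for simplicity.

```python
from typing import Dict, Tuple

# Assumption: a cell is keyed by (row, column) and stores (timestamp, value).
Cell = Tuple[int, bytes]
Table = Dict[Tuple[str, str], Cell]


def find_missing_or_stale(source: Table, target: Table) -> Table:
    """Return cells of `source` that `target` lacks or holds with an older timestamp."""
    diff: Table = {}
    for key, (ts, value) in source.items():
        current = target.get(key)
        if current is None or current[0] < ts:
            diff[key] = (ts, value)
    return diff


def write_back(source: Table, target: Table) -> int:
    """Copy missing or newer cells from `source` into `target`; return how many were written."""
    diff = find_missing_or_stale(source, target)
    target.update(diff)
    return len(diff)


# Example: the slave took writes during the failover, then the master comes back.
master: Table = {("r1", "cf:a"): (1, b"x")}
slave: Table = {("r1", "cf:a"): (2, b"y"), ("r2", "cf:a"): (3, b"z")}
written = write_back(slave, master)
```

Comparing timestamps keeps the operation idempotent: running it a second time writes nothing.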

        Activity

        Lars Hofhansl added a comment -

        Can you set up the two clusters in Master-Master replication? That way all changes made to the "slave" cluster during the failover are scheduled to be replicated back to the "main" cluster once that becomes available.

        Brian Fu added a comment -

        Hi, Lars Hofhansl
        Master-Master cluster replication will lead to an endless data loop between the two clusters.

        Lars Hofhansl added a comment -

        In 0.94+ it won't.
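A toy model of why the loop does not occur may help here. As far as I understand, each replicated edit carries the ID of the cluster that first accepted it, and a cluster does not ship an edit back to the cluster it came from. The classes and method names below are hypothetical and only simulate that rule; they are not the HBase API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edit:
    row: str
    value: str
    origin: str  # ID of the cluster that first accepted the write


@dataclass
class Cluster:
    cluster_id: str
    peer: Optional["Cluster"] = None
    log: List[Edit] = field(default_factory=list)
    shipped: int = 0  # number of log entries already examined for shipping

    def client_write(self, row: str, value: str) -> None:
        # A client write is tagged with this cluster's own ID.
        self.log.append(Edit(row, value, origin=self.cluster_id))

    def apply_replicated(self, edit: Edit) -> None:
        # A replicated edit keeps its original origin tag.
        self.log.append(edit)

    def replicate(self) -> int:
        """Ship pending edits to the peer, skipping edits that originated there."""
        sent = 0
        while self.shipped < len(self.log):
            edit = self.log[self.shipped]
            self.shipped += 1
            if edit.origin == self.peer.cluster_id:
                continue  # the peer already has it; shipping it back would loop
            self.peer.apply_replicated(edit)
            sent += 1
        return sent


a, b = Cluster("A"), Cluster("B")
a.peer, b.peer = b, a
a.client_write("r1", "v1")
first = a.replicate()   # A ships r1 to B
second = b.replicate()  # B sees origin == "A" and ships nothing back
```

Without the origin check, `b.replicate()` would send `r1` back to A, and the two clusters would exchange the same edit forever.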

        Brian Fu added a comment -

        Thank you for your answer.
        Even with Master-Master replication, inconsistent data may still appear.
        Suppose we have two master clusters, A and B.
        If cluster A has problems, all read and write requests are switched to cluster B.
        Some data in cluster A may not yet have been replicated to cluster B,
        so that data cannot be read from cluster B in time.

        Chris Trezzo added a comment -

        Brian Fu, is your intent to make replication synchronous across HBase clusters?

        Jieshan Bean added a comment -

        I think so, this is the only way we can do that. But I don't think we really need that.

        Enis Soztutar added a comment -

        In your case, you have to ensure two conditions:
        (1) Only one master cluster is active at any given time.
        (2) Data is not read from one cluster, unless all the data is replicated from the other cluster.

        Achieving (1) is easy. (2) might not even be possible if you do not have synchronous replication, since eventual updates might not be propagated for extended periods of time.
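Condition (2) can be illustrated with a small simulation of asynchronous replication. This is only a sketch under stated assumptions: the class `AsyncReplicatedPair` and its methods are hypothetical, and the pending edits are modeled as a simple in-memory queue.

```python
from collections import deque


class AsyncReplicatedPair:
    """Toy primary/standby pair where edits reach the standby only when shipped."""

    def __init__(self):
        self.primary = {}
        self.standby = {}
        self.queue = deque()  # edits accepted by the primary, not yet on the standby

    def write(self, row, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[row] = value
        self.queue.append((row, value))

    def ship(self, max_edits):
        # Replication runs in the background and may fall behind.
        for _ in range(min(max_edits, len(self.queue))):
            row, value = self.queue.popleft()
            self.standby[row] = value

    def lag(self):
        return len(self.queue)

    def standby_safe_to_read(self):
        # Condition (2): read from the standby only when nothing is pending.
        return self.lag() == 0


pair = AsyncReplicatedPair()
pair.write("r1", "v1")
pair.write("r2", "v2")
pair.ship(1)  # only one edit makes it across before the "failure"
```

At this point `r2` exists only on the primary. In a real outage the pending queue sits on the failed cluster, so the standby may not even be able to tell how far behind it is, which is why (2) is hard to guarantee without synchronous replication.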

        Brian Fu added a comment -

        Chris Trezzo
        No. I want to implement a function that automatically writes the inconsistent data to the target cluster after one cluster fails.

        Chris Trezzo added a comment -

        Brian Fu, when you say "one cluster fail", what type of failure are you referring to? Are you trying to cover the case where the HBase application layer is unable to replicate data, but the HDFS cluster and ZooKeeper cluster are still available?

        In the other cases I can think of, your implemented function would be in the same situation as the HBase replication code (unless I am misunderstanding the scenario).

        Brian Fu added a comment -

        Chris Trezzo, by "one cluster fail" I mean that the HBase and ZooKeeper clusters are unavailable, but the HDFS cluster is OK.

        This function is not like replication.

        Chris Trezzo added a comment -

        Ah I see.


          People

          • Assignee: Unassigned
          • Reporter: Brian Fu
          • Votes: 0
          • Watchers: 7

              Time Tracking

              Original Estimate: 336h
              Remaining Estimate: 336h
              Time Spent: Not Specified