HBASE-4060: Making region assignment more robust

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      From Eran Kutner:
      My concern is that the region allocation process relies too much on timing and does not take enough measures to guarantee that conflicts cannot occur. I understand that in a distributed environment, when you don't get a timely response from a remote machine, you cannot know for sure whether it received the request; however, there are things that can be done to mitigate this and to shrink the conflict window significantly. For example, when I run hbck it knows that some regions are multiply assigned; the master could do the same and try to resolve the conflict. Another approach would be to handle late responses: even if the response from the remote machine arrives after it was assumed to be dead, the master should have enough information to know it has created a conflict by assigning the region to another server. An even better solution, I think, is for the RS to periodically verify that it is indeed the rightful owner of every region it holds and to relinquish control over any region it does not own.
      Obviously a state where two RSs hold the same region is pathological and can lead to data loss, as demonstrated in my case. The system should be able to actively protect itself against such a scenario. It probably doesn't need saying, but there is really nothing worse for a data storage system than data loss.

      In my case the problem didn't happen in the initial phase but only after disabling and re-enabling a table with about 12K regions.

      For more background information, see the 'Errors after major compaction' discussion on user@hbase.apache.org.
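
      The last suggestion above (a region server periodically confirming that it is still the recorded owner of every region it serves, and giving up any region it is not) could look roughly like the sketch below. This is not actual HBase code: AssignmentView and RegionHost are hypothetical placeholders standing in for the authoritative assignment record (meta / ZooKeeper) and for the region server's own state, used only to illustrate the check-and-relinquish loop.

{code:java}
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative sketch only: periodically re-check that this server is still
 * the recorded owner of every region it serves, and stop serving any region
 * whose recorded owner is some other server.
 */
public class RegionOwnershipVerifier {

  /** Hypothetical view of the authoritative assignment record (e.g. meta). */
  interface AssignmentView {
    /** Server name currently recorded as the owner of the region, or null. */
    String ownerOf(String regionName);
  }

  /** Hypothetical view of this region server's own state. */
  interface RegionHost {
    String serverName();
    List<String> onlineRegions();
    /** Stop serving a region this server no longer owns. */
    void relinquish(String regionName);
  }

  private final AssignmentView assignments;
  private final RegionHost host;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public RegionOwnershipVerifier(AssignmentView assignments, RegionHost host) {
    this.assignments = assignments;
    this.host = host;
  }

  /** Re-verify ownership of every online region once per period. */
  public void start(long periodSeconds) {
    scheduler.scheduleAtFixedRate(this::verifyOnce, periodSeconds,
        periodSeconds, TimeUnit.SECONDS);
  }

  void verifyOnce() {
    for (String region : host.onlineRegions()) {
      String recordedOwner = assignments.ownerOf(region);
      // If the authoritative record names another server, relinquish the
      // region rather than keep serving it and risk a double assignment.
      if (recordedOwner != null && !recordedOwner.equals(host.serverName())) {
        host.relinquish(region);
      }
    }
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}

      The useful property of such a check is that it consults the same record the master updates when it reassigns a region, so even after a lost or late RPC the window during which two servers both serve a region is bounded by one verification period.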

        Activity

        Ted Yu created issue -
        Ted Yu made changes -
        Summary: ServerShutdownHandler.FindDaughterVisitor doesn't detect whether daughter region is assigned to some server → Making region assignment more robust
        Fix Version/s: 0.90.4 [ 12316406 ]
        Description: (set to the report from Eran Kutner quoted above)
        Ted Yu made changes -
        Assignee: Ted Yu [ yuzhihong@gmail.com ]
        Ted Yu made changes -
        Description: (appended the pointer to the 'Errors after major compaction' discussion on user@hbase.apache.org)
        Andrew Purtell made changes -
        Fix Version/s: 0.92.0 [ 12314223 ]
        Affects Version/s: 0.90.3 [ 12316313 ]
        Ted Yu made changes -
        Fix Version/s: 0.92.0 [ 12314223 ] → 0.94.0 [ 12316419 ]
        ramkrishna.s.vasudevan made changes -
        Assignee: ramkrishna.s.vasudevan [ ram_krish ]
        Lars Hofhansl made changes -
        Fix Version/s: 0.94.0 [ 12316419 ] → 0.96.0 [ 12320040 ]
        stack made changes -
        Fix Version/s: 0.96.0 [ 12320040 ] → (none)

          People

          • Assignee: ramkrishna.s.vasudevan
          • Reporter: Ted Yu
          • Votes: 0
          • Watchers: 5
