HBase / HBASE-12259

Bring quorum based write ahead log into HBase

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: wal
    • Labels: None

      Description

      HydraBase ( https://code.facebook.com/posts/321111638043166/hydrabase-the-evolution-of-hbase-facebook/ ), Facebook's implementation of HBase with Raft for consensus, will be going open source shortly. We should pull in the parts of that fb-0.89-based implementation and offer it as a feature in the next major release. Right now the HydraBase code base isn't ready to be released into the wild; it should be ready soon (for some definition of soon).

      Since Hydrabase is based upon 0.89 most of the code is not directly applicable. So lots of work will probably need to be done in a feature branch before a merge vote.

      Is this something that's wanted?

      Is there any clean up that needs to be done before the log implementation is able to be replaced like this?

      What's our story with upgrading to this? Are we ok with requiring downtime?

        Activity

        eclark Elliott Clark added a comment -

        Patrick White, Rishit Shroff, Manukranth Kolloju, and Yunfan Zhong will all probably play some role in getting this out, though at this point there are no concrete plans.

        posix4e Alex Newman added a comment -

        Similarly, https://github.com/cloud-software-foundation/c5-replicator is a Raft replicator which can provide this today. https://github.com/cloud-software-foundation/c5 uses the replicator to provide a WAL. We have started work on OLog, an HLog implementation that uses this Raft replication code.

        posix4e Alex Newman added a comment -

        I'd love it if we could combine forces. We could focus on getting a great LAN story; you guys could work on the WAN.

        posix4e Alex Newman added a comment -

        Also would love to get these both into the incubator.

        eclark Elliott Clark added a comment -

        If at all possible I think this should go into HBase rather than another project. I think the end result is something that most users of HBase would recognize. Since the final result is so close to HBase I think we gain a whole lot more by not fragmenting the community.

        posix4e Alex Newman added a comment -

        I would love to do it that way. How do we get it into HBase?

        busbey Sean Busbey added a comment -

        The refactoring work in HBASE-10378 should get finished up first. It obviates some other WAL clean up tickets and generally provides us with a cleaner separation of concerns than the current HLog.

        There's a simplified roadmap around WAL improvements on that ticket and a patch from a bit ago. Both should get updated in the next day or so with a version that I think is ready as a first pass implementation. One of the follow-ons is getting the WAL related code all into its own module, which I think will help a lot in getting the recovery side of things better isolated.

        enis Enis Soztutar added a comment -

        It would be good to have quorum writes to the WAL for both intra and inter cluster deployments natively in hbase (or a thin layer on top). We have already seen some usage of the region replica feature even without sync writes. This will not only help in write latencies, but we can also implement proper sync() on local disk more easily. I would imagine the RAFT library might be used elsewhere as well for multi-master, etc.

        I think once we have hydrabase opened, we can see how we can bring this in.

        stack stack added a comment -

        "Since Hydrabase is based upon 0.89 most of the code is not directly applicable. So lots of work will probably need to be done in a feature branch before a merge vote."

        Agree. A feature branch would be the way to go.

        "Is this something that's wanted?"

        Sounds good. Let's take a look at what is involved. (From what I know of HydraBase, it's an X-DC WAN story. An in-DC, in-cluster deploy is probably what we'd be interested in doing first.)

        "Is there any clean up that needs to be done before the log implementation is able to be replaced like this?"

        See Sean's comment above. He is doing a nice WAL interface refactor/cleanup. Does the current HydraBase need more than the current WAL API, or does it just work using the current API, with the magic going on behind the WAL append, sync, close, and roll calls?

        "What's our story with upgrading to this? Are we ok with requiring downtime?"

        If it requires downtime, that would imply a 2.0 (or 3.0) feature (IMO).
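
        To make the question about the WAL API concrete, here is a minimal editorial sketch of what "the magic going on behind append and sync" could look like for a quorum-based log. It is not HydraBase or HBase code: the names QuorumWalSketch, QuorumWal, and Replica are hypothetical, and a real Raft implementation would also need leader election, terms, and log repair, none of which is shown. The only idea illustrated is that sync() reports success once a majority of replicas has acknowledged each buffered edit.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class QuorumWalSketch {

    /** Hypothetical replica endpoint; returns true when it has stored the edit. */
    interface Replica {
        boolean replicate(long seq, byte[] edit);
    }

    /** Hypothetical WAL facade: callers see append/sync, the quorum lives behind sync. */
    static final class QuorumWal {
        private final List<Replica> replicas;
        private final Deque<byte[]> pending = new ArrayDeque<>();
        private long committedSeq = -1;

        QuorumWal(List<Replica> replicas) {
            this.replicas = replicas;
        }

        /** Buffers an edit locally; nothing is committed until sync() succeeds. */
        void append(byte[] edit) {
            pending.add(edit);
        }

        /** Commits buffered edits in order; stops at the first edit a majority does not ack. */
        boolean sync() {
            int majority = replicas.size() / 2 + 1;
            while (!pending.isEmpty()) {
                long seq = committedSeq + 1;
                int acks = 0;
                for (Replica r : replicas) {
                    if (r.replicate(seq, pending.peek())) {
                        acks++;
                    }
                }
                if (acks < majority) {
                    return false; // uncommitted edits stay buffered for a retry
                }
                pending.poll();
                committedSeq = seq;
            }
            return true;
        }

        long committedSeq() {
            return committedSeq;
        }
    }

    public static void main(String[] args) {
        Replica up = (seq, edit) -> true;    // replica that always acks
        Replica down = (seq, edit) -> false; // replica that never acks

        QuorumWal wal = new QuorumWal(Arrays.asList(up, up, down));
        wal.append(new byte[] {1});
        System.out.println(wal.sync());      // 2 of 3 acks, so the edit commits

        QuorumWal degraded = new QuorumWal(Arrays.asList(up, down, down));
        degraded.append(new byte[] {1});
        System.out.println(degraded.sync()); // 1 of 3 acks, so the edit stays pending
    }
}
```

        In this sketch the caller-facing surface stays the same as a single-writer log, which is the scenario the question above asks about; whether HydraBase actually fits behind the existing WAL API is not something this thread answers.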

        rshroff Rishit Shroff added a comment -

        stack, HydraBase aims at the X-DC WAN story, but it can easily be deployed in an in-DC, in-cluster setup. We will be able to reap the same benefits with the new quorum-based WAL.

        Refactoring the WAL into a separate module is a good idea and should make it easier to plug in the RAFT Consensus protocol.

        We should start this effort in a separate feature branch as there will be changes not only in the WAL but in other modules like HRegion/HMaster/etc. I will discuss the approach with other team members first before we lay down the roadmap for this integration.

        Regarding upgrades, I think taking downtime will be much cleaner, but we can look into rolling upgrades if that's a necessity.

        gaurav.menghani Gaurav Menghani added a comment -

        Created the first patch for the consensus protocol in (HBASE-12476).

        rshroff Rishit Shroff added a comment -

        High Level Architecture about HydraBase

        rshroff Rishit Shroff added a comment -

        I have attached a high level architecture overview document about the set of changes with the HydraBase project. Please take a look! Thanks!

        yuzhihong@gmail.com Ted Yu added a comment -

        Does 'ACTIVE-WITNESS' correspond to 'Active follower' in the diagram?

        stack stack added a comment -

        Thanks for pasting the doc, Rishit Shroff. Does it describe what the code over in HBASE-12476 delivers? Thanks.

        rshroff Rishit Shroff added a comment -

        stack, No. The code over in HBASE-12476 is the RAFT Protocol implementation. The document in this JIRA is about the overall architecture. Please let me know if you need any information regarding HBASE-12476.

        Ted Yu, No. The ACTIVE-WITNESS and SHADOW-WITNESS roles are both shown as 'Witness' in the diagram to keep it simple. The ACTIVE follower in the diagram means that:
        1. From the RAFT protocol perspective, that region server is a FOLLOWER.
        2. The same region server is the ACTIVE one for DC-2 and will be doing flushes/compactions to HDFS. However, it does not serve any client traffic.

        rshroff Rishit Shroff added a comment -

        I have created a high-level set of feature tasks (JIRAs) to list the minimum amount of work that needs to be done to integrate the HydraBase changes into HBase and bring a quorum-based WAL to HBase. As and when people start working on these JIRAs, we will add more details about the changes and, if required, file sub-tasks to make it easier to test/integrate.

        rshroff Rishit Shroff added a comment -

        Updating the doc to correct a typo in the Deployment

        rshroff Rishit Shroff added a comment -

        Details about the implementation of the RaftProtocol in the hbase-consensus module.

        Apache9 Duo Zhang added a comment -

        Any progress here?
        Thanks.

        patibandlas2 Siva Teja Patibandla added a comment -

        Hi, could someone please share the status of this feature integration?

        Thanks,
        Siva

        stack stack added a comment -

        Siva Teja Patibandla, this project was abandoned, sir.

        stack stack added a comment -

        Marking as won't fix, since this project was abandoned.


          People

          • Assignee: Unassigned
          • Reporter: eclark Elliott Clark
          • Votes: 5
          • Watchers: 79

              Time Tracking

              • Estimated: 24h
              • Remaining: 24h
              • Logged: Not Specified
