HDDS-1564: Ozone multi-raft support

Details

    Description

      Apache Ratis supports multi-raft by allowing the same node to be part of multiple raft groups. The proposal is to allow datanodes to be part of multiple raft groups. The attached design doc explains the reasons for doing this, as well as a few initial design decisions.

      Some of the work in this feature is also related to HDDS-700, which implements rack-aware container placement for closed containers.
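
As a rough illustration of the idea (not taken from the attached design doc or the patch), the sketch below models datanodes that can join several raft groups (pipelines) at once, subject to a per-node cap. The names `PipelineRegistry`, `max_pipelines_per_node`, and `create_pipeline` are hypothetical and are not Ozone or Ratis APIs; the cap is an assumption suggested by the sub-tasks that mention a pipeline limit.

```python
from collections import defaultdict


class PipelineRegistry:
    """Toy model: tracks which pipelines (raft groups) each datanode has joined."""

    def __init__(self, max_pipelines_per_node):
        # Hypothetical per-datanode cap; not an actual Ozone config key.
        self.max_pipelines_per_node = max_pipelines_per_node
        self.pipelines = {}                     # pipeline id -> tuple of datanodes
        self.node_pipelines = defaultdict(set)  # datanode -> set of pipeline ids
        self._next_id = 0

    def create_pipeline(self, nodes):
        """Register a pipeline over `nodes`; return None if any node is at its cap."""
        nodes = tuple(nodes)
        if len(set(nodes)) != len(nodes):
            raise ValueError("a pipeline needs distinct datanodes")
        if any(len(self.node_pipelines[n]) >= self.max_pipelines_per_node
               for n in nodes):
            return None
        pipeline_id = self._next_id
        self._next_id += 1
        self.pipelines[pipeline_id] = nodes
        for n in nodes:
            self.node_pipelines[n].add(pipeline_id)
        return pipeline_id


registry = PipelineRegistry(max_pipelines_per_node=2)
first = registry.create_pipeline(["dn1", "dn2", "dn3"])
second = registry.create_pipeline(["dn1", "dn2", "dn4"])  # dn1, dn2 join a second group
third = registry.create_pipeline(["dn1", "dn5", "dn6"])   # rejected: dn1 is at its cap
```

With a cap of 1 this toy model reduces to the single-raft case, where each datanode belongs to at most one pipeline.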

      Attachments

        1. multiraft_performance_brief.pdf
          40 kB
          Li Cheng
        2. multi-raft.patch
          193 kB
          Li Cheng
        3. Ozone Multi-Raft Support.pdf
          358 kB
          Siddharth Wagle

        Issue Links

          1. Create an interface for pipeline placement policy to support network topologies
            Sub-task, Resolved, Li Cheng
            Time Spent: 1h
          2. Add default pipeline placement policy implementation
            Sub-task, Resolved, Li Cheng
            Time Spent: 5h 20m
          3. Add CLI createPipeline
            Sub-task, Resolved, Li Cheng
            Time Spent: 1h 20m
          4. Add ability to SCM for creating multiple pipelines with same datanode
            Sub-task, Resolved, Li Cheng
            Time Spent: 8.5h
          5. Fix tests using MiniOzoneCluster for its memory related exceptions
            Sub-task, Resolved, Li Cheng
          6. Implement a Pipeline scrubber to clean up non-OPEN pipeline
            Sub-task, Resolved, Li Cheng
            Time Spent: 40m
          7. Refactor heartbeat reports to report all the pipelines that are open
            Sub-task, Resolved, Li Cheng
          8. Add scrubber metrics and pipeline metrics
            Sub-task, Resolved, Li Cheng
          9. Fix createPipeline and make createPipeline CLI message based.
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          10. Multi-raft support on single datanode integration test
            Sub-task, Resolved, Unassigned
          11. Refactor to create pipeline via DN heartbeat response
            Sub-task, Resolved, Sammi Chen
          12. Create SCMPipelineAllocationManager as background thread for pipeline creation
            Sub-task, Resolved, Li Cheng
          13. Support join multiple pipelines on datanode
            Sub-task, Resolved, Sammi Chen
          14. Add a scrubber thread to detect creation failure pipelines in ALLOCATED state
            Sub-task, Resolved, Sammi Chen
          15. Average out pipeline allocation on datanodes and add metrics/test
            Sub-task, Resolved, Li Cheng
            Time Spent: 10m
          16. Implement datanode level CLI to reveal pipeline info
            Sub-task, Resolved, Li Cheng
            Time Spent: 10m
          17. Support configure more than one raft log storage to host multiple pipelines
            Sub-task, Resolved, Sammi Chen
            Time Spent: 10m
          18. Handle pipeline creation failure in different way when it exceeds pipeline limit
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          19. Add smoke/acceptance test for createPipeline CLI and datanode list CLI
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          20. Better management for pipeline creation limitation
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          21. Rename multi-raft ozone default configs for better understanding.
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          22. Fix Pipeline#nodeIdsHash collision issue
            Sub-task, Resolved, Xiaoyu Yao
            Time Spent: 20m
          23. Add fall-back protection for rack awareness in pipeline creation
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m
          24. Fix CI test failure for TestSCMNodeManager
            Sub-task, Resolved, Li Cheng
            Time Spent: 20m

            People

              timmylicheng Li Cheng
              swagle Siddharth Wagle
              Votes: 1
              Watchers: 15

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20h 20m