| 1. | Standalone SCM RatisServer | Resolved | Li Cheng |
| 2. | SCM StateMachine | Resolved | Li Cheng |
| 3. | Introduce generic SCMRatisRequest and SCMRatisResponse | Resolved | Nandakumar |
| 4. | SCM Invoke Handler for Ratis calls | Resolved | Nandakumar |
| 5. | Refactor configuration in SCMRatisServer to Java-based configuration | Resolved | Li Cheng |
| 6. | Handle AllocateContainer operation for HA | Resolved | Nandakumar |
| 7. | New PipelineManager interface to persist to RatisServer | Resolved | Li Cheng |
| 8. | Switch to PipelineStateManagerV2 and put PipelineFactory in PipelineManager | Resolved | Li Cheng |
| 9. | Introduce SCMStateMachineHandler marker interface | Resolved | Nandakumar |
| 10. | Add unit tests for new PipelineManager interface | Resolved | Li Cheng |
| 11. | Add unit test for SCMRatisResponse | Resolved | Li Cheng |
| 12. | Add unit test for SCMRatisRequest | Resolved | Li Cheng |
| 13. | Handle inner classes in SCMRatisRequest and SCMRatisResponse | Resolved | Nandakumar |
| 14. | decouple finalize and destroy pipeline | Resolved | Li Cheng |
| 15. | Implement container related operations in ContainerManagerImpl | Resolved | Nandakumar |
| 16. | Switch current pipeline interface to the new Replication based interface to write to Ratis | Resolved | Glen Geng |
| 17. | Add isLeader check for SCM state updates | Resolved | Li Cheng |
| 18. | remove the 1st edition of RatisServer of SCM HA which is copied from OM HA | Resolved | Glen Geng |
| 19. | update RATIS version from 1.0.0 to 1.1.0-85281b2-SNAPSHOT | Resolved | Glen Geng |
| 20. | RATIS ONE Pipeline is closed but not removed when a datanode goes stale | Resolved | Glen Geng |
| 21. | Pipeline is not removed when a datanode goes stale | Resolved | Glen Geng |
| 22. | Add failover proxy to SCM block protocol | Resolved | Li Cheng |
| 23. | enable SCM Raft Group based on config ozone.scm.names | Resolved | Glen Geng |
| 24. | CLI command to show current SCM leader and follower status | Resolved | Rui Wang |
| 25. | Switch to ContainerManagerV2 | Resolved | Li Cheng |
| 26. | SCMBlockLocationFailoverProxyProvider should use ScmBlockLocationProtocolPB.class in RPC.setProtocolEngine | Resolved | Glen Geng |
| 27. | Handle PipelineAction and OpenPipeline from DN to SCM | Resolved | Unassigned |
| 28. | Make sure AllocateBlock can only be executed on leader SCM | Resolved | Unassigned |
| 29. | Handle NodeReport from DN to SCMs | Resolved | Unassigned |
| 30. | Handle events fired from PipelineManager to close container | Resolved | Unassigned |
| 31. | Handle ContainerReport and IncrementalContainerReport | Resolved | Unassigned |
| 32. | Replication can only be executed on leader | Resolved | Unassigned |
| 33. | Use new ContainerManager in SCM | Resolved | Nandakumar |
| 34. | Add failover proxy for SCM container client | Resolved | Li Cheng |
| 35. | DN can distinguish SCMCommand from stale leader SCM | Resolved | Glen Geng |
| 36. | Fix CI and test failures after force push on 2020/10/26 | Resolved | Nandakumar |
| 37. | Fix TestMiniOzoneHACluster.testGetOMLeader() | Resolved | Rui Wang |
| 38. | Add ReadWriteLock into PipelineStateManagerV2Impl to protect contentions between RaftServer and PipelineManager | Resolved | Glen Geng |
| 39. | Need throw exception to trigger FailoverProxyProvider of SCM client to work | Resolved | Glen Geng |
| 40. | Remove checkLeader in PipelineManager. | Resolved | Glen Geng |
| 41. | Add tests for replication annotation | Resolved | Rui Wang |
| 42. | SCM ServiceManager | Resolved | Glen Geng |
| 43. | Use getRoleInfoProto() in isLeader check | Resolved | Glen Geng |
| 44. | Handle stale leader issue | Resolved | Unassigned |
| 45. | Add Snapshot into new SCMRatisServer and SCMStateMachine | Resolved | Rui Wang |
| 46. | SCM needs to replay RaftLog for recovery | Resolved | Rui Wang |
| 47. | BackgroundPipelineCreator can only serve leader | Resolved | Unassigned |
| 48. | Implement Ratis Snapshots on SCM | Resolved | Rui Wang |
| 49. | DeleteBlock via Ratis in SCM HA | Resolved | runzhiwang |
| 50. | Load Snapshot info upon SCM Ratis starts | Resolved | Rui Wang |
| 51. | Allow Enabling Purge SCM Ratis log | Resolved | Rui Wang |
| 52. | Stop BackgroundPipelineCreator when PipelineManager is closed | Resolved | Rui Wang |
| 53. | SCMStateMachine::applyTransaction() should not invoke TransactionContext.getClientRequest() | Resolved | Glen Geng |
| 54. | Fix SCMHAManager#getPeerIdFromRoleInfo | Resolved | Glen Geng |
| 55. | Update pipeline db when pipeline state is changed | Resolved | Shashikant Banerjee |
| 56. | Avoid rewriting pipeline information during PipelineStateManagerV2Impl initialization | Resolved | Rui Wang |
| 57. | SCMContext Phase 1 - Raft Related Info | Resolved | Glen Geng |
| 58. | SCMContext | Resolved | Glen Geng |
| 59. | Handle potential data loss during ReplicationManager.handleOverReplicatedContainer() | Resolved | Glen Geng |
| 60. | Refactor SCMHAManager and SCMRatisServer with RaftServer.Division | Resolved | Glen Geng |
| 61. | Use OM style Configuration to initialize SCM HA | Resolved | Rui Wang |
| 62. | PipelineStateManagerV2Impl#removePipeline will remove pipeline from db in case of failure | Resolved | Jie Yao |
| 63. | acceptance test for SCM HA | Resolved | Bharat Viswanadham |
| 64. | Use suggestedLeader for SCM failover proxy performing failover | Resolved | Unassigned |
| 65. | Bootstrap SCM HA Security | Resolved | Bharat Viswanadham |
| 66. | Use single server raft cluster in MiniOzoneCluster. | Resolved | Glen Geng |
| 67. | Fix set configs in SCMHAConfiguration | Resolved | Rui Wang |
| 68. | min/max election timeout of SCMRatisServer is not set properly. | Resolved | Glen Geng |
| 69. | Solve deadlock triggered by PipelineActionHandler. | Resolved | Glen Geng |
| 70. | Add term into SetNodeOperationalStateCommand. | Resolved | Glen Geng |
| 71. | Fix SCMHAManagerImpl#isLeader after RATIS-1227 | Resolved | Unassigned |
| 72. | Implement DB buffer in MockHAManager | Resolved | Rui Wang |
| 73. | Change default SCM snapshot frequency to a lower value | Resolved | Rui Wang |
| 74. | Ratis Snapshot should be loaded from the config | Resolved | Rui Wang |
| 75. | Implement Distributed Sequence ID Generator | Closed | Glen Geng |
| 76. | replace scmID with clusterID for container and volume at Datanode side | Resolved | Glen Geng |
| 77. | Fix Recon after HDDS-4133 | Resolved | Nandakumar |
| 78. | Should disallow log purge before installSnapshot is implemented | Resolved | Rui Wang |
| 79. | Backport updates from PipelineManager(V1) | Resolved | Unassigned |
| 80. | Handle pipeline reports | Resolved | Unassigned |
| 81. | Handle ContainerAction and CloseContainer | Resolved | Unassigned |
| 82. | Provide docker-compose for SCM HA | Resolved | Unassigned |
| 83. | SafeMode exit rule for all SCMs | Resolved | Swaminathan Balachandran |
| 84. | Use applyTransactionSerial instead of applyTransaction | Resolved | Rui Wang |
| 85. | Merge OMTransactionInfo with SCMTransactionInfo | Resolved | Shashikant Banerjee |
| 86. | Support encode and decode ArrayList and Long | Resolved | runzhiwang |
| 87. | Replace UniqueID by the Distributed Sequence ID Generator | Resolved | Rui Wang |
| 88. | Bootstrap new SCM node | Resolved | Shashikant Banerjee |
| 89. | Admin command should take effect on all SCM instances | Resolved | Glen Geng |
| 90. | Add STOP state to SCMService. | Resolved | Unassigned |
| 91. | activatePipeline/deactivatePipeline in PipelineManagerV2Impl should acquire lock before calling StateManager#updatePipelineState. | Resolved | Xu Shao Hong |
| 92. | Add functionality to transfer Rocks db checkpoint from leader to follower | Resolved | Shashikant Banerjee |
| 93. | Implement increment count optimization in DeletedBlockLog V2 | Resolved | Rui Wang |
| 94. | Add transactionId into deletingTxIDs when remove it from DB | Resolved | runzhiwang |
| 95. | Merge SCMRatisSnapshotInfo and OMRatisSnapshotInfo into a single class | Resolved | Shashikant Banerjee |
| 96. | Disable Prevote in Ratis in SCM HA by default | Resolved | Rui Wang |
| 97. | Fix findbugs issues after HDDS-2195 | Resolved | Glen Geng |
| 98. | Fix TestContainerEndpoint after merging master to HDDS-2823. | Resolved | Glen Geng |
| 99. | Add install checkpoint in SCMStateMachine | Resolved | Shashikant Banerjee |
| 100. | Fix misc acceptance test: List pipelines on unknown host | Resolved | Glen Geng |
| 101. | Fix TestReconContainerManager after merge master to HDDS-2823 | Resolved | Glen Geng |
| 102. | Integrate DeleteBlockLog with PartialTableCache | Resolved | Unassigned |
| 103. | Add multiple SCM nodes to MiniOzoneCluster | Resolved | Shashikant Banerjee |
| 104. | [SCM HA Security] Implement generate SCM certificate | Resolved | Bharat Viswanadham |
| 105. | Use SCM service ID in SCMBlockClient and SCM Client | Resolved | Bharat Viswanadham |
| 106. | Implement scm --bootstrap command | Resolved | Shashikant Banerjee |
| 107. | Make SCM Generic config support HA Style | Resolved | Bharat Viswanadham |
| 108. | Move Ratis group creation to scm --init phase | Resolved | Shashikant Banerjee |
| 109. | Rename MiniOzoneHACluster to MiniOzoneOMHACluster | Resolved | Mukul Kumar Singh |
| 110. | Use SCM service ID in finding SCM Datanode address. | Resolved | Bharat Viswanadham |
| 111. | Make changes required for SCM admin commands to work with SCM HA | Resolved | Bharat Viswanadham |
| 112. | Reopen replication/wait.robot added by HDDS-4834 | Resolved | Glen Geng |
| 113. | Provide docker-compose for SCM HA | Resolved | Attila Doroszlai |
| 114. | Datanode with scmID format should work with clusterID directory format | Resolved | Mukul Kumar Singh |
| 115. | [SCM HA Security] Implement listCertificates based on role | Resolved | Bharat Viswanadham |
| 116. | [SCM HA Security] Add failover proxy to SCM Security Server Protocol | Resolved | Bharat Viswanadham |
| 117. | Make SCM ratis server spin up time during initialization configurable | Resolved | Jie Yao |
| 118. | Fix removing local SCM when submitting request to other SCM. | Resolved | Bharat Viswanadham |
| 119. | Fix and enable TestReconTasks | Resolved | Mukul Kumar Singh |
| 120. | Fix and enable TestEndpoints.java | Resolved | Mukul Kumar Singh |
| 121. | SCM Ratis enable/disable switch | Resolved | Shashikant Banerjee |
| 122. | Use PipelineManagerV2Impl in Recon and enable ignored Recon test cases. | Resolved | Glen Geng |
| 123. | Need a tool to upgrade current non-HA SCM node to single node HA cluster | Resolved | Shashikant Banerjee |
| 124. | [SCM HA Security] Create SCM Cert Client and change DefaultCA to allow self signed and intermediary | Resolved | Bharat Viswanadham |
| 125. | [SCM HA Security] Ozone services should be disabled in SCM HA enabled and security enabled cluster | Resolved | Bharat Viswanadham |
| 126. | Add SCM HA to Chaos tests | Resolved | Mukul Kumar Singh |
| 127. | Support inline upgrade from containerId, delTxnId, localId to SequenceIdGenerator. | Resolved | Glen Geng |
| 128. | [SCM HA Security] Integrate CertClient | Resolved | Bharat Viswanadham |
| 129. | refactor code in SCMStateMachine. | Resolved | Glen Geng |
| 130. | NullPointerException during SCM init | Resolved | Bharat Viswanadham |
| 131. | [SCM HA Security] When Ratis is enabled, SCM secure cluster is not working | Resolved | Bharat Viswanadham |
| 132. | Provide example k8s files to run full HA Ozone | Resolved | Marton Elek |
| 133. | Return with exit code 0 in case of optional scm bootstrap/init | Resolved | Marton Elek |
| 134. | [SCM HA Security] Implement listCAs and getRootCA API | Resolved | Bharat Viswanadham |
| 135. | [SCM HA Security] Make CertStore DB updates for StoreValidateCertificate go via Ratis | Resolved | Bharat Viswanadham |
| 136. | [SCM HA Security] Handle leader changes during bootstrap | Resolved | Bharat Viswanadham |
| 137. | Fix flaky test TestSCMInstallSnapshotWithHA#testInstallCorruptedCheckpointFailure | Resolved | Shashikant Banerjee |
| 138. | Adapt admincli tests for SCM HA | Resolved | Attila Doroszlai |
| 139. | Back-port HDDS-4911 (List container by container state) to ContainerManagerV2 | Resolved | Jie Yao |
| 140. | Solve IntelliJ warnings on DBTransactionBuffer. | Resolved | Xu Shao Hong |
| 141. | Remove SequenceIdGenerator#StateManagerImpl | Resolved | Jie Yao |
| 142. | [SCM HA Security] Make storeValidCertificate method idempotent | Resolved | Bharat Viswanadham |
| 143. | [SCM HA Security] Make changes required for ratis enabled with new model of RootCA/subCA | Resolved | Bharat Viswanadham |
| 144. | [Doc] Add SCM HA Setup Doc | Resolved | Marton Elek |
| 145. | localId is not consistent across SCMs when setting up a multi node SCM HA cluster. | Resolved | Glen Geng |
| 146. | SCM get roles command should provide Ratis Leader/Follower information. | Resolved | George Huang |
| 147. | SCM may not be able to know full port list of Datanode after Datanode is started. | Resolved | Glen Geng |
| 148. | Merge SCM HA configs to ScmConfigKeys | Resolved | Aswin Shakil |
| 149. | [SCM HA Security] Handle leader changes between SCMInfo and getSCMSigned Cert in OM | Resolved | Bharat Viswanadham |
| 150. | [SCM HA Security] Fix duration of sub-ca certs | Resolved | Bharat Viswanadham |
| 151. | [SCM HA Security] Make InterSCM grpc channel secure | Resolved | Bharat Viswanadham |
| 152. | [SCM HA Security] Remove code of not starting ozone services when Security is enabled on SCM HA cluster | Resolved | Bharat Viswanadham |
| 153. | [SCM HA Security] NPE during secure SCM initialization with HA code updated to an already existing cluster | Resolved | Bharat Viswanadham |
| 154. | Ensure failover to suggested leader if any for NotLeaderException | Resolved | Shashikant Banerjee |
| 155. | [SCM HA Security] Enable s3 test suite for ozone-secure-ha | Resolved | Bharat Viswanadham |
| 156. | make Decommission work under SCM HA. | Resolved | Glen Geng |
| 157. | Fix Install Snapshot Mechanism in SCMStateMachine | Resolved | Shashikant Banerjee |
| 158. | Divide snapshot related work into notifyInstallSnapshotFromLeader and reinitialize for SCMStateMachine. | Resolved | Glen Geng |
| 159. | If primordial SCM id is set, a non-HA cluster can not be initialized. | Resolved | Mukul Kumar Singh |
| 160. | Use scm#checkLeader before processing client requests | Resolved | Bharat Viswanadham |
| 161. | Fix scm roles command if one of the hosts is unresolvable | Resolved | Bharat Viswanadham |
| 162. | For AccessControlException do not perform failover | Resolved | Bharat Viswanadham |
| 163. | ozone freon randomkeys failed after leader SCM node is down | Resolved | Bharat Viswanadham |
| 164. | Change default grpc and ratis ports for scm ha | Resolved | Sadanand Shenoy |
| 165. | Make admin check work for SCM HA cluster | Resolved | Bharat Viswanadham |
| 166. | SCM subsequent init failed when previous scm init failed | Resolved | Bharat Viswanadham |
| 167. | SCM UI should have leader/follower and Primordial SCM information | Resolved | Sadanand Shenoy |
| 168. | Fix Suggested leader in Client | Resolved | Bharat Viswanadham |
| 169. | Wait forever to obtain CA list which is needed during OM/DN startup | Resolved | Bharat Viswanadham |
| 170. | SCM HA: Continuous PipelineNotFoundException seen in SCM log | Resolved | Lokesh Jain |
| 171. | Fix fall back of config in SCM HA Cluster | Resolved | Bharat Viswanadham |
| 172. | Handle unsecure cluster convert to secure cluster for SCM | Resolved | Bharat Viswanadham |
| 173. | Add reinitialize() for SequenceIdGenerator. | Resolved | Glen Geng |
| 174. | [SCM-HA] SCM start failed with PipelineNotFoundException | Resolved | Shashikant Banerjee |
| 175. | [SCM-HA] SCM start failed with PipelineNotFoundException | Resolved | Shashikant Banerjee |
| 176. | Use OM style config to construct RaftGroup and initialize Raft Servers | Resolved | Rui Wang |
| 177. | [SCM HA Security] generate certserialID in distributed sequence | Resolved | Ritesh Shukla |
| 178. | remove scm from SCM HA group | Resolved | Unassigned |
| 179. | Add ratis metric for scm | Resolved | Xu Shao Hong |
| 180. | For any IOException from @Replicated method we should throw it | Resolved | Jie Yao |
| 181. | use `Fileutils.move` instead of `Files.move` when installing snapshot | Resolved | Jie Yao |
| 182. | terminate om if statemachine is shut down by ratis | Resolved | Jie Yao |
| 183. | [Doc] Update OM HA Setup Doc | Resolved | Navin Kumar |
| 184. | Temporarily ignore failing Recon tests | Resolved | Nandakumar |
| 185. | Backport updates from ContainerManager(V1) | Resolved | Unassigned |

Hi Sammi, thanks for creating this Jira.
Could you share some documents about this idea?