[HDDS-935] Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.4.0
Fix Version/s: 0.4.0
Component/s: Ozone Datanode
Labels: None
Target Version/s: 0.4.0

Description

Currently, HddsDispatcher creates a container when a writeChunk request arrives and the container does not already exist. If a disk holding a container is removed and the datanode restarts, a subsequent writeChunk request may create the same container again with an updated BCSID, because the datanode cannot tell that the disk was removed. SCM will not detect this either, since the container will have the latest BCSID. This Jira aims to address this issue.

The proposed fix is to persist all the containerIds present in the containerSet in the snapshot file whenever a Ratis snapshot is taken. If a disk is removed and the datanode restarts, the container set is rebuilt by scanning all available disks, while the container list stored in the snapshot file gives all the containers created on the datanode. The difference between the two gives the exact list of containers that were created but not detected after the restart. Any writeChunk request should then validate its container ID against this list of missing containers. We also need to ensure that container creation does not happen as part of applyTransaction of a writeChunk request in Ratis.
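
The set difference and the writeChunk check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `MissingContainerSketch` and the methods `findMissing` and `validateWriteChunk` are hypothetical names, not the classes or methods touched by the attached patches, and container IDs are assumed to be plain `long` values.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; these names are not taken from the Ozone code base. */
final class MissingContainerSketch {

  private MissingContainerSketch() {
  }

  /**
   * Returns the IDs recorded in the Ratis snapshot file that were not found
   * when the available disks were scanned after the restart.
   */
  static Set<Long> findMissing(Set<Long> snapshotIds, Set<Long> scannedIds) {
    Set<Long> missing = new TreeSet<>(snapshotIds);
    missing.removeAll(scannedIds);
    return Collections.unmodifiableSet(missing);
  }

  /**
   * Rejects a writeChunk for a container that was created earlier but is no
   * longer present on any disk, so that it is not silently recreated.
   */
  static void validateWriteChunk(long containerId, Set<Long> missing) {
    if (missing.contains(containerId)) {
      throw new IllegalStateException("Container " + containerId
          + " was created earlier but is missing after restart");
    }
  }
}
```

With snapshot IDs {1, 2, 3} and scanned IDs {1, 3}, `findMissing` yields {2}; a writeChunk for container 2 would then be rejected, while a writeChunk for a container ID that never appeared in the snapshot passes this check.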

Attachments

HDDS-935.000.patch    08/Jan/19 14:05    17 kB    Shashikant Banerjee
HDDS-935.001.patch    22/Jan/19 22:10    19 kB    Shashikant Banerjee
HDDS-935.002.patch    01/Feb/19 09:44    19 kB    Shashikant Banerjee
HDDS-935.003.patch    01/Feb/19 15:18    21 kB    Shashikant Banerjee
HDDS-935.004.patch    04/Feb/19 09:23    22 kB    Shashikant Banerjee
HDDS-935.005.patch    14/Feb/19 13:29    30 kB    Shashikant Banerjee
HDDS-935.006.patch    26/Feb/19 13:42    29 kB    Shashikant Banerjee
HDDS-935.007.patch    05/Mar/19 06:23    28 kB    Shashikant Banerjee
HDDS-935.008.patch    05/Mar/19 10:29    28 kB    Shashikant Banerjee

Issue Links

causes: HDDS-8140 Startup warning about adding containers to missing container set (Open)

People

Assignee: Shashikant Banerjee

Reporter: Rakesh Radhakrishnan

Votes: 0

Watchers: 5

Dates

Created: 18/Dec/18 11:21

Updated: 10/Mar/23 21:27

Resolved: 05/Mar/19 16:40