[HDFS-11402] HDFS Snapshots should capture point-in-time copies of OPEN files - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.6.0
Fix Version/s: 2.9.0, 3.0.0-alpha4
Component/s: hdfs
Labels:
None

Target Version/s:

2.9.0, 3.0.0-alpha4
Hadoop Flags:

Incompatible change, Reviewed
Release Note:

When the config param "dfs.namenode.snapshot.capture.openfiles" is enabled, HDFS snapshots additionally capture point-in-time copies of open files that have valid leases. Even when the current versions of these open files grow or shrink in size, the snapshot always retains their immutable versions, just as it does for closed files. Note: The file length captured for an open file in the snapshot is the one recorded in the NameNode at the time of the snapshot, and it may be shorter than what the client has written up to that point. To capture the latest length, the client can call hflush/hsync with the flag SyncFlag.UPDATE_LENGTH on the open file handles.


Description

Problem:

1. When files are being written while HDFS Snapshots are taken in parallel, the Snapshots do capture these files, but the point-in-time file length is not captured for them. That is, open files are not frozen in HDFS Snapshots. Their snapshot copies grow/shrink in length, just like the original file, even after the snapshot time.

2. At the time of file close, or any other metadata modification operation on these files, HDFS reconciles the file length and records the modification in the last taken Snapshot. All previously taken Snapshots continue to hold those open files with no modification recorded, so they all end up using the final modification record in the last Snapshot. Thus, after the file close, the file length in all those Snapshots ends up the same.

Assume File1 is opened for write and a total of 1MB is written to it. While the writes are happening, snapshots are taken in parallel.

|---Time---T1-----------T2-------------T3----------------T4------>
|-----------------------Snap1----------Snap2-------------Snap3--->
|---File1.open---write---------write-----------close------------->

Then at time,
T2:
Snap1.File1.length = 0

T3:
Snap1.File1.length = 0
Snap2.File1.length = 0

T4:
Snap1.File1.length = 1MB
Snap2.File1.length = 1MB
Snap3.File1.length = 1MB
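
The timeline above can be reproduced with a small toy model. This is only a sketch of the behavior as described in this issue, not HDFS code: the class `LegacyOpenFileSnapshotDemo` and all of its members are hypothetical, and it assumes (as the example does) that the NameNode-recorded length stays 0 until the file is closed.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the pre-fix behavior described above; hypothetical, not HDFS code. */
final class LegacyOpenFileSnapshotDemo {
    static final long MB = 1024L * 1024L;

    // Length the toy NameNode has on record; assumed to stay 0 while File1 is open.
    private long recordedLength = 0L;
    private int latestSnapshotId = 0;
    // snapshot id -> length stored in that snapshot's modification record
    private final TreeMap<Integer, Long> diffs = new TreeMap<>();

    /** Takes a snapshot without recording anything for the open file. */
    int createSnapshot() {
        return ++latestSnapshotId;
    }

    /** Reconciles the length at close and records it in the last snapshot only. */
    void close(long finalLength) {
        recordedLength = finalLength;
        if (latestSnapshotId > 0) {
            diffs.putIfAbsent(latestSnapshotId, recordedLength);
        }
    }

    /** A snapshot with no record of its own uses the next later record, else the live file. */
    long lengthInSnapshot(int snapshotId) {
        Map.Entry<Integer, Long> next = diffs.ceilingEntry(snapshotId);
        return next != null ? next.getValue() : recordedLength;
    }

    public static void main(String[] args) {
        LegacyOpenFileSnapshotDemo file1 = new LegacyOpenFileSnapshotDemo();
        int snap1 = file1.createSnapshot();   // T2, File1 still open
        System.out.println("T2: Snap1=" + file1.lengthInSnapshot(snap1));
        int snap2 = file1.createSnapshot();   // T3, File1 still open
        System.out.println("T3: Snap1=" + file1.lengthInSnapshot(snap1)
                + " Snap2=" + file1.lengthInSnapshot(snap2));
        file1.close(MB);                      // close happens between T3 and T4
        int snap3 = file1.createSnapshot();   // T4
        System.out.println("T4: Snap1=" + file1.lengthInSnapshot(snap1)
                + " Snap2=" + file1.lengthInSnapshot(snap2)
                + " Snap3=" + file1.lengthInSnapshot(snap3));
    }
}
```

In this model Snap1 never gets a record of its own, so after the close it resolves to the record placed in Snap2 and reports the final length, which is the drift described in point 2.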

Proposal

1. At the time of taking Snapshot, SnapshotManager#createSnapshot can optionally request DirectorySnapshottableFeature#addSnapshot to freeze open files.

2. DirectorySnapshottableFeature#addSnapshot can consult the LeaseManager and get a list of INodesInPath for all open files under the snapshot dir.

3. After the Snapshot creation, Diff creation, and modification time update, DirectorySnapshottableFeature#addSnapshot can invoke INodeFile#recordModification for each of the open files. This way, the Snapshot just taken will have a FileDiff with the fileSize captured for each of the open files.

4. The above model follows the current Snapshot and Diff protocols and doesn't introduce any new disk formats. So, I don't think we will need any new FSImage Loader/Saver changes for Snapshots.

5. One of the design goals of HDFS Snapshots was the ability to take any number of snapshots in O(1) time. Though LeaseManager keeps all open files with leases in an in-memory map, an iteration is still needed to select the open files under the snapshot dir and then run recordModification on each of them. So, snapshot creation will not be strictly O(1) with the above proposal. But the increase is going to be marginal, as the new order will be O(open_files_under_snap_dir). To avoid changing the behavior and time complexity of HDFS Snapshots for open files by default, this improvement can be made under a new config "dfs.namenode.snapshot.freeze.openfiles", which can default to false.
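
The proposal steps can be sketched with a toy model. Everything here is an assumption made for illustration: `FreezeOpenFilesSketch`, `ToyFile`, and the `openFiles` map are hypothetical stand-ins for the NameNode structures named above (the map plays the role of the LeaseManager, and the per-snapshot record plays the role of a FileDiff), and the boolean argument stands in for the proposed config flag.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the proposal; class and method names are hypothetical, not HDFS APIs. */
final class FreezeOpenFilesSketch {
    static final long KB = 1024L;

    static final class ToyFile {
        final String path;
        long recordedLength = 0L;  // length known to the toy NameNode
        final TreeMap<Integer, Long> diffs = new TreeMap<>();  // snapshot id -> frozen length

        ToyFile(String path) { this.path = path; }

        /** Uses the record at or after the snapshot if present, else the live length. */
        long lengthInSnapshot(int snapshotId) {
            Map.Entry<Integer, Long> next = diffs.ceilingEntry(snapshotId);
            return next != null ? next.getValue() : recordedLength;
        }
    }

    // Stand-in for the LeaseManager's in-memory map of open files.
    private final Map<String, ToyFile> openFiles = new HashMap<>();
    private int latestSnapshotId = 0;

    ToyFile open(String path) {
        ToyFile f = new ToyFile(path);
        openFiles.put(path, f);
        return f;
    }

    void close(ToyFile f, long finalLength) {
        f.recordedLength = finalLength;
        openFiles.remove(f.path);
    }

    /** Creates a snapshot and, if requested, freezes the length of open files under the dir. */
    int createSnapshot(String snapshotDir, boolean freezeOpenFiles) {
        int id = ++latestSnapshotId;
        if (freezeOpenFiles) {
            String prefix = snapshotDir.endsWith("/") ? snapshotDir : snapshotDir + "/";
            // Walks every leased file to select those under the snapshot dir: no longer O(1).
            for (ToyFile f : openFiles.values()) {
                if (f.path.startsWith(prefix)) {
                    f.diffs.putIfAbsent(id, f.recordedLength);
                }
            }
        }
        return id;
    }

    public static void main(String[] args) {
        FreezeOpenFilesSketch ns = new FreezeOpenFilesSketch();
        ToyFile file1 = ns.open("/data/File1");
        int snap1 = ns.createSnapshot("/data", true);
        file1.recordedLength = 512 * KB;  // toy NameNode learns a newer length
        int snap2 = ns.createSnapshot("/data", true);
        ns.close(file1, 1024 * KB);
        int snap3 = ns.createSnapshot("/data", true);
        System.out.println("Snap1=" + file1.lengthInSnapshot(snap1)
                + " Snap2=" + file1.lengthInSnapshot(snap2)
                + " Snap3=" + file1.lengthInSnapshot(snap3));
    }
}
```

With the flag on, each snapshot keeps the length that was on record when it was taken. With the flag off, no record is written at snapshot time and every snapshot follows the live file, which matches the existing behavior.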

Attachments


HDFS-11402-branch-2.01.patch    16/Jun/17 21:27    67 kB    Manoj Govindassamy
HDFS-11402.08.patch             21/Apr/17 23:20    68 kB    Manoj Govindassamy
HDFS-11402.07.patch             21/Apr/17 21:13    68 kB    Manoj Govindassamy
HDFS-11402.06.patch             20/Apr/17 23:19    68 kB    Manoj Govindassamy
HDFS-11402.05.patch             20/Apr/17 00:22    67 kB    Manoj Govindassamy
HDFS-11402.04.patch             12/Apr/17 18:12    67 kB    Manoj Govindassamy
HDFS-11402.03.patch             08/Apr/17 01:58    62 kB    Manoj Govindassamy
HDFS-11402.02.patch             17/Feb/17 03:33    52 kB    Manoj Govindassamy
HDFS-11402.01.patch             16/Feb/17 04:26    44 kB    Manoj Govindassamy

Issue Links

causes

HDFS-14514 Actual read size of open file in encryption zone still larger than listing size even after enabling HDFS-11402 in Hadoop 2 (Resolved)

is depended upon by

HDFS-11220 SnapshotDiffReport should detect open files in HDFS Snapshots (Resolved)

is duplicated by

HDFS-10825 Snapshot read can reveal future bytes if snapshotted while writing (Resolved)

is required by

HDFS-12217 HDFS snapshots doesn't capture all open files when one of the open files is deleted (Resolved)

relates to

HDFS-12316 Verify HDFS snapshot deletion doesn't crash the ongoing file writes (Resolved)

HDFS-11218 Add option to skip open files during HDFS Snapshots (Resolved)

HDFS-11220 SnapshotDiffReport should detect open files in HDFS Snapshots (Resolved)
