[HDFS-4529] Decide the semantic of concat with snapshots - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Snapshot (HDFS-2802)
Component/s: namenode
Labels: None
Hadoop Flags: Reviewed

Description

The main use case for concat is copying large files across clusters, in two steps.

Step 1: The blocks of a file in the source cluster are copied in parallel to transient files in the destination cluster.
Step 2: The transient files in the destination cluster are then concatenated, in block order, to reconstruct the original file.
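
The two steps above can be sketched with a toy in-memory model. This is not HDFS code: the destination cluster is a plain dict, and `split_into_blocks`, `copy_blocks_in_parallel`, `concat`, the block size, and the `/tmp/part-*` names are all hypothetical and chosen only for illustration. The real HDFS concat appends source files to an existing target; here the target is simply created, which is a simplification.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # toy block size in bytes; real HDFS blocks are far larger


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split the source file's bytes into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def copy_blocks_in_parallel(blocks, dest):
    """Step 1: copy each block to its own transient file in the destination."""
    names = ["/tmp/part-%05d" % i for i in range(len(blocks))]

    def copy_one(pair):
        name, block = pair
        dest[name] = bytes(block)

    with ThreadPoolExecutor(max_workers=4) as pool:
        # list() forces all copies to finish (and surfaces any exception).
        list(pool.map(copy_one, zip(names, blocks)))
    return names


def concat(dest, target, sources):
    """Step 2: join the transient files in order and remove them."""
    dest[target] = b"".join(dest.pop(s) for s in sources)


if __name__ == "__main__":
    destination = {}
    original = b"hello snapshot world"
    parts = copy_blocks_in_parallel(split_into_blocks(original), destination)
    concat(destination, "/data/file", parts)
    print(destination)
```

Between the two steps the destination holds only transient part files, which is the window in which a snapshot could capture them.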

If a snapshot is taken in the destination cluster before Step 2, some transient files may be captured in the snapshot. What should concat do in that case? The alternatives are:

(1) fail concat and keep the transient files in the snapshots;
(2) allow concat and keep the transient files in the snapshots;
(3) allow concat but remove the transient files from all snapshots.

None of these alternatives is perfect. Their drawbacks are as follows.

For (1) and (2), the transient files remain in the system until the snapshots are deleted. This wastes storage, since the files are known to be transient. Alternative (1) could push users to create transient files under a non-snapshottable tmp directory in the first place. However, that complicates user applications, and existing applications may need to be updated for the new policy. Moreover, a non-snapshottable directory may not exist at all, since an admin may set the system root directory to be snapshottable.

For (3), removing the files appears to break the read-only snapshot contract: files that appear in a snapshot may disappear later on.
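
To make the trade-offs concrete, here is a minimal sketch of the three alternatives as policies on a toy namespace. It is an illustration only and not how the NameNode implements snapshots: `Policy`, `ToyNamespace`, and all paths are hypothetical names, and a snapshot is modeled as a plain copy of the path-to-bytes mapping.

```python
from enum import Enum


class Policy(Enum):
    FAIL = 1                  # alternative (1)
    KEEP_IN_SNAPSHOT = 2      # alternative (2)
    REMOVE_FROM_SNAPSHOT = 3  # alternative (3)


class ToyNamespace:
    def __init__(self):
        self.files = {}      # live namespace: path -> bytes
        self.snapshots = {}  # snapshot name -> copy of the namespace

    def create(self, path, data):
        self.files[path] = data

    def snapshot(self, name):
        self.snapshots[name] = dict(self.files)

    def concat(self, target, sources, policy):
        # Sources that were captured by at least one snapshot.
        captured = [s for s in sources
                    if any(s in snap for snap in self.snapshots.values())]
        if captured and policy is Policy.FAIL:
            # (1): refuse; live namespace and snapshots stay untouched.
            raise RuntimeError("sources are in a snapshot: %s" % captured)
        self.files[target] = b"".join(self.files.pop(s) for s in sources)
        if policy is Policy.REMOVE_FROM_SNAPSHOT:
            # (3): mutate the snapshots, which breaks their read-only contract.
            for snap in self.snapshots.values():
                for s in captured:
                    snap.pop(s, None)
        # (2): nothing more to do; captured sources linger in the snapshots.
```

Under `KEEP_IN_SNAPSHOT` the captured part file stays in the snapshot until the snapshot is deleted; under `REMOVE_FROM_SNAPSHOT` the snapshot's contents change after it was taken, which is the contract violation described above.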

Attachments

h4529_20130416.patch (16/Apr/13 22:33, 8 kB, Tsz-wo Sze)
h4529_20130415.patch (16/Apr/13 02:24, 8 kB, Tsz-wo Sze)

Issue Links

is related to: HDFS-4523 Fix INodeFile replacement, TestQuota and javac error (Resolved)

relates to: HDFS-4704 Add a transient flag to file so that transient files won't be included in any snapshot (Open)

People

Assignee: Tsz-wo Sze

Reporter: Tsz-wo Sze

Votes: 0

Watchers: 5

Dates

Created: 26/Feb/13 01:36

Updated: 17/Apr/13 13:15

Resolved: 16/Apr/13 23:17