[HDFS-10797] Disk usage summary of snapshots causes renamed blocks to get counted twice - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.8.0
Fix Version/s: 2.8.0, 3.0.0-alpha2
Component/s: snapshots
Labels:
None

Hadoop Flags:

Reviewed
Release Note:

Disk usage summaries previously counted files twice if they had been renamed (including files moved to Trash) after being snapshotted. Summaries now include current data plus snapshotted data that is no longer under the directory, either because it was deleted or because it was moved outside of the directory.

Description

DirectoryWithSnapshotFeature.computeContentSummary4Snapshot computes the disk usage attributable to a snapshot by tallying up the files in the snapshot that have since been deleted (that way it won't overlap with regular files, whose disk usage is computed separately). However, the set of deleted files is determined from a diff that shows a moved (to Trash or otherwise) or renamed file as a deletion plus a creation, so a "deleted" entry may refer to blocks that are still counted under the file's current location. Only the deletion operation is taken into consideration, and this causes those blocks to be counted twice in the disk usage tally.
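The double counting can be illustrated with a small standalone sketch. This is not the HDFS code or the attached patch: `SnapshotDuSketch`, `FileEntry`, `naiveUsage` and `dedupedUsage` are hypothetical names, and files are reduced to an inode id plus a byte count. It assumes a renamed file keeps its inode id, so the entry on the diff's deleted list and the live file refer to the same blocks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SnapshotDuSketch {

    /** Hypothetical stand-in for a file: an inode id and its size in bytes. */
    public static final class FileEntry {
        final long inodeId;
        final long bytes;

        public FileEntry(long inodeId, long bytes) {
            this.inodeId = inodeId;
            this.bytes = bytes;
        }
    }

    /** Naive tally: live files plus every entry on the snapshot diff's deleted list. */
    public static long naiveUsage(List<FileEntry> live, List<FileEntry> deletedInDiff) {
        long total = 0;
        for (FileEntry f : live) {
            total += f.bytes;
        }
        for (FileEntry f : deletedInDiff) {
            total += f.bytes; // a renamed file is added again here
        }
        return total;
    }

    /** Deduplicated tally: skip deleted-list entries whose inode is still live under the directory. */
    public static long dedupedUsage(List<FileEntry> live, List<FileEntry> deletedInDiff) {
        Set<Long> liveIds = new HashSet<>();
        long total = 0;
        for (FileEntry f : live) {
            liveIds.add(f.inodeId);
            total += f.bytes;
        }
        for (FileEntry f : deletedInDiff) {
            if (!liveIds.contains(f.inodeId)) {
                total += f.bytes; // truly deleted, or moved outside the directory
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Inode 1 was renamed within the directory after the snapshot; inode 3 was really deleted.
        List<FileEntry> live = Arrays.asList(new FileEntry(1, 100), new FileEntry(2, 50));
        List<FileEntry> deletedInDiff = Arrays.asList(new FileEntry(1, 100), new FileEntry(3, 30));

        System.out.println(naiveUsage(live, deletedInDiff));   // 280: inode 1 counted twice
        System.out.println(dedupedUsage(live, deletedInDiff)); // 180
    }
}
```

In this sketch a file moved outside the directory is absent from `live`, so its snapshotted copy is still counted, which matches the behaviour described in the release note.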

Attachments

HDFS-10797.001.patch, 14/Sep/16 17:27, 6 kB, Sean Mackrory
HDFS-10797.002.patch, 15/Sep/16 18:57, 7 kB, Sean Mackrory
HDFS-10797.003.patch, 16/Sep/16 15:30, 7 kB, Sean Mackrory
HDFS-10797.004.patch, 29/Sep/16 15:10, 11 kB, Sean Mackrory
HDFS-10797.005.patch, 29/Sep/16 15:25, 21 kB, Sean Mackrory
HDFS-10797.006.patch, 29/Sep/16 19:19, 25 kB, Sean Mackrory
HDFS-10797.007.patch, 30/Sep/16 17:48, 24 kB, Sean Mackrory
HDFS-10797.008.patch, 30/Sep/16 20:21, 24 kB, Sean Mackrory
HDFS-10797.009.patch, 30/Sep/16 23:28, 25 kB, Sean Mackrory
HDFS-10797.010.patch, 05/Oct/16 17:26, 25 kB, Sean Mackrory
HDFS-10797.010.patch, 05/Oct/16 17:22, 25 kB, Sean Mackrory

Issue Links

breaks: HDFS-11661 GetContentSummary uses excessive amounts of memory (Resolved)

is broken by: HDFS-11515 -du throws ConcurrentModificationException (Resolved)

is related to: HDFS-11705 BUG: Inconsistent storagespace for directory (Open)

People

Assignee: Sean Mackrory

Reporter: Sean Mackrory

Votes: 0

Watchers: 12

Dates

Created: 25/Aug/16 16:45

Updated: 24/Apr/18 20:51

Resolved: 08/Oct/16 00:38