
Key: HADOOP-2703
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Lohit Vijayarenu
Reporter: Koji Noguchi
Votes: 0
Watchers: 0
Hadoop Common

New files under lease (before close) still shows up as MISSING files/blocks in fsck

Created: 24/Jan/08 07:59 PM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: 0.15.3
Fix Version/s: 0.18.0

Time Tracking:
Not Specified

File Attachments (all text files licensed for inclusion in ASF works, uploaded by Lohit Vijayarenu):
  HADOOP-2703-1.patch   2008-04-11 05:06 PM   11 kB
  HADOOP-2703-2.patch   2008-04-11 10:50 PM   12 kB
  HADOOP-2703-3.patch   2008-04-15 08:31 PM   14 kB
  HADOOP-2703-4.patch   2008-04-15 09:07 PM   14 kB
  HADOOP-2703-5.patch   2008-04-16 08:30 AM   14 kB
Issue Links:
Reference
 

Hadoop Flags: Reviewed, Incompatible change
Release Note: Changed fsck to ignore files opened for writing. Introduced new option "-openforwrite" to explicitly show open files.
Resolution Date: 16/Apr/08 08:51 PM


Description
On 0.15.3, which includes the fix for HADOOP-2540 ("Empty blocks make fsck report corrupt, even when it isn't"), we are still seeing fsck report missing blocks when there aren't any. Basically, any unclosed small file under lease shows up as MISSING.

Can we have an option in fsck to skip those files under lease?
Or count those files/blocks separately?



Comments
Lohit Vijayarenu added a comment - 11/Apr/08 05:06 PM
The attached patch handles this case. It ignores open files by default but reports the total number of open files. It also provides an option, '-open', which prints the open files when used.

dhruba borthakur added a comment - 11/Apr/08 05:38 PM
1. The ClientProtocol version number has to be bumped because LocatedBlocks is used in a few RPCs. Please check whether NamenodeProtocol uses it too.

2. Maybe it is better to add a new field to DFSFileInfo to say whether a file is open for writing. The open/closed characteristic relates to a file rather than to a block. This can replace the new field that you added to LocatedBlocks.

3. The summary prints the total file count and the total open file count on two separate lines. It is unclear whether the total file count includes the open files. It might be better to put the two on the same line:
Total Files: 589 (Files that are currently being written to: 10)

4. The "open" parameter to fsck might leave people wondering whether these files are open for reading or for writing. It would be worthwhile to come up with a different name for this command-line parameter.
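
To illustrate point 2, here is a minimal sketch of what a per-file flag could look like. The class FileInfoSketch and its field openForWrite are hypothetical names made up for this example; this is not the real DFSFileInfo and not the code in any attached patch. It only shows that a caller can count open files from per-file metadata without looking at block lists.

```java
import java.util.List;

/** Hypothetical per-file record for illustration; NOT the real DFSFileInfo. */
final class FileInfoSketch {
    final String path;
    final long length;
    final boolean openForWrite; // assumed name for the per-file flag

    FileInfoSketch(String path, long length, boolean openForWrite) {
        this.path = path;
        this.length = length;
        this.openForWrite = openForWrite;
    }

    /** Counts files that are open for writing, using only per-file metadata. */
    static long countOpenForWrite(List<FileInfoSketch> files) {
        long open = 0;
        for (FileInfoSketch f : files) {
            if (f.openForWrite) {
                open++;
            }
        }
        return open;
    }

    public static void main(String[] args) {
        List<FileInfoSketch> files = List.of(
            new FileInfoSketch("/closed", 10L, false),
            new FileInfoSketch("/being-written", 0L, true));
        System.out.println("Files currently being written: " + countOpenForWrite(files));
    }
}
```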


Hadoop QA added a comment - 11/Apr/08 07:10 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379934/HADOOP-2703-1.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2210/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2210/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2210/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2210/console

This message is automatically generated.


Hadoop QA added a comment - 11/Apr/08 09:12 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379934/HADOOP-2703-1.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2211/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2211/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2211/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2211/console

This message is automatically generated.


Lohit Vijayarenu added a comment - 11/Apr/08 10:50 PM
Thanks, Dhruba. I have incorporated the changes you suggested. This patch renames -open to -openforwrite, changes the way the output is displayed, and bumps the ClientProtocol versionID.

Hadoop QA added a comment - 14/Apr/08 06:11 PM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379966/HADOOP-2703-2.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2225/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2225/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2225/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2225/console

This message is automatically generated.


Lohit Vijayarenu added a comment - 15/Apr/08 08:31 PM
This updated patch adds a few more details. When hadoop fsck is run without any options, it reports the total number of open files and also lists the total number of blocks owned by those open files, along with their size if it is greater than 0. When the -openforwrite option is used, all of these are added up and reported as total blocks.

Sample output when default fsck is invoked while a file is being written

[lohit@ hadoop-core-trunk]$ ./bin/hadoop fsck /
...Status: HEALTHY
 Total size:    420912 B
 Total dirs:    5
 Total files:   3 (Files currently being written: 1)
 Total blocks:  3 (avg. block size 140304 B) (Total open file blocks: 1)
 Minimally replicated blocks:   3 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    1
 Average block replication:     1.0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          1
 Number of racks:               1


The filesystem under path '/' is HEALTHY
[lohit@ hadoop-core-trunk]$


Lohit Vijayarenu added a comment - 15/Apr/08 09:07 PM
Attached a slightly modified patch with changes to the printed messages. Here is the summary.
  • hadoop fsck / skips validation of blocks owned by open files, but reports the number of open files, the number of blocks owned by open files (as not validated), and the total size of such files.
  • hadoop fsck / -openforwrite checks all files and validates all blocks, reporting missing blocks as is done now.
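
The two modes summarized above can be sketched roughly as follows. This is an illustrative model only: FileEntry, FsckSummarySketch, and the showOpenForWrite parameter are hypothetical names, not classes from the patch, and the sketch assumes that the validated totals exclude open files when the option is not given.

```java
import java.util.List;

/** Hypothetical, simplified view of one file as seen by fsck; not a Hadoop class. */
final class FileEntry {
    final String path;
    final long blockCount;
    final boolean openForWrite;

    FileEntry(String path, long blockCount, boolean openForWrite) {
        this.path = path;
        this.blockCount = blockCount;
        this.openForWrite = openForWrite;
    }
}

/** Hypothetical summary counters modeled on the fields shown in the sample output. */
final class FsckSummarySketch {
    long totalFiles;      // files whose blocks were validated
    long totalBlocks;     // validated blocks
    long openFiles;       // open files that were skipped
    long openFileBlocks;  // blocks of skipped open files (not validated)

    /** With showOpenForWrite false, open files are only counted, never validated. */
    static FsckSummarySketch check(List<FileEntry> files, boolean showOpenForWrite) {
        FsckSummarySketch s = new FsckSummarySketch();
        for (FileEntry f : files) {
            if (f.openForWrite && !showOpenForWrite) {
                s.openFiles++;
                s.openFileBlocks += f.blockCount;
                continue; // skip validation for this file
            }
            // Block validation would happen here; the sketch only counts.
            s.totalFiles++;
            s.totalBlocks += f.blockCount;
        }
        return s;
    }

    /** Single-line file count, in the spirit of the sample output. */
    String filesLine() {
        return String.format("Total files: %d (Files currently being written: %d)",
                totalFiles, openFiles);
    }

    public static void main(String[] args) {
        List<FileEntry> files = List.of(
            new FileEntry("/closed", 4, false),
            new FileEntry("/being-written", 1, true));
        System.out.println(check(files, false).filesLine());
        System.out.println(check(files, true).filesLine());
    }
}
```

In this model, passing true for showOpenForWrite folds the open file and its blocks into the totals, which mirrors the description that with -openforwrite everything is added up and reported as total blocks.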

Here is sample output of hadoop fsck /

[lohit@ hadoop-core-trunk]$ ./bin/hadoop fsck / 
....Status: HEALTHY
 Total size:    2148912 B
 Total dirs:    5
 Total files:   4 (Files currently being written: 1)
 Total blocks (validated):      4 (avg. block size 537228 B) (Total open file blocks (not validated): 1)
 Minimally replicated blocks:   4 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    1
 Average block replication:     1.0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          1
 Number of racks:               1


The filesystem under path '/' is HEALTHY
[lohit@ hadoop-core-trunk]$ 

Koji Noguchi added a comment - 15/Apr/08 09:20 PM
+1 on the output.

dhruba borthakur added a comment - 16/Apr/08 06:25 AM
+1 code looks good.

Hadoop QA added a comment - 16/Apr/08 06:44 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380220/HADOOP-2703-4.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests -1. The patch failed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2248/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2248/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2248/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2248/console

This message is automatically generated.


Lohit Vijayarenu added a comment - 16/Apr/08 08:30 AM
I had made an error in a printed message, which was caught by my own test case.

Hadoop QA added a comment - 16/Apr/08 03:45 PM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380257/HADOOP-2703-5.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2254/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2254/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2254/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2254/console

This message is automatically generated.


dhruba borthakur added a comment - 16/Apr/08 08:51 PM
I just committed this. Thanks Lohit!
