Issue Details (XML | Word | Printable)

Key: HADOOP-2063
Type: New Feature New Feature
Status: Closed Closed
Resolution: Fixed
Priority: Blocker Blocker
Assignee: Tsz Wo (Nicholas), SZE
Reporter: Koji Noguchi
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Command to pull corrupted files

Created: 16/Oct/07 06:40 PM   Updated: 21/May/08 08:05 PM
Return to search
Component/s: fs
Affects Version/s: 0.14.1
Fix Version/s: 0.17.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works 2063_20080219.patch 2008-02-20 12:52 AM Tsz Wo (Nicholas), SZE 17 kB
Text File Licensed for inclusion in ASF works 2063_20080220.patch 2008-02-20 08:36 PM Tsz Wo (Nicholas), SZE 24 kB
Text File Licensed for inclusion in ASF works 2063_20080221.patch 2008-02-22 02:21 AM Tsz Wo (Nicholas), SZE 27 kB
Text File Licensed for inclusion in ASF works 2063_20080222.patch 2008-02-22 09:18 PM Tsz Wo (Nicholas), SZE 25 kB

Release Note: Added a new option -ignoreCrc to fs -get, or equivalently, fs -copyToLocal, such that crc checksum will be ignored for the command. The use of this option is to download the corrupted files.
Resolution Date: 28/Feb/08 02:43 PM


 Description  « Hide
Before 0.14, dfs -get didn't perform checksum checking.
Users were able to download the corrupted files to see if they want to delete them.

After 0.14, dfs -get also does the checksumming.

Requesting a command for no-checksum-get command.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Robert Chansler added a comment - 22/Jan/08 07:00 PM
Promoted for consideration in 16.

Nigel Daley added a comment - 22/Jan/08 07:17 PM
This is a new feature. Feature freeze for 0.16 has past. Assigning this to 0.17.

Marco Nicosia added a comment - 28/Jan/08 05:28 PM
I'm disappointed this didn't go in for Hadoop 0.16, despite having been set as target for 0.16. I understand that this is a pretty big change, but I really want to be sure we get something for Hadoop 0.17.

While we wait for this, any old Hadoop DFS' with corrupted files will need to sit, waiting for their owners to have a way to retrieve the files. For that time fsck will always return corrupt. The inability to do anything with these files (except delete them) could be masking us from being able to detect other hadoop issues.


Raghu Angadi added a comment - 28/Jan/08 06:39 PM
There is a hack : you can delete (or better save) .meta file for a corrupted block and try to read the file. It should work.

I don't think this is very large feature.


Tsz Wo (Nicholas), SZE added a comment - 15/Feb/08 12:09 AM
How about adding an option "-f" to "fs -get" so that it forces getting the corrupted files?

Tsz Wo (Nicholas), SZE added a comment - 20/Feb/08 08:36 PM
2063_20080220.patch: added a junit test.

Raghu Angadi added a comment - 20/Feb/08 10:25 PM

Initial Comments:

  1. why isn't "-f" valid if output is stdout?
  2. I would rename doChecksum to something like verifyChecksum
  3. The current implementation creates a new static FileSystem implementation inside DFS, and thus we need a new static getSrcFileSystem(), which checks type at runtime.
    • Since this could apply for any filesystem, how about making this a property of FileSystem? So FsShell.get() would just set srcFS.setVerifyChecksum(false) in case of "-f" option?
  4. Not sure why we need small code reorg around decimalFormat?
  5. "-f" for option ok? Ignoring checksum is used rare enough that it could be just named "-ignoreCrc" or "-ignoreCrcErrors". But this is subjective.

Tsz Wo (Nicholas), SZE added a comment - 20/Feb/08 11:30 PM
Thanks, Raghu!
  1. I think it might not be useful to sent it to stdout
  2. sure
  3. The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this.
  4. I was fixing the indentation (it was using tabs). It could be reverted if the changes are not good.
  5. I totally agree that ignoring checksum is used rare enough. Since the feature is not common, it might be even better to create a new command, instead of making it a option to "-get".

Raghu Angadi added a comment - 20/Feb/08 11:38 PM
> I think it might not be useful to sent it to stdout
hmm.. it is just file data right?

> The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this.

Though this is not available in other filesystems, it is used by the generic FsShell layer as an argument to generic command. In that sense the flag might belong in FileSystem. LocalFileSystem can certainly support it. I think it also makes the implementation simpler.


Tsz Wo (Nicholas), SZE added a comment - 22/Feb/08 02:21 AM
2063_20080221.patch

1,2,5: done
3: make verifyChecksum as a DistributedFileSystem property
4: reverted


Raghu Angadi added a comment - 22/Feb/08 08:32 PM
+1. looks good.

Minor suggestions:

  • though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers.
  • There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change?

Tsz Wo (Nicholas), SZE added a comment - 22/Feb/08 09:18 PM
2063_20080222.patch:

> though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers.
Added javadoc

> There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change?
Reverted some code clean up and check the others. Also removed some log messages.


Tsz Wo (Nicholas), SZE added a comment - 25/Feb/08 06:55 PM
hudson problem? My last submission is still not tested for 3 days. Try again.

Hadoop QA added a comment - 27/Feb/08 04:58 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12376266/2063_20080222.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 7 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/console

This message is automatically generated.


dhruba borthakur added a comment - 28/Feb/08 02:43 PM
I just committed this. Thanks Nicholas.

Hudson added a comment - 29/Feb/08 12:52 PM