Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.14.1
    • Fix Version/s: 0.17.0
    • Component/s: fs
    • Labels:
      None
    • Release Note:
      Added a new option -ignoreCrc to fs -get, or equivalently, fs -copyToLocal, such that crc checksum will be ignored for the command. The use of this option is to download the corrupted files.

      Description

      Before 0.14, dfs -get didn't perform checksum checking.
      Users were able to download the corrupted files to see if they want to delete them.

      After 0.14, dfs -get also does the checksumming.

      Requesting a command for no-checksum-get command.

      1. 2063_20080219.patch
        17 kB
        Tsz Wo Nicholas Sze
      2. 2063_20080220.patch
        24 kB
        Tsz Wo Nicholas Sze
      3. 2063_20080221.patch
        27 kB
        Tsz Wo Nicholas Sze
      4. 2063_20080222.patch
        25 kB
        Tsz Wo Nicholas Sze

        Activity

        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #415 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/415/ )
        Hide
        dhruba borthakur added a comment -

        I just committed this. Thanks Nicholas.

        Show
        dhruba borthakur added a comment - I just committed this. Thanks Nicholas.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12376266/2063_20080222.patch
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 7 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12376266/2063_20080222.patch against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included +1. The patch appears to include 7 new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1837/console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        hudson problem? My last submission is still not tested for 3 days. Try again.

        Show
        Tsz Wo Nicholas Sze added a comment - hudson problem? My last submission is still not tested for 3 days. Try again.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        2063_20080222.patch:

        > though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers.
        Added javadoc

        > There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change?
        Reverted some code clean up and check the others. Also removed some log messages.

        Show
        Tsz Wo Nicholas Sze added a comment - 2063_20080222.patch: > though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers. Added javadoc > There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change? Reverted some code clean up and check the others. Also removed some log messages.
        Hide
        Raghu Angadi added a comment -

        +1. looks good.

        Minor suggestions:

        • though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers.
        • There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change?
        Show
        Raghu Angadi added a comment - +1. looks good. Minor suggestions: though getSrcFileSystem is private, it will be nice to have JavaDoc. FsShell.java is read by a lot of developers. There are a few code clean up changes. Please double check these changes. Do we need extra output in TestDFSShell or was it a debug change?
        Hide
        Tsz Wo Nicholas Sze added a comment -

        2063_20080221.patch

        1,2,5: done
        3: make verifyChecksum as a DistributedFileSystem property
        4: reverted

        Show
        Tsz Wo Nicholas Sze added a comment - 2063_20080221.patch 1,2,5: done 3: make verifyChecksum as a DistributedFileSystem property 4: reverted
        Hide
        Raghu Angadi added a comment -

        > I think it might not be useful to sent it to stdout
        hmm.. it is just file data right?

        > The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this.

        Though this is not available in other filesystems, it is used by the generic FsShell layer as an argument to generic command. In that sense the flag might belong in FileSystem. LocalFileSystem can certainly support it. I think it also makes the implementation simpler.

        Show
        Raghu Angadi added a comment - > I think it might not be useful to sent it to stdout hmm.. it is just file data right? > The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this. Though this is not available in other filesystems, it is used by the generic FsShell layer as an argument to generic command. In that sense the flag might belong in FileSystem. LocalFileSystem can certainly support it. I think it also makes the implementation simpler.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Thanks, Raghu!

        1. I think it might not be useful to sent it to stdout
        2. sure
        3. The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this.
        4. I was fixing the indentation (it was using tabs). It could be reverted if the changes are not good.
        5. I totally agree that ignoring checksum is used rare enough. Since the feature is not common, it might be even better to create a new command, instead of making it a option to "-get".
        Show
        Tsz Wo Nicholas Sze added a comment - Thanks, Raghu! I think it might not be useful to sent it to stdout sure The checksum feature is not available for some other FileSystem. So I am not sure whether we should make it as a FileSystem property. I will think about this. I was fixing the indentation (it was using tabs). It could be reverted if the changes are not good. I totally agree that ignoring checksum is used rare enough. Since the feature is not common, it might be even better to create a new command, instead of making it a option to "-get".
        Hide
        Raghu Angadi added a comment -

        Initial Comments:

        1. why isn't "-f" valid if output is stdout?
        2. I would rename doChecksum to something like verifyChecksum
        3. The current implementation creates a new static FileSystem implementation inside DFS, and thus we need a new static getSrcFileSystem(), which checks type at runtime.
          • Since this could apply for any filesystem, how about making this a property of FileSystem? So FsShell.get() would just set srcFS.setVerifyChecksum(false) in case of "-f" option?
        4. Not sure why we need small code reorg around decimalFormat?
        5. "-f" for option ok? Ignoring checksum is used rare enough that it could be just named "-ignoreCrc" or "-ignoreCrcErrors". But this is subjective.
        Show
        Raghu Angadi added a comment - Initial Comments: why isn't "-f" valid if output is stdout? I would rename doChecksum to something like verifyChecksum The current implementation creates a new static FileSystem implementation inside DFS, and thus we need a new static getSrcFileSystem() , which checks type at runtime. Since this could apply for any filesystem, how about making this a property of FileSystem? So FsShell.get() would just set srcFS.setVerifyChecksum(false) in case of "-f" option? Not sure why we need small code reorg around decimalFormat ? "-f" for option ok? Ignoring checksum is used rare enough that it could be just named "-ignoreCrc" or "-ignoreCrcErrors". But this is subjective.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        2063_20080220.patch: added a junit test.

        Show
        Tsz Wo Nicholas Sze added a comment - 2063_20080220.patch: added a junit test.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        How about adding an option "-f" to "fs -get" so that it forces getting the corrupted files?

        Show
        Tsz Wo Nicholas Sze added a comment - How about adding an option "-f" to "fs -get" so that it forces getting the corrupted files?
        Hide
        Raghu Angadi added a comment -

        There is a hack : you can delete (or better save) .meta file for a corrupted block and try to read the file. It should work.

        I don't think this is very large feature.

        Show
        Raghu Angadi added a comment - There is a hack : you can delete (or better save) .meta file for a corrupted block and try to read the file. It should work. I don't think this is very large feature.
        Hide
        Marco Nicosia added a comment -

        I'm disappointed this didn't go in for Hadoop 0.16, despite having been set as target for 0.16. I understand that this is a pretty big change, but I really want to be sure we get something for Hadoop 0.17.

        While we wait for this, any old Hadoop DFS' with corrupted files will need to sit, waiting for their owners to have a way to retrieve the files. For that time fsck will always return corrupt. The inability to do anything with these files (except delete them) could be masking us from being able to detect other hadoop issues.

        Show
        Marco Nicosia added a comment - I'm disappointed this didn't go in for Hadoop 0.16, despite having been set as target for 0.16. I understand that this is a pretty big change, but I really want to be sure we get something for Hadoop 0.17. While we wait for this, any old Hadoop DFS' with corrupted files will need to sit, waiting for their owners to have a way to retrieve the files. For that time fsck will always return corrupt. The inability to do anything with these files (except delete them) could be masking us from being able to detect other hadoop issues.
        Hide
        Nigel Daley added a comment -

        This is a new feature. Feature freeze for 0.16 has past. Assigning this to 0.17.

        Show
        Nigel Daley added a comment - This is a new feature. Feature freeze for 0.16 has past. Assigning this to 0.17.
        Hide
        Robert Chansler added a comment -

        Promoted for consideration in 16.

        Show
        Robert Chansler added a comment - Promoted for consideration in 16.

          People

          • Assignee:
            Tsz Wo Nicholas Sze
            Reporter:
            Koji Noguchi
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development