Hadoop Common / HADOOP-3941

Extend FileSystem API to return file-checksums/file-digests

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added new FileSystem APIs: FileChecksum and FileSystem.getFileChecksum(Path).

      Description

      Suppose we have two files in two locations (possibly two different clusters) and these two files have the same size. How could we tell whether their contents are the same?

      Currently, the only way is to read both files and compare their contents. This is a very expensive operation if the files are huge.
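      To make the cost concrete, here is a minimal sketch of that baseline using the existing FileSystem API: a full, byte-by-byte read of both files. The class and method names are illustrative only; a real implementation would use buffered reads, but the I/O cost is the point.

      import java.io.IOException;
      import java.io.InputStream;

      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class FullReadCompare {
        /** The expensive baseline this issue wants to avoid: read both files
         *  end to end, so the cost is proportional to the total file size. */
        public static boolean sameContents(FileSystem srcFs, Path src,
                                           FileSystem dstFs, Path dst) throws IOException {
          try (InputStream a = srcFs.open(src);
               InputStream b = dstFs.open(dst)) {
            int x, y;
            do {
              x = a.read();               // unbuffered for brevity only
              y = b.read();
              if (x != y) {
                return false;             // differing byte, or one file ended early
              }
            } while (x != -1);            // both streams hit EOF together
            return true;
          }
        }
      }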

      So, we would like to extend the FileSystem API to support returning file-checksums/file-digests.

      Attachments

      1. 3941_20080904.patch
        10 kB
        Tsz Wo Nicholas Sze
      2. 3941_20080827.patch
        17 kB
        Tsz Wo Nicholas Sze
      3. 3941_20080826.patch
        14 kB
        Tsz Wo Nicholas Sze
      4. 3941_20080820.patch
        18 kB
        Tsz Wo Nicholas Sze
      5. 3941_20080819b.patch
        18 kB
        Tsz Wo Nicholas Sze
      6. 3941_20080819.patch
        15 kB
        Tsz Wo Nicholas Sze
      7. 3941_20080818.patch
        10 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Tsz Wo Nicholas Sze added a comment -

          How about we add the following optional method in the FileSystem API?

          //a new optional method in FileSystem.java
          public abstract FileChecksum getFileChecksum(String algorithm, Path p);
          

          where FileChecksum is a new interface in hadoop.fs package

          interface FileChecksum {
            String getAlgorithm();
          
            int getLength();
          
            byte[] getBytes();
          }
          
          Doug Cutting added a comment -

          How about making FileChecksum an abstract class, adding the method:

          public abstract equals(Object other);

          Doug Cutting added a comment -

          Sorry, that should have been something like:

          public boolean equals(Object other) {
            if (!(other instanceof FileChecksum))
              return false;
            FileChecksum that = (FileChecksum)other;
            return this.getAlgorithm().equals(that.getAlgorithm())
              && this.getBytes().equals(that.getBytes());
          }
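          One detail worth noting: a byte[] inherits equals() from Object, so the getBytes() comparison above checks object identity rather than content. Below is a minimal sketch of the abstract-class variant with a content-based equals() and a matching hashCode(); the committed patch may differ in detail.

          import java.util.Arrays;

          public abstract class FileChecksum {
            public abstract String getAlgorithm();  // later renamed getAlgorithmName() in the patch
            public abstract int getLength();
            public abstract byte[] getBytes();

            @Override
            public boolean equals(Object other) {
              if (!(other instanceof FileChecksum)) {
                return false;
              }
              FileChecksum that = (FileChecksum) other;
              // Arrays.equals compares the checksum bytes element by element;
              // byte[].equals would only compare references.
              return getAlgorithm().equals(that.getAlgorithm())
                  && Arrays.equals(getBytes(), that.getBytes());
            }

            @Override
            public int hashCode() {
              return getAlgorithm().hashCode() ^ Arrays.hashCode(getBytes());
            }
          }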
          
          Tsz Wo Nicholas Sze added a comment -

          3941_20080818.patch: API change preview

          • Added getFileChecksum(...) in FileSystem
          • Added abstract class FileChecksum
            • implemented equals(...) and hashCode()
            • renamed the method getAlgorithm() mentioned above to getAlgorithmName()

          I am going to implement MD5FileChecksum for LocalFileSystem.

          Tsz Wo Nicholas Sze added a comment -

          3941_20080819.patch: implemented MD5FileChecksum for RawLocalFileSystem. Need unit tests.

          Doug Cutting added a comment -

          Why not have the default implementation of getFileChecksum() throw the "unsupported operation" exception so that we don't have duplicated code in every subclass? Also, should this really throw an exception or return null? I would guess that most applications would want to handle this not as an exceptional condition somewhere higher on the stack, but rather explicitly where getFileChecksum() is called, so perhaps null would be better.

          Do you intend to implement this for HDFS here, or as a separate issue?

          Tsz Wo Nicholas Sze added a comment -

          > Why not have the default implementation of getFileChecksum() throw the "unsupported operation" exception so that we don't have duplicated code in every subclass? Also, should this really throw an exception or return null? I would guess that most applications would want to handle this not as an exceptional condition somewhere higher on the stack, but rather explicitly where getFileChecksum() is called, so perhaps null would be better.

          For other optional operations (e.g. append), we declare an abstract method in FileSystem and let the subclasses that do not support it throw "Not supported". Should we do the same for getFileChecksum()?

          I think throwing an IOException might be better than returning null. Otherwise, applications have to check for null, or they may get an NPE, which is a RuntimeException.

          The methods defined in java.security.MessageDigest, e.g. getInstance(String algorithm), throw NoSuchAlgorithmException. We might want to do something similar.

          > Do you intend to implement this for HDFS here, or as a separate issue?

          As a separate issue, since there is more work needed to implement it for HDFS.

          Tsz Wo Nicholas Sze added a comment -

          3941_20080819b.patch: added a unit test and fixed a Findbugs warning.

          Tsz Wo Nicholas Sze added a comment -

          Below is a summary of the default getFileChecksum() implementation options. We mentioned the first three before. I added the fourth.

          1. no implementation, declare it as abstract
          2. returning null
          3. throwing "Not supported" IOException
          4. if algorithm is MD5, return a MD5FileChecksum. Otherwise, do #2 or #3.

          However, MD5 in #4 may not be efficient for HDFS since it will read the entire file.
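          For illustration, option #2 amounts to a non-abstract default in FileSystem along the following lines. This is only a sketch: it uses the two-argument signature under discussion at this point in the thread (the committed API later drops the algorithm argument), and the actual default in the patch may differ.

          // Default in FileSystem.java: subclasses without checksum support need no
          // code at all.  Returning null means "no checksum available", so callers
          // handle the absence at the call site instead of catching an exception
          // higher up the stack.
          public FileChecksum getFileChecksum(String algorithm, Path f) throws IOException {
            return null;
          }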

          dhruba borthakur added a comment -

          This patch does not implement checksum on HDFS files, right?

          Do you plan to generate MD5s for HDFS files too? For HDFS, does it make sense to create a checksum from the blk*.meta files, because the size of the meta file will be much smaller than the size of the data file?

          Tsz Wo Nicholas Sze added a comment -

          > This patch does not implement checksum on HDFS files, right?

          You are correct. The patch only throws a "not supported" exception for HDFS.

          > Do you plan to generate MD5s for HDFS files too? For HDFS, does it make sense to create a checksum from the blk*.meta files because the size of the meta file will be much smaller than the size of the data file?

          No, the original MD5 algorithm may not be efficient for large files. I think we need a distributed file digest algorithm for HDFS. Yes, one way is to compute MD5 over the meta files, which would reduce the overhead dramatically. I will probably implement an MD5-over-CRC32 checksum for HDFS.
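          To illustrate the idea only (the actual HDFS algorithm is deferred to a separate issue, HADOOP-3981, created below, and this helper is not part of any patch here), a second-level digest can hash the per-block CRC data instead of the file bytes; for CRC32 over 512-byte chunks that input is roughly 1/128 of the file size.

          import java.security.MessageDigest;
          import java.security.NoSuchAlgorithmException;
          import java.util.List;

          public class Md5OverCrcSketch {
            /** Digest the concatenated per-block CRC bytes (e.g., the contents of the
             *  blk_*.meta files) rather than the file data itself. */
            public static byte[] md5OverBlockCrcs(List<byte[]> perBlockCrcs)
                throws NoSuchAlgorithmException {
              MessageDigest md5 = MessageDigest.getInstance("MD5");
              for (byte[] blockCrcs : perBlockCrcs) {
                md5.update(blockCrcs);
              }
              return md5.digest();
            }
          }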

          Tsz Wo Nicholas Sze added a comment -

          Created HADOOP-3981 - Need a distributed file checksum algorithm for HDFS

          Tsz Wo Nicholas Sze added a comment -

          3941_20080820.patch: fixed a bug.

          All the patches up to now implement option #1, which is our usual approach.

          Doug Cutting added a comment -

          > Below is a summary of the default getFileChecksum() implementation options [ ... ]

          The default should minimize code duplication, if possible. An abstract method should only be used for mandatory methods. Since this is an optional method, a default implementation should be provided.

          The choice of an exception or null depends on the expected use. An exception should be thrown for unusual situations that are best handled non-locally, somewhere above the call. The absence of a checksum should probably be handled at the site of the call, so returning null seems a better choice than an exception here. Another option might be to return a trivial checksum, e.g., the file's length.

          Perhaps we should include a use of this new feature in the patch, to better guide its implementation. Should we extend distcp to use this? Or do you have another canonical application in mind? If we add features without applications of them, we risk a design that does not meet any needs.

          Tsz Wo Nicholas Sze added a comment -

          > ... so returning null seems a better choice than an exception here. Another option might be to return a trivial checksum, e.g., the file's length.

          I think returning null makes sense. We cannot return a trivial checksum since an algorithm is specified in the call. We should only return the checksum generated by the specified algorithm.

          > Should we extend distcp to use this?

          Yes, the canonical application is distcp. I could also change distcp to use the new API.

          Tsz Wo Nicholas Sze added a comment -

          3941_20080826.patch:

          • The default implementation is provided in the FileSystem class.
          • It returns null when the algorithm is not found.
          • Added FileLengthChecksum which uses file lengths as checksums
          Tsz Wo Nicholas Sze added a comment -

          3941_20080827.patch: changed DistCp to use LengthFileChecksum

          Tsz Wo Nicholas Sze added a comment -

          Passed all tests locally; trying Hudson.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12389024/3941_20080827.patch
          against trunk revision 689733.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3134/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3134/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3134/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3134/console

          This message is automatically generated.

          Doug Cutting added a comment -

          I don't see the point in passing the checksum algorithm name to getFileChecksum(). Do we expect a FileSystem to actually checksum a file on demand? I assume not, that this feature is primarily for accessing pre-computed checksums, and that most filesystems will only support a single checksum algorithm.

          There are two primary cases to consider:
          1. Copying files between filesystems that have pre-computed checksums using the same algorithm.
          2. Copying files between filesystems which either do not have pre-computed checksums or use different algorithms.

          In (2) copies should use file lengths or perhaps fail, and in (1) we should use checksums. Right?

          In any case, hardwiring distcp to use FileLengthChecksum doesn't seem like an improvement.

          Tsz Wo Nicholas Sze added a comment -

          > Do we expect a FileSystem to actually checksum a file on demand? I assume not, that this feature is primarily for accessing pre-computed checksums, ...

          For HDFS, I am not sure whether sending all crcs to client is good enough since the size of all crcs is 1/128 of the file size, which is big for large files. We might want to reduce the network traffic (especially in the case of distcp) by computing a second level of checksums (e.g. compute a MD5 for all the crcs of a block). So, I think this feature is not only for accessing pre-computed checksums, but indeed a framework for supporting checksum algorithms.

          > In (2) copies should use flie lengths or perhaps fail, ...

          It should not fail. Otherwise, we cannot copy from the local FS to HDFS. We are currently using the file length as the checksum. It is simply too easy to get a false positive.

          > In any case, hardwiring distcp to use FileLengthChecksum doesn't seem like an improvement.

          It is only temporary. Once we have a distributed checksum implementation, we could change DistCp to use it. The distributed checksum implementation will be optimized for HDFS, so that copying from HDFS to HDFS will be very efficient (which is the main purpose of distcp). If necessary, we could provide an option in distcp for users to specify the checksum algorithm.

          Doug Cutting added a comment -

          Distcp should not hardwire any algorithm, but rather use the preferred algorithm of the filesystems involved. That way checksums will not be used just for HDFS->HDFS, but also for S3->S3, etc.

          Tsz Wo Nicholas Sze added a comment -

          > Distcp should not hardwire any algorithm

          That is true. We might need a method for getting the supported algorithms of a file system, with the algorithms sorted by preference. For example, if S3 supports {MD5, FileLength}, HDFS supports {HDFS-Checksum, FileLength} and LocalFS supports {MD5, HDFS-Checksum, FileLength}, then (see the sketch after this list):

          • S3 -> HDFS or HDFS -> S3 will use FileLength
          • S3 -> S3 will use MD5
          • S3 -> LocalFS will use MD5
          • LocalFS -> HDFS will use HDFS-Checksum
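          A sketch of such a negotiation is below. It assumes a hypothetical getSupportedChecksumAlgorithms() returning algorithm names in preference order; neither that method nor this helper is part of the patch, and the sketch merely reproduces the table above.

          import java.util.Arrays;
          import java.util.List;

          public class ChecksumNegotiation {
            /** Pick the source filesystem's most-preferred algorithm that the
             *  destination also supports, or null if they share none.  Both arrays
             *  are assumed to be sorted by preference, most preferred first. */
            public static String pickCommonAlgorithm(String[] srcAlgorithms, String[] dstAlgorithms) {
              List<String> dst = Arrays.asList(dstAlgorithms);
              for (String algorithm : srcAlgorithms) {
                if (dst.contains(algorithm)) {
                  return algorithm;
                }
              }
              return null;
            }

            public static void main(String[] args) {
              String[] s3 = {"MD5", "FileLength"};
              String[] hdfs = {"HDFS-Checksum", "FileLength"};
              // Matches the example above: S3 -> HDFS falls back to FileLength.
              System.out.println(pickCommonAlgorithm(s3, hdfs));
            }
          }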
          Doug Cutting added a comment -

          > We might need a method for getting the supported algorithms of a file system.

          If we remove the "algorithm" parameter to getFileChecksum() then each FileSystem would simply return checksums using its native algorithm. When these match, cross-filesystem copies would be checksummed. Later, if we have filesystems that implement multiple checksum algorithms, we might consider something more elaborate, but that seems sufficient for now, no?

          Tsz Wo Nicholas Sze added a comment -

          > If we remove the "algorithm" parameter to getFileChecksum() then each FileSystem would simply return checksums using its native algorithm. When these match, cross-filesystem copies would be checksummed. Later, if we have filesystems that implement multiple checksum algorithms, we might consider something more elaborate, but that seems sufficient for now, no?

          +1 Then, I will only add

          public FileChecksum getFileChecksum(Path f) throws IOException
          

          in this patch. If we need more checksum algorithms later, we should add these two methods:

          public FileChecksum getFileChecksum(String algorithm, Path f) throws IOException
          
          public String[] getSupportedChecksumAlgorithms()
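          For reference, a caller such as DistCp might use the single-argument API along the following lines: compare checksums only when both filesystems return one and the algorithm names match, and otherwise fall back to comparing lengths. The helper below is illustrative only and is not part of the patch.

          import java.io.IOException;

          import org.apache.hadoop.fs.FileChecksum;
          import org.apache.hadoop.fs.FileStatus;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class CrossFsCompare {
            /** Decide whether a copy can be skipped.  Checksums are used only when
             *  both filesystems return one and the algorithms match (e.g. HDFS to
             *  HDFS); otherwise fall back to the weaker file-length comparison. */
            public static boolean sameContents(FileSystem srcFs, Path src,
                                               FileSystem dstFs, Path dst) throws IOException {
              FileChecksum srcSum = srcFs.getFileChecksum(src);
              FileChecksum dstSum = dstFs.getFileChecksum(dst);
              if (srcSum != null && dstSum != null
                  && srcSum.getAlgorithmName().equals(dstSum.getAlgorithmName())) {
                return srcSum.equals(dstSum);
              }
              FileStatus srcStat = srcFs.getFileStatus(src);
              FileStatus dstStat = dstFs.getFileStatus(dst);
              return srcStat.getLen() == dstStat.getLen();
            }
          }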
          
          Tsz Wo Nicholas Sze added a comment -

          3941_20080904.patch: Changed FileSystem API to getFileChecksum(Path).

          Doug Cutting added a comment -

          +1 This looks good to me.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12389550/3941_20080904.patch
          against trunk revision 692492.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3190/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3190/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3190/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3190/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          I just committed this.

          Tsz Wo Nicholas Sze added a comment -

          I forgot to mention that the test failure has nothing to do with this issue. See HADOOP-4078.

          Hudson added a comment -

          Integrated in Hadoop-trunk #595 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/595/ )

            People

            • Assignee: Tsz Wo Nicholas Sze
            • Reporter: Tsz Wo Nicholas Sze
            • Votes: 0
            • Watchers: 3

              Dates

              • Created:
                Updated:
                Resolved:

                Development