Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.2
    • Fix Version/s: 0.16.0
    • Component/s: fs
    • Labels:
      None

      Description

      To remove the cache of FileStatus in DFSPath (HADOOP-2565) without hurting performance, we must use file enumeration APIs that return FileStatus[] rather than Path[]. Currently we have FileSystem#globPaths(), but that method should be deprecated and replaced with a FileSystem#globStatus().

      We need to deprecate FileSystem#globPaths() in 0.16 in order to remove the cache in 0.17.

      1. globStatus9.patch
        20 kB
        Hairong Kuang
      2. globStatus8.patch
        20 kB
        Hairong Kuang
      3. globStatus7.patch
        20 kB
        Hairong Kuang
      4. globStatus6.patch
        20 kB
        Hairong Kuang
      5. globStatus5.patch
        19 kB
        Hairong Kuang
      6. globStatus4.patch
        20 kB
        Hairong Kuang
      7. globStatus3.patch
        17 kB
        Hairong Kuang
      8. globStatus2.patch
        16 kB
        Hairong Kuang
      9. globStatus10.patch
        20 kB
        Hairong Kuang
      10. globStatus1.patch
        14 kB
        Hairong Kuang
      11. globStatus.patch
        13 kB
        Hairong Kuang

        Issue Links

          Activity

          Hide
          Hairong Kuang added a comment -

          Did you mean that we need FileStatus[] listStatus rather than Path[] listPaths?

          Show
          Hairong Kuang added a comment - Did you mean that we need FileStatus[] listStatus rather than Path[] listPaths?
          Hide
          Doug Cutting added a comment -

          No, we need 'FileStatus[] globStatus(Path pattern)' instead of 'Path[] globPaths(Path pattern)'.

          Show
          Doug Cutting added a comment - No, we need 'FileStatus[] globStatus(Path pattern)' instead of 'Path[] globPaths(Path pattern)'.
          Hide
          Hairong Kuang added a comment -

          I do not see why we need globStatus. GlobPath is essentially pattern matching. If the provided path does not contain any pattern, the given path is returned without talking to the namenode.

          Show
          Hairong Kuang added a comment - I do not see why we need globStatus. GlobPath is essentially pattern matching. If the provided path does not contain any pattern, the given path is returned without talking to the namenode.
          Hide
          Raghu Angadi added a comment - - edited

          globStatus would certainly be useful since globPaths() is used in many places where we really want to do globStatus(). globStatus is much more efficient in those cases since we aften do '{{for(path : globPaths(pattern))

          { stat = listStatus(path) ... }

          }}'.

          I am not sure if globPaths() can go away. One difference I see is that globPath("/non/existent/path/withoutglob") returns simple path without any filesystem interaction (as expected). But globStatus("/non/existent/path/withoutglob") will ask filesystem and will return NULL (or array with zero entries).

          Show
          Raghu Angadi added a comment - - edited globStatus would certainly be useful since globPaths() is used in many places where we really want to do globStatus(). globStatus is much more efficient in those cases since we aften do '{{for(path : globPaths(pattern)) { stat = listStatus(path) ... } }}'. I am not sure if globPaths() can go away. One difference I see is that globPath("/non/existent/path/withoutglob") returns simple path without any filesystem interaction (as expected). But globStatus("/non/existent/path/withoutglob") will ask filesystem and will return NULL (or array with zero entries).
          Hide
          Raghu Angadi added a comment -

          Also, this would not duplicate code. globPaths() would just be implemented with globStatus() (when there is a glob in the path).

          Show
          Raghu Angadi added a comment - Also, this would not duplicate code. globPaths() would just be implemented with globStatus() (when there is a glob in the path).
          Hide
          Hairong Kuang added a comment -

          GlobPath is intended to return all pathes that matches the given glob. It is not intended to do "'for(path : globPaths(pattern))

          { stat = listStatus(path) ... }

          '. The feature that you want is listing all the pathes that matches the glob.

          Show
          Hairong Kuang added a comment - GlobPath is intended to return all pathes that matches the given glob. It is not intended to do "'for(path : globPaths(pattern)) { stat = listStatus(path) ... } '. The feature that you want is listing all the pathes that matches the glob.
          Hide
          Raghu Angadi added a comment -

          > "'for(path : globPaths(pattern))

          { stat = listStatus(path) ... }

          '.
          FsShell.setReplication() is an example of this pattern of use (essentially).

          I agree that globStatus() may not replace all uses of globPaths().

          Show
          Raghu Angadi added a comment - > "'for(path : globPaths(pattern)) { stat = listStatus(path) ... } '. FsShell.setReplication() is an example of this pattern of use (essentially). I agree that globStatus() may not replace all uses of globPaths().
          Hide
          Doug Cutting added a comment -

          Globbing is implemented on top of listPaths() which is implemented on top of listStatus(). The primitive globbing API should not throw away that status information. It should keep it so that glob clients which need it do not have to call getStatus() for each file that matches. Currently the cache of FileStatus hides the cost of these getStatus() calls, but that cache will break things once files and their status can change. So we need globStatus() before we can remove the cache.

          FileInputFormat, for example, uses globPaths() to list files matching the input specification, then it uses getStatus() on each matching path when building splits. This must change to call globStatus() before the cache is removed.

          Long-term, globPaths() and listPaths() may perhaps still be useful as a utility methods implemented in terms of of globStatus() and listStatus(), but since most current users of these will be broken performancewise once the cache is removed, we should deprecate them now to strongly encourage folks to stop using them before that cache is removed, to give fair warning.

          Show
          Doug Cutting added a comment - Globbing is implemented on top of listPaths() which is implemented on top of listStatus(). The primitive globbing API should not throw away that status information. It should keep it so that glob clients which need it do not have to call getStatus() for each file that matches. Currently the cache of FileStatus hides the cost of these getStatus() calls, but that cache will break things once files and their status can change. So we need globStatus() before we can remove the cache. FileInputFormat, for example, uses globPaths() to list files matching the input specification, then it uses getStatus() on each matching path when building splits. This must change to call globStatus() before the cache is removed. Long-term, globPaths() and listPaths() may perhaps still be useful as a utility methods implemented in terms of of globStatus() and listStatus(), but since most current users of these will be broken performancewise once the cache is removed, we should deprecate them now to strongly encourage folks to stop using them before that cache is removed, to give fair warning.
          Hide
          Hairong Kuang added a comment - - edited

          I am still not comfortable with this change:

          1. Some of shell commands like delete, copy, and rename use globPath but don't need FileStatus.
          2. GlobPath does not always call listPath for every directory. For example, globPath("/user/*/data") needs only to listPath("/user"). Returning FileStatus[] requires additional listPath calls on each user xx's home directory /user/xx and the root /. This is a lot of overhead.

          Show
          Hairong Kuang added a comment - - edited I am still not comfortable with this change: 1. Some of shell commands like delete, copy, and rename use globPath but don't need FileStatus. 2. GlobPath does not always call listPath for every directory. For example, globPath("/user/*/data") needs only to listPath("/user"). Returning FileStatus[] requires additional listPath calls on each user xx's home directory /user/xx and the root /. This is a lot of overhead.
          Hide
          Doug Cutting added a comment -

          > For example, globPath("/user/*/data") needs only to listPath("/user").

          But listPaths() is not a primitive, it is a utility method defined in terms of listStatus(). So this example is calling listStatus("/user") and then stripping the list of FileStatus objects down to a list of Path objects. We should remove that stripping, or at least make it optional. To make it optional, the primitive glob operation should be globStatus, and globPaths() should become a utility method defined in terms of globStatus().

          > Some of shell commands like delete, copy, and rename use globPath but don't need FileStatus.

          These actually all do need the FileStatus. They need to find out whether each file is a directory or not, to find out when to recurse. Copy also needs other attributes so that they can be set on the copy too. So we'll end up needing to rework these.

          We will not remove globPaths() in this release, so these commands do not need to change right now. But before we can remove the cache we need to examine every place that calls globPaths to check whether these must be converted to use globStatus. That's why we're deprecating globPaths(), to force folks to do this. Then, in 0.17, we can remove the cache from trunk, and start identifying all the problems. But we want users who upgrade to 0.17 to be forwarned, and to have an API that supports cache-free use before we remove the cache, so that they can upgrade to 0.17 more smoothly.

          Show
          Doug Cutting added a comment - > For example, globPath("/user/*/data") needs only to listPath("/user"). But listPaths() is not a primitive, it is a utility method defined in terms of listStatus(). So this example is calling listStatus("/user") and then stripping the list of FileStatus objects down to a list of Path objects. We should remove that stripping, or at least make it optional. To make it optional, the primitive glob operation should be globStatus, and globPaths() should become a utility method defined in terms of globStatus(). > Some of shell commands like delete, copy, and rename use globPath but don't need FileStatus. These actually all do need the FileStatus. They need to find out whether each file is a directory or not, to find out when to recurse. Copy also needs other attributes so that they can be set on the copy too. So we'll end up needing to rework these. We will not remove globPaths() in this release, so these commands do not need to change right now. But before we can remove the cache we need to examine every place that calls globPaths to check whether these must be converted to use globStatus. That's why we're deprecating globPaths(), to force folks to do this. Then, in 0.17, we can remove the cache from trunk, and start identifying all the problems. But we want users who upgrade to 0.17 to be forwarned, and to have an API that supports cache-free use before we remove the cache, so that they can upgrade to 0.17 more smoothly.
          Hide
          Raghu Angadi added a comment -

          Assuming globPaths() goes away.. Should a user of globStatus() be able to distinguish between a non-existent path and a glob that does not match any files? If yes, how? Many hadoop shell commands treat these two differently (I think that is consistent with normal shell behavior).

          Show
          Raghu Angadi added a comment - Assuming globPaths() goes away.. Should a user of globStatus() be able to distinguish between a non-existent path and a glob that does not match any files? If yes, how? Many hadoop shell commands treat these two differently (I think that is consistent with normal shell behavior).
          Hide
          Raghu Angadi added a comment -

          Also, until globPaths() is removed, its probably better if its behavior (or contract?) does not change.

          Show
          Raghu Angadi added a comment - Also, until globPaths() is removed, its probably better if its behavior (or contract?) does not change.
          Hide
          Doug Cutting added a comment -

          > Should a user of globStatus() be able to distinguish between a non-existent path and a glob that does not match any files?

          I'm not sure I completely understand the distinction. In one case are you passing a path without any meta characters but that does not exist, and in the other one with metacharacters but that matches no files?

          In any case it should probably handle this the same way globPaths() does. If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array.

          Show
          Doug Cutting added a comment - > Should a user of globStatus() be able to distinguish between a non-existent path and a glob that does not match any files? I'm not sure I completely understand the distinction. In one case are you passing a path without any meta characters but that does not exist, and in the other one with metacharacters but that matches no files? In any case it should probably handle this the same way globPaths() does. If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array.
          Hide
          Raghu Angadi added a comment -

          > I'm not sure I completely understand the distinction. In one case are you passing a path without any meta characters but that does not exist, and in the other one with metacharacters but that matches no files?

          yes. e.g. following two commands have different contents in stderr:

          • bin/hadoop fs -cat '/tmp/nonexistent*' /tmp/exists
          • bin/hadoop fs -cat '/tmp/nonexistent' /tmp/exists

          This is how the current behavior is.

          > If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array.
          Sounds good. globPaths() can use this keep the current behavior unchanged.

          Show
          Raghu Angadi added a comment - > I'm not sure I completely understand the distinction. In one case are you passing a path without any meta characters but that does not exist, and in the other one with metacharacters but that matches no files? yes. e.g. following two commands have different contents in stderr: bin/hadoop fs -cat '/tmp/nonexistent*' /tmp/exists bin/hadoop fs -cat '/tmp/nonexistent' /tmp/exists This is how the current behavior is. > If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array. Sounds good. globPaths() can use this keep the current behavior unchanged.
          Hide
          Hairong Kuang added a comment -

          My first attempt to this issue. Comments welcome!

          Show
          Hairong Kuang added a comment - My first attempt to this issue. Comments welcome!
          Hide
          Doug Cutting added a comment -

          A few comments:

          • should stat2paths be a public method on FileSystem? I'd prefer it were either private or perhaps on FileUtil.
          • globPaths() isn't deprecated. Do we think we'll keep this, or should it be deprecated? It is handy in some cases, but, on the other hand, we'd like to force folks to examine their uses of it, since in most cases performance will become abysmal once the FileStatus cache is removed, and we don't want to surprise folks with that. Thoughts?
          Show
          Doug Cutting added a comment - A few comments: should stat2paths be a public method on FileSystem? I'd prefer it were either private or perhaps on FileUtil. globPaths() isn't deprecated. Do we think we'll keep this, or should it be deprecated? It is handy in some cases, but, on the other hand, we'd like to force folks to examine their uses of it, since in most cases performance will become abysmal once the FileStatus cache is removed, and we don't want to surprise folks with that. Thoughts?
          Hide
          Hairong Kuang added a comment -

          Stat2Paths is public because I expect that users need to use it after we remove listPath etc. in release 0.17. OK, I will put it in FileUtil.

          I am thinking to keep globPath for now. After we make changes in FSShell, then we decide if we can live without it.

          Show
          Hairong Kuang added a comment - Stat2Paths is public because I expect that users need to use it after we remove listPath etc. in release 0.17. OK, I will put it in FileUtil. I am thinking to keep globPath for now. After we make changes in FSShell, then we decide if we can live without it.
          Hide
          Hairong Kuang added a comment -

          > If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array.
          Just saw this. Sounds good. I will make changes to globStatus reflecting this.

          Show
          Hairong Kuang added a comment - > If the distinction is important then perhaps the non-existing file case should return null, while the non-matching expression case should return an empty array. Just saw this. Sounds good. I will make changes to globStatus reflecting this.
          Hide
          Hairong Kuang added a comment -

          I deprecated globPath in this patch and I changed the semantics of globStatus, which returns null if the user supplied path does not exist and returns an empty array if the input path has a glob but no path matches it.

          Show
          Hairong Kuang added a comment - I deprecated globPath in this patch and I changed the semantics of globStatus, which returns null if the user supplied path does not exist and returns an empty array if the input path has a glob but no path matches it.
          Hide
          Raghu Angadi added a comment -

          Doesn't this patch essentially do ' arr; for (path : old_globPaths()) arr[i++] = getFileStatus(path); return arr; '. Is this what we wanted? I thought we wanted other way around.

          Also this looks like regressions of HADOOP-2151 since 'hasPattern()' is only checked for last component.

          Show
          Raghu Angadi added a comment - Doesn't this patch essentially do ' arr; for (path : old_globPaths()) arr [i++] = getFileStatus(path); return arr; '. Is this what we wanted? I thought we wanted other way around. Also this looks like regressions of HADOOP-2151 since 'hasPattern()' is only checked for last component.
          Hide
          Raghu Angadi added a comment -

          Is this a blocker for 16 feature freeze?

          Show
          Raghu Angadi added a comment - Is this a blocker for 16 feature freeze?
          Hide
          Raghu Angadi added a comment -

          > Also this looks like regressions of HADOOP-2151 since 'hasPattern()' is only checked for last component.
          e.g:
          with the patch:

          $ bin/hadoop fs -D fs.default.name=local -ls '/tmp/x*/xxx'
          ls: Could not get listing for file:/tmp/x/xxx
          $ 

          trunk :

          bin/hadoop fs -D fs.default.name=local -ls '/tmp/x*/xxx'
          $ 

          Here, /tmp/x exists but not '/tmp/x/xxx'. It does not matter whether /tmp/x is a file or a directory.

          Show
          Raghu Angadi added a comment - > Also this looks like regressions of HADOOP-2151 since 'hasPattern()' is only checked for last component. e.g: with the patch: $ bin/hadoop fs -D fs.default.name=local -ls '/tmp/x*/xxx' ls: Could not get listing for file:/tmp/x/xxx $ trunk : bin/hadoop fs -D fs.default.name=local -ls '/tmp/x*/xxx' $ Here, /tmp/x exists but not ' /tmp/x/xxx '. It does not matter whether /tmp/x is a file or a directory.
          Hide
          Doug Cutting added a comment -

          > Is this what we wanted? I thought we wanted other way around.

          I don't think it does that in all cases, but it does still appear to call getStatus() in places. I've not yet examined the logic to see if that's easily avoidable or not. But it's not a fatal problem at this point. For this release the important thing is to have globStatus() as the preferred, non-deprecated method. Once we remove the status cache, during 0.17 development, we'll soon find out whether the globStatus() implementation needs more work to perform well without a cache, and fix that before 0.17 is released. But that aspect shouldn't block this for 0.16, since we still have the cache in 0.16.

          Show
          Doug Cutting added a comment - > Is this what we wanted? I thought we wanted other way around. I don't think it does that in all cases, but it does still appear to call getStatus() in places. I've not yet examined the logic to see if that's easily avoidable or not. But it's not a fatal problem at this point. For this release the important thing is to have globStatus() as the preferred, non-deprecated method. Once we remove the status cache, during 0.17 development, we'll soon find out whether the globStatus() implementation needs more work to perform well without a cache, and fix that before 0.17 is released. But that aspect shouldn't block this for 0.16, since we still have the cache in 0.16.
          Hide
          Hairong Kuang added a comment - - edited

          > Doesn't this patch essentially do ' arr; for (path : old_globPaths()) arr[i++] = getFileStatus(path); return arr; '. Is this what we wanted? I thought we wanted other way around.
          No, this patch does not do what you described. Basically it only listStatus on the parent directories when there is a glob in the component and it calls getFileStatus on the last component if there is no glob in there.

          For example 1, globStatus("/user/hairong/file*") only does listStatus("/user/hairong") and returns status for the matched files/subdirectories; Previously globPath("/user/hairong/file*") would listStatus("/user/hairong") then discard all statuses of the matched files/subdirectories then return only paths. The caller has to call getFileStatus again for each returned path.

          For example 2, globStatus("/user/*/file") calls listStaus("/user") and then calls getFileStatus on all the paths matched /user/*/file. It does as what you described.

          Show
          Hairong Kuang added a comment - - edited > Doesn't this patch essentially do ' arr; for (path : old_globPaths()) arr [i++] = getFileStatus(path); return arr; '. Is this what we wanted? I thought we wanted other way around. No, this patch does not do what you described. Basically it only listStatus on the parent directories when there is a glob in the component and it calls getFileStatus on the last component if there is no glob in there. For example 1, globStatus("/user/hairong/file*") only does listStatus("/user/hairong") and returns status for the matched files/subdirectories; Previously globPath("/user/hairong/file*") would listStatus("/user/hairong") then discard all statuses of the matched files/subdirectories then return only paths. The caller has to call getFileStatus again for each returned path. For example 2, globStatus("/user/*/file") calls listStaus("/user") and then calls getFileStatus on all the paths matched /user/*/file. It does as what you described.
          Hide
          Raghu Angadi added a comment - - edited

          You are right, it does what I described only in example 2 above and not in the first example. When I get some more time I will think about how to avoid it in second example as well. Something like replacing globPathLevel() with with globStatusLevel(). btw, glob path in second example is /usrer/*/file.

          Show
          Raghu Angadi added a comment - - edited You are right, it does what I described only in example 2 above and not in the first example. When I get some more time I will think about how to avoid it in second example as well. Something like replacing globPathLevel() with with globStatusLevel(). btw, glob path in second example is /usrer/*/file .
          Hide
          Hairong Kuang added a comment -

          Isn't the behavior in example 2 what we expect? I must have misunderstood something. What do you want to avoid?

          Show
          Hairong Kuang added a comment - Isn't the behavior in example 2 what we expect? I must have misunderstood something. What do you want to avoid?
          Hide
          Raghu Angadi added a comment -

          > What do you want to avoid?
          exactly, after talking to you it looks more like there isn't anything we can avoid. My initial thought was we could avoid the final getStatus() loop. Thanks.

          Show
          Raghu Angadi added a comment - > What do you want to avoid? exactly, after talking to you it looks more like there isn't anything we can avoid. My initial thought was we could avoid the final getStatus() loop. Thanks.
          Hide
          Hairong Kuang added a comment - - edited

          Thanks Raghu! You comment relieved my mind. Don't want to have a wrong algorithm right before the feature freeze.

          Regarding the regression, yes, I removed what HADOOP-2151 did because I think it is not efficient to call exist for each component when there is a glob on the path. My algorithm depends on getFileStatus to throw an exception that indicates an non-existent path. It works on dfs. But LocalFileSystem.getFileStatus returns a valid FileStatus object on a non-existent path. I will fix this. I'd like to change the semantics of getFileStaus to return null on a non-existent path. Thanks for helping me test this feature.

          Show
          Hairong Kuang added a comment - - edited Thanks Raghu! You comment relieved my mind. Don't want to have a wrong algorithm right before the feature freeze. Regarding the regression, yes, I removed what HADOOP-2151 did because I think it is not efficient to call exist for each component when there is a glob on the path. My algorithm depends on getFileStatus to throw an exception that indicates an non-existent path. It works on dfs. But LocalFileSystem.getFileStatus returns a valid FileStatus object on a non-existent path. I will fix this. I'd like to change the semantics of getFileStaus to return null on a non-existent path. Thanks for helping me test this feature.
          Hide
          Raghu Angadi added a comment -

          hmm... by luck, there is one small side benefit from my misread of the code . I checked with localfs only because it did not need a running dfs.

          Show
          Raghu Angadi added a comment - hmm... by luck, there is one small side benefit from my misread of the code . I checked with localfs only because it did not need a running dfs.
          Hide
          Hairong Kuang added a comment -

          Raghu, whether it is by luck or not, it is always good to have somebody to take extra effort to make sure that your code works correctly.

          The patch fixed the non-existent path problem. It introduces an incompatible change that made getFileStatus to return null for a non-existent path. In the current trunk, dfs throws a RemoteException with a "file does not exist" message and local file system returns a valid FileStatus object.

          Show
          Hairong Kuang added a comment - Raghu, whether it is by luck or not, it is always good to have somebody to take extra effort to make sure that your code works correctly. The patch fixed the non-existent path problem. It introduces an incompatible change that made getFileStatus to return null for a non-existent path. In the current trunk, dfs throws a RemoteException with a "file does not exist" message and local file system returns a valid FileStatus object.
          Hide
          Hairong Kuang added a comment -

          On the second thought, I saw a lot of calls like getFileStatus().isDir in our code base. Making getFileStatus to return null would break those calls. How about letting getFileStatus throws a FileNotFoundException when the path does not exist?

          Show
          Hairong Kuang added a comment - On the second thought, I saw a lot of calls like getFileStatus().isDir in our code base. Making getFileStatus to return null would break those calls. How about letting getFileStatus throws a FileNotFoundException when the path does not exist?
          Hide
          Doug Cutting added a comment -

          Do listStatus(Path[]) and globStatus(Path[]) need to be public? Does anyone use these but the globbing code? I generally prefer not to make something public without a strong need. Other than that, this looks good to me.

          Show
          Doug Cutting added a comment - Do listStatus(Path[]) and globStatus(Path[]) need to be public? Does anyone use these but the globbing code? I generally prefer not to make something public without a strong need. Other than that, this looks good to me.
          Hide
          Hairong Kuang added a comment -

          This patch removed the method listStatus(Path[]) and made listStatus(Path[], PathFilter) and getFileStatus(Path[]) private in FileSystem.

          Show
          Hairong Kuang added a comment - This patch removed the method listStatus(Path[]) and made listStatus(Path[], PathFilter) and getFileStatus(Path[]) private in FileSystem.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373456/globStatus5.patch
          against trunk revision r613115.

          @author +1. The patch does not contain any @author tags.

          javadoc -1. The javadoc tool appears to have generated messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs -1. The patch appears to introduce 1 new Findbugs warnings.

          core tests -1. The patch failed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373456/globStatus5.patch against trunk revision r613115. @author +1. The patch does not contain any @author tags. javadoc -1. The javadoc tool appears to have generated messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 1 new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1638/console This message is automatically generated.
          Hide
          Hairong Kuang added a comment -

          Fixed the javadoc and findbugs warnings. It also included a minor optimization for globStatus.

          Show
          Hairong Kuang added a comment - Fixed the javadoc and findbugs warnings. It also included a minor optimization for globStatus.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373547/globStatus6.patch
          against trunk revision r613359.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs -1. The patch appears to introduce 3 new Findbugs warnings.

          core tests -1. The patch failed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373547/globStatus6.patch against trunk revision r613359. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs -1. The patch appears to introduce 3 new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1651/console This message is automatically generated.
          Hide
          Hairong Kuang added a comment -

          The patch fixed findbugs errors.

          Show
          Hairong Kuang added a comment - The patch fixed findbugs errors.
          Hide
          Hairong Kuang added a comment -

          Merged with the current trunk.

          Show
          Hairong Kuang added a comment - Merged with the current trunk.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12373956/globStatus8.patch
          against trunk revision 614721.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac -1. The applied patch generated 607 javac compiler warnings (more than the trunk's current 598 warnings).

          findbugs -1. The patch appears to introduce 3 new Findbugs warnings.

          core tests -1. The patch failed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373956/globStatus8.patch against trunk revision 614721. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac -1. The applied patch generated 607 javac compiler warnings (more than the trunk's current 598 warnings). findbugs -1. The patch appears to introduce 3 new Findbugs warnings. core tests -1. The patch failed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1668/console This message is automatically generated.
          Hide
          Hairong Kuang added a comment -

          fixed findBugs errors. The patch deprecated listPath and globPath, so it increases the number of javac warnings.

          Show
          Hairong Kuang added a comment - fixed findBugs errors. The patch deprecated listPath and globPath, so it increases the number of javac warnings.
          Hide
          Nigel Daley added a comment -

          +1. fix looks good. I also ran these tests against http://issues.apache.org/jira/secure/attachment/12374010/globStatus9.patch

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac -1. The applied patch generated 607 javac compiler warnings (more than the trunk's current 585 warnings).

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          The additional javac warnings are expected since this patch deprecates a couple of methods.

          Hairong, once you run "ant test" on your own machine with no failures, please commit this.

          Show
          Nigel Daley added a comment - +1. fix looks good. I also ran these tests against http://issues.apache.org/jira/secure/attachment/12374010/globStatus9.patch @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac -1. The applied patch generated 607 javac compiler warnings (more than the trunk's current 585 warnings). findbugs +1. The patch does not introduce any new Findbugs warnings. The additional javac warnings are expected since this patch deprecates a couple of methods. Hairong, once you run "ant test" on your own machine with no failures, please commit this.
          Hide
          Hairong Kuang added a comment -

          Merged with the trunk.

          Show
          Hairong Kuang added a comment - Merged with the trunk.
          Hide
          Hairong Kuang added a comment -

          I committed this.

          Show
          Hairong Kuang added a comment - I committed this.
          Hide
          Hudson added a comment -
          Show
          Hudson added a comment - Integrated in Hadoop-trunk #380 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/380/ )

            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Doug Cutting
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development