|
[
Permlink
| « Hide
]
Konstantin Shvachko added a comment - 30/Jan/08 09:59 PM
Is this similar or a part of HADOOP-2002?
Yes, but this one is mainly for mkdir.
2423_20080303.patch: updated with current trunk
This patch actually does 2 things.
I agree both tasks are important, but if you just want to optimize mkDirs() as stated above then it could be done much easier. FSDirectory.mkDirs(src, ... ) {
byte[][] components = INode.getPathComponents(src);
INode[] inodes = new INode[components.length];
rootDir.getExistingPathINodes(components, inodes);
// create missing inodes
...............
}
After that you just find first null in the array of inodes, and start adding directories using addNode() and fill in the null elements one by one. Introducing the caching class is important but will take more effort, because we would probably want to to convert all String-path IMO optimizing mkDirs is important by itself, since it is extensively used by our clients. 2423_20080304.patch:
The ops per sec in NNThroughputBenchmark create increases from 82.6 to 107.0 in my machine. 2423_20080304b.patch: further improved ops per sec to 127.2
2423_20080304c.patch: fixed a bug.
2423_20080304d.patch: fixed another bug. The one passed all the tests.
This looks much better. There is room for more optimization.
1. This is indeed a very good point. The codes becomes much simpler.
2. Sure. 3. Since path should not have consecutive "/", split with limit -1 or 0 are the same. So, I think the split with less number of parameters is preferred. 4. Because of (1), the changes of addNode(String path, T newNode, boolean inheritPermission) are reverted. We could make the change from throwing FileNotFoundException to returning null later. 5. Since addChild(...) does not throw FileNotFoundException, nothing to catch now. +1
I am only worried about limit 0 in String.split(). There was a reason for -1, which I don't remember now, but could you please verify some corner cases, like creating a file/directory in the root, probably renaming, etc.. > I am only worried about limit 0 in String.split().
The only different for split(p, 0) and split(p, -1) is the root path, i.e. "/". However, root path is treated as a special case in the codes (for both before and after the patch.) In addition to the regression tests, I manually tested it. 2423_20080311.patch: fixed a bug. Since there are already a lot of tests for mkdir, no additional tests added.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12377629/2423_20080311.patch against trunk revision 619744. @author +1. The patch does not contain any @author tags. tests included -1. The patch doesn't appear to include any new or modified tests. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1950/testReport/ This message is automatically generated. I just committed this. Thanks Nicholas.
Integrated in Hadoop-trunk #428 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/428/
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||