(hadoop-native-core/src/main/native/fs/fs.c) Pull this out into a separate function? Seems like an operation that will have to be done frequently.
This is a bit of a special case just for the connection URI. The issue is that people connect with strings like "localhost:8020", which isn't technically a well-formed URI, but which we more or less have to handle (by treating it as authority=localhost, port=8020). On the other hand, when someone gives you a path that looks like "myfile:123", you just want to parse it with the standard URI parsing code. We might need more massaging for files with colons in them later, but it's a bit of a grey area (see HDFS-13), so I'd like to avoid dealing with it for now. I'd like to keep this hack for the connection URI, but not for others.
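For anyone following along, here is a rough sketch of what the connection-URI special case amounts to. It is not the code in fs.c: the function name `parse_bare_host_port`, its signature, and the 1-65535 port range check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the actual fs.c code): split a bare "host:port"
 * connection string into its two parts.
 * Returns 0 on success, or -1 if the input does not look like bare
 * host:port, in which case the caller should fall back to the standard
 * URI parser. */
static int parse_bare_host_port(const char *str, char *host, size_t host_len,
                                int *port)
{
    const char *colon;
    char *end;
    long val;
    size_t len;

    assert(str && host && port);
    if (strstr(str, "://"))
        return -1;              /* has a scheme: use the normal URI parser */
    colon = strrchr(str, ':');
    if (!colon || colon == str)
        return -1;              /* no colon, or empty host part */
    if (colon[1] < '0' || colon[1] > '9')
        return -1;              /* port must start with a digit */
    val = strtol(colon + 1, &end, 10);
    if (*end != '\0' || val < 1 || val > 65535)
        return -1;              /* trailing junk or out-of-range port */
    len = (size_t)(colon - str);
    if (len >= host_len)
        return -1;              /* host does not fit in the caller's buffer */
    memcpy(host, str, len);
    host[len] = '\0';
    *port = (int)val;
    return 0;
}
```

Note that this sketch would also accept "myfile:123" as host=myfile, port=123, which is exactly why it should only ever be applied to the connection URI and never to paths.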
(hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c) Should precedence be given to the explicitly defined "port" member or the pre-existing port in the URI? It seems like an explicit definition in the builder should take precedence?
So there are three options:
1. fail with error message (current behavior in trunk)
2. hdfsBuilderSetNameNodePort wins if set
3. URI port wins over hdfsBuilderSetNameNodePort if set
#2 is hard to implement for jniFS. If you're given a URI such as hdfs://server:123/foo/bar, you'd have to replace 123 with whatever port you liked through string operations, prior to sending along the URI to the Java code.
I wish we had never added hdfsBuilderSetNameNodePort... it's definitely superfluous, since the port can be in the URI. Maybe we should just stick with option #1 for now and error out when there is a conflict.
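A minimal sketch of what option #1 could look like, to make the discussion concrete. The helper name `resolve_nn_port` and the convention that 0 means "unset" are assumptions, not the existing libhdfs code. This sketch also treats two identical ports as non-conflicting, which may be more lenient than what trunk does today.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: pick the effective NameNode port.
 * uri_port and builder_port are 0 when unset (an assumed convention).
 * Returns 0 and stores the port in *out, or EINVAL when both are set
 * and disagree.  *out is 0 if neither source supplied a port. */
static int resolve_nn_port(int uri_port, int builder_port, int *out)
{
    assert(out);
    if (uri_port && builder_port && uri_port != builder_port)
        return EINVAL;          /* conflict: refuse rather than guess */
    *out = uri_port ? uri_port : builder_port;
    return 0;
}
```

Erroring out keeps the jniFS side simple, since the URI can be handed to the Java code unmodified.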
(hadoop-native-core/src/main/native/ndfs/ndfs.c) Is this how the previous HDFS clients worked? Using the previous seen filename won't work if the file has been removed. Just curious...
Yes, this is how the Java code works. I don't think there's an issue with the previous filename getting removed, either. Doing a listStatus with a filename just means that you want filenames that sort after that filename, not that you necessarily think there is such a filename.
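To illustrate the "sorts after" semantics, here is a small sketch over an in-memory sorted array. It is not the ndfs.c or NameNode code; the function name `first_after` and the use of `strcmp` byte ordering are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given names[] sorted in strcmp order, return the
 * index of the first entry that sorts strictly after start_after, or n if
 * there is none.  start_after does not need to be present in names[]. */
static size_t first_after(const char *const *names, size_t n,
                          const char *start_after)
{
    size_t lo = 0, hi = n;

    assert(start_after);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(names[mid], start_after) <= 0)
            lo = mid + 1;       /* names[mid] is at or before the marker */
        else
            hi = mid;           /* names[mid] sorts after the marker */
    }
    return lo;
}
```

With entries "a", "c", "e", resuming after "c" continues at "e", and resuming after a removed name "b" continues at "c", so a deleted marker file does not break the iteration.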
(hadoop-native-core/src/main/native/jni/jnifs.c) This code segment appears to be exactly the same as hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c. Maybe a utility function would be useful?
The src/main/native/libhdfs directory is going away, to be replaced by the jnifs/ directory. I haven't done that yet, but it's just an svn delete, not a very interesting patch.