Key: HADOOP-3074
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Christophe Taton
Reporter: Christophe Taton
Votes: 0
Watchers: 1
Hadoop Common

URLStreamHandler for the DFS

Created: 22/Mar/08 10:04 AM   Updated: 22/Aug/08 07:50 PM
Component/s: util
Affects Version/s: None
Fix Version/s: 0.18.0

Time Tracking:
Not Specified

File Attachments (all text files, licensed for inclusion in ASF works):
3074-20080324a.patch  2008-03-24 11:05 AM  Christophe Taton  7 kB
3074-20080324b.patch  2008-03-24 09:15 PM  Christophe Taton  9 kB
3074-20080325a.patch  2008-03-25 03:28 PM  Christophe Taton  11 kB
3074-20080406a.patch  2008-04-06 06:46 PM  Christophe Taton  11 kB
3074-20080412a.patch  2008-04-12 06:30 AM  Christophe Taton  11 kB
3074-20080412b.patch  2008-04-12 09:11 AM  Christophe Taton  12 kB
3074-20080414a.patch  2008-04-14 06:46 PM  Christophe Taton  12 kB

Resolution Date: 15/Apr/08 06:26 AM


Description
This issue aims at providing a handler to resolve DFS URLs ("hdfs://host:port/file/to/path"), so that such URLs can be read using the URL API (mainly InputStream url.openStream()).

This allows the use of a URLClassLoader which would serve classes directly from the DFS.



Christophe Taton added a comment - 22/Mar/08 10:10 AM
Here is a first proposal. The handler has been put in package org.apache.hadoop.util.protocols.hdfs. Thus you can register it in the JVM as in: java -Djava.protocol.handler.pkgs=org.apache.hadoop.util.protocols <MainClass>.
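
For readers unfamiliar with the java.protocol.handler.pkgs mechanism mentioned above: for each package prefix in that property (prefixes are separated by '|'), the JVM tries to load a class named <prefix>.<protocol>.Handler. The sketch below only computes those candidate names; HandlerLookupSketch and its methods are hypothetical helpers written for illustration, and the assumption that the patch's class is literally named Handler is inferred from the mechanism, not stated in this issue.

```java
import java.util.ArrayList;
import java.util.List;

class HandlerLookupSketch {
    /** Class name the JVM tries for one package prefix and one protocol. */
    static String candidate(String packagePrefix, String protocol) {
        return packagePrefix + "." + protocol + ".Handler";
    }

    /** Expands a '|'-separated java.protocol.handler.pkgs value, in order. */
    static List<String> candidates(String pkgsProperty, String protocol) {
        List<String> out = new ArrayList<>();
        for (String prefix : pkgsProperty.split("\\|")) {
            String trimmed = prefix.trim();
            if (!trimmed.isEmpty()) {
                out.add(candidate(trimmed, protocol));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [org.apache.hadoop.util.protocols.hdfs.Handler]
        System.out.println(candidates("org.apache.hadoop.util.protocols", "hdfs"));
    }
}
```

This is why the handler lives in a sub-package named after the protocol (hdfs) while the property names only the parent package.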

Hadoop QA added a comment - 24/Mar/08 10:52 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378434/3074-20080322a.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit -1. The applied patch generated 197 release audit warnings (more than the trunk's current 195 warnings).

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2034/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2034/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2034/artifact/trunk/build/test/checkstyle-errors.html
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2034/artifact/trunk/current/releaseAuditDiffWarnings.txt
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2034/console

This message is automatically generated.


Christophe Taton added a comment - 24/Mar/08 11:05 AM
Fixing missing license headers...

Hadoop QA added a comment - 24/Mar/08 01:38 PM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378477/3074-20080324a.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2035/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2035/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2035/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2035/console

This message is automatically generated.


Doug Cutting added a comment - 24/Mar/08 04:39 PM
This need not be specific to HDFS, but could be generic to any FileSystem implementation, no? The FileSystem API subsumes Java's URL connection API. We could even provide a generic URLStreamHandlerFactory.

http://java.sun.com/javase/6/docs/api/java/net/URL.html#setURLStreamHandlerFactory(java.net.URLStreamHandlerFactory)

Note that we might special-case the "file:" protocol, since the JVM already includes a handler for that.
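
A minimal sketch of the factory idea, using only java.net and no Hadoop classes, might look as follows. It is not the patch's implementation: STORE is a hypothetical in-memory stand-in for a FileSystem, and StoreHandler, Factory and read are illustrative names. The factory returns null for any scheme it does not serve (including "file"), in which case the JVM falls back to its built-in handlers. The port 9000 in the example URL is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GenericFactorySketch {
    // Hypothetical stand-in for a FileSystem: maps a URL path to file contents.
    static final Map<String, byte[]> STORE = new ConcurrentHashMap<>();

    // Serves a URL from STORE, looking the content up by the URL's path.
    static class StoreHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override
                public void connect() {
                    connected = true; // nothing to open for an in-memory store
                }

                @Override
                public InputStream getInputStream() throws IOException {
                    byte[] data = STORE.get(getURL().getPath());
                    if (data == null) {
                        throw new FileNotFoundException(getURL().toString());
                    }
                    return new ByteArrayInputStream(data);
                }
            };
        }
    }

    static class Factory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            if ("hdfs".equals(protocol)) {
                return new StoreHandler();
            }
            return null; // "file", "http", ...: leave to the JVM's own handlers
        }
    }

    // Reads a URL through the factory without installing it JVM-wide.
    static String read(String spec) {
        try {
            String protocol = spec.substring(0, spec.indexOf(':'));
            URLStreamHandler handler = new Factory().createURLStreamHandler(protocol);
            URL url = new URL(null, spec, handler); // a null handler means "use the default lookup"
            try (InputStream in = url.openStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        STORE.put("/file/to/path", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(read("hdfs://host:9000/file/to/path"));
    }
}
```

In a real application the factory would be installed with URL.setURLStreamHandlerFactory, which the JVM allows at most once; the sketch passes the handler to the URL constructor instead so that it can be run repeatedly without touching global state.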


Christophe Taton added a comment - 24/Mar/08 09:15 PM
The patch 3074-20080324b.patch is not DFS specific anymore, as suggested by Doug.

Doug Cutting added a comment - 24/Mar/08 11:46 PM
+1 This looks good to me.

Raghu Angadi added a comment - 25/Mar/08 12:27 AM
Looks good.

minor:

Do you need to use FSDataInputStream or FSDataOutputStream in the test or in the implementation? It looks like plain InputStream and OutputStream are enough.


Christophe Taton added a comment - 25/Mar/08 03:28 PM
I looked into file:// URLs: the previous patch does not work for them.
This happens because of the late loading of Configuration resource files (which itself relies on URLs).
The new patch forces the load of such resource files before setting the new handler factory, thus allowing it to handle file:// URLs too.

The code has also been cleaned up (FSDataInput/OutputStream replaced with Input/OutputStream).


Hadoop QA added a comment - 26/Mar/08 12:13 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378573/3074-20080325a.patch
against trunk revision 619744.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to cause Findbugs to fail.

core tests -1. The patch failed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2055/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2055/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2055/console

This message is automatically generated.


Christophe Taton added a comment - 06/Apr/08 06:46 PM
New patch to conform to findbugs

Hadoop QA added a comment - 11/Apr/08 10:42 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379511/3074-20080406a.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to cause Findbugs to fail.

core tests -1. The patch failed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2213/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2213/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2213/console

This message is automatically generated.


Christophe Taton added a comment - 12/Apr/08 06:28 AM
Fixing compilation failure due to the constructor new IOException(Throwable), which does not exist before Java 6.
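
One common way to stay compatible with pre-Java 6 runtimes is to build the IOException from a message and attach the cause with Throwable.initCause, which has existed since Java 1.4. The issue does not say how the patch itself resolved this, so the helper below (IoExceptionCompat.wrap, a hypothetical name) is only a sketch of that general workaround.

```java
import java.io.IOException;

class IoExceptionCompat {
    /** Wraps any Throwable in an IOException without using the Java 6 constructors. */
    static IOException wrap(Throwable cause) {
        // IOException(String) exists in every Java version.
        IOException ioe = new IOException(cause.toString());
        // initCause may be called once on an exception created without a cause.
        ioe.initCause(cause);
        return ioe;
    }

    public static void main(String[] args) {
        IOException e = wrap(new IllegalStateException("boom"));
        // Prints java.lang.IllegalStateException: boom
        System.out.println(e.getMessage());
    }
}
```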

Hadoop QA added a comment - 12/Apr/08 07:48 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379973/3074-20080412a.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to introduce 3 new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2216/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2216/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2216/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2216/console

This message is automatically generated.


Christophe Taton added a comment - 12/Apr/08 09:10 AM
Fixing all findbugs warnings.

Hadoop QA added a comment - 12/Apr/08 11:53 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12379976/3074-20080412b.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2218/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2218/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2218/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2218/console

This message is automatically generated.


Doug Cutting added a comment - 14/Apr/08 06:31 PM
I would prefer this were in the fs package, not util. In general, we should aim to put things in the most-specific package possible, otherwise, everything winds up in util.

Also, I would prefer the classes were named FsUrlStreamHandler and FsUrlStreamHandlerFactory. Acronyms are easier to read when capitalized, not all-caps, in Java names.


Christophe Taton added a comment - 14/Apr/08 06:46 PM
Here you are.

Doug Cutting added a comment - 14/Apr/08 06:54 PM
+1

Hadoop QA added a comment - 15/Apr/08 12:19 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380090/3074-20080414a.patch
against trunk revision 645773.

@author +1. The patch does not contain any @author tags.

tests included +1. The patch appears to include 3 new or modified tests.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2229/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2229/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2229/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2229/console

This message is automatically generated.


Christophe Taton added a comment - 15/Apr/08 06:26 AM
I just committed this. Thanks!

Hudson added a comment - 15/Apr/08 12:27 PM