Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version: 1.0.1
- Labels: None
Description
The HDFS URI is passed via the topology config:
conf.put(Configs.HDFS_URI, hdfsUri);
I see two problems with this approach:
1. If someone wants to use two HDFS URIs in the same or in different spouts, that does not seem feasible: there is only one topology-level config key, so it can hold only one value.
https://github.com/apache/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/examples/storm-starter/src/jvm/storm/starter/HdfsSpoutTopology.java#L117-L117
https://github.com/apache/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HdfsSpout.java#L331-L331
if ( !conf.containsKey(Configs.SOURCE_DIR) ) {
    LOG.error(Configs.SOURCE_DIR + " setting is required");
    throw new RuntimeException(Configs.SOURCE_DIR + " setting is required");
}
this.sourceDirPath = new Path( conf.get(Configs.SOURCE_DIR).toString() );
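The limitation in point 1 comes from the topology conf being a flat, shared map. A minimal stdlib-only sketch (the key name `hdfs.uri` stands in for `Configs.HDFS_URI`; no Storm classes are used) shows that a second spout wanting a different cluster simply overwrites the first value:

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfDemo {
    // Hypothetical stand-in for Configs.HDFS_URI; the real constant lives in storm-hdfs.
    static final String HDFS_URI = "hdfs.uri";

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();

        // Spout A wants cluster A:
        conf.put(HDFS_URI, "hdfs://clusterA:8020");
        // Spout B wants cluster B, but the key is shared, so it clobbers A's value:
        conf.put(HDFS_URI, "hdfs://clusterB:8020");

        // Both spouts now see the same (last-written) URI.
        System.out.println(conf.get(HDFS_URI));
    }
}
```

A per-spout setter on `HdfsSpout` itself (rather than a global conf key) would avoid this collision, since each spout instance would carry its own URI.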
2. It does not fail fast, i.e. at the time of topology submission. We could fail fast if the HDFS path is invalid or the credentials/permissions are not OK. Currently such errors can only be detected at runtime by inspecting the worker logs.
https://github.com/apache/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HdfsSpout.java#L297-L297
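As a rough illustration of the fail-fast idea, structural problems with the URI can be caught before submission using only `java.net.URI` (the method name `validateHdfsUri` is hypothetical, not part of storm-hdfs):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriCheck {
    /**
     * Hypothetical submission-time check: reject malformed or non-HDFS URIs
     * up front, instead of letting the spout fail later in the worker logs.
     */
    static void validateHdfsUri(String uri) {
        try {
            URI u = new URI(uri);
            if (!"hdfs".equals(u.getScheme()) || u.getHost() == null) {
                throw new IllegalArgumentException("Invalid HDFS URI: " + uri);
            }
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Malformed HDFS URI: " + uri, e);
        }
    }

    public static void main(String[] args) {
        validateHdfsUri("hdfs://namenode:8020");   // passes silently
        try {
            validateHdfsUri("file:///tmp/data");   // wrong scheme, rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A full check would also verify that the path exists and is readable, which requires contacting the NameNode (e.g. via Hadoop's `FileSystem` API), so it can still fail for environmental reasons; but syntax and scheme errors at least can be surfaced at submission time.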