Details
Type: Bug
Status: Resolved
Priority: P2
Resolution: Not A Bug
Affects Version/s: 2.0.0
Fix Version/s: None
Description
I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks like HadoopFileSystem is registering itself under the `file` scheme (https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79), so the following exception is thrown when trying to register HadoopFileSystem:
java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)
What is the correct way to handle `hdfs` URLs out of the box with TextIO and AvroIO? This is how I create the pipeline:
String[] args = new String[]{
    "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"
};
HadoopFileSystemOptions options = PipelineOptionsFactory
    .fromArgs(args)
    .withValidation()
    .as(HadoopFileSystemOptions.class);
Pipeline pipeline = Pipeline.create(options);
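
A possible explanation, offered as an assumption rather than as the confirmed resolution of this issue: the configuration above never sets `fs.defaultFS`, and Hadoop's default value for that property points at the local file system, so a HadoopFileSystem built from it would report the `file` scheme and collide with LocalFileSystem. The sketch below uses only the Java standard library and shows one way to build a `--hdfsConfiguration` argument that also carries `fs.defaultFS`. The class name `HdfsArgs`, the helper `hdfsConfigurationArg`, and the address `hdfs://namenode:8020` are hypothetical placeholders, not part of Beam.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Beam: builds the --hdfsConfiguration flag.
public class HdfsArgs {

    // Renders the properties as a JSON list holding a single map, the shape
    // used by the --hdfsConfiguration argument in the snippet above.
    // Keys and values are assumed to need no JSON escaping.
    public static String hdfsConfigurationArg(Map<String, String> props) {
        StringJoiner entries = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> e : props.entrySet()) {
            entries.add("\"" + e.getKey() + "\": \"" + e.getValue() + "\"");
        }
        return "--hdfsConfiguration=[" + entries + "]";
    }

    public static void main(String[] unused) {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: the cluster's NameNode address; replace with a real one.
        props.put("fs.defaultFS", "hdfs://namenode:8020");
        props.put("dfs.client.use.datanode.hostname", "true");

        // This array would be passed to PipelineOptionsFactory.fromArgs(...).
        String[] args = { hdfsConfigurationArg(props) };
        System.out.println(args[0]);
    }
}
```

If the assumption holds, passing the resulting array to `PipelineOptionsFactory.fromArgs` as in the original snippet would make the Hadoop file system register under `hdfs` instead of `file`.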
Issue Links
- Is contained by: BEAM-2457 Error: "Unable to find registrar for hdfs" - need to prevent/improve error message (Open)