Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 1.12
- None
- None
- Patch Available
Description
If a path (input or output) does not belong to the configured default FileSystem, various Nutch tools may raise an exception like:
Exception in ... java.lang.IllegalArgumentException: Wrong FS: s3a://..., expected: hdfs://...
This is fixed by getting a reference to the FileSystem from the Path object itself:
FileSystem fs = path.getFileSystem(getConf());
instead of
FileSystem fs = FileSystem.get(getConf());
A given path (e.g., s3a://...) may not belong to the default file system (hdfs://, or file:// in local mode), in which case even simple checks such as fs.exists(path) will fail. Cf. FileSystem.checkPath(path), and compare FileSystem.get(conf) with FileSystem.get(URI,conf), which is what Path.getFileSystem(conf) calls.
Note that the FileSystem for input and output may be different, e.g., read from HDFS and write to S3.
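The mechanism behind the "Wrong FS" error can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: ToyFileSystem is a hypothetical class that mimics the scheme/authority check and is not Hadoop's FileSystem; the host namenode:8020 and the bucket name are made up for illustration; the real FileSystem.checkPath handles more cases (default ports, for instance).

```java
import java.net.URI;
import java.util.Objects;

/** Hypothetical, simplified stand-in for a Hadoop FileSystem (not the real class). */
final class ToyFileSystem {
    private final String scheme;
    private final String authority; // may be null, e.g. for file:///

    private ToyFileSystem(URI uri) {
        this.scheme = uri.getScheme();
        this.authority = uri.getAuthority();
    }

    /** Models FileSystem.get(conf): always bound to the configured default. */
    static ToyFileSystem getDefault(URI defaultFs) {
        return new ToyFileSystem(defaultFs);
    }

    /** Models path.getFileSystem(conf): bound to the path's own scheme and authority. */
    static ToyFileSystem forPath(URI path, URI defaultFs) {
        // A path without a scheme falls back to the default file system.
        return new ToyFileSystem(path.getScheme() == null ? defaultFs : path);
    }

    /** Rough model of checkPath: reject paths owned by another file system. */
    void checkPath(URI path) {
        if (path.getScheme() == null) {
            return; // scheme-less paths are resolved against this file system
        }
        boolean sameFs = path.getScheme().equalsIgnoreCase(scheme)
                && Objects.equals(path.getAuthority(), authority);
        if (!sameFs) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: "
                    + scheme + "://" + (authority == null ? "" : authority));
        }
    }

    /** Stands in for any operation that validates the path first, such as exists(). */
    boolean accepts(URI path) {
        checkPath(path);
        return true;
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:8020");     // assumed default FS
        URI output = URI.create("s3a://bucket/crawl/segments"); // assumed output path

        // File system derived from the path: the check passes.
        System.out.println(forPath(output, defaultFs).accepts(output));

        // Default file system used for a foreign path: the check fails.
        try {
            getDefault(defaultFs).accepts(output);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, input and output paths each get their own file system via forPath, which corresponds to calling path.getFileSystem(getConf()) once per path rather than sharing a single FileSystem.get(getConf()) instance.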
Issue Links
- relates to NUTCH-2494 Fetcher: java.lang.IllegalArgumentException: Wrong FS: s3 (Closed)