Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.8.0
- None
- Reviewed
Description
Got this exception when running distcp -diff with relative paths:
$ hadoop distcp -update -diff s1 s2 d1 d2
16/03/25 09:45:40 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[d1], targetPath=d2, targetPathExists=true, preserveRawXattrs=false, filtersFile='null'}
16/03/25 09:45:40 INFO client.RMProxy: Connecting to ResourceManager at jzhuge-balancer-1.vpc.cloudera.com/172.26.21.70:8032
16/03/25 09:45:41 ERROR tools.DistCp: Exception encountered
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
    at org.apache.hadoop.fs.Path.initialize(Path.java:206)
    at org.apache.hadoop.fs.Path.<init>(Path.java:197)
    at org.apache.hadoop.tools.SimpleCopyListing.getPathWithSchemeAndAuthority(SimpleCopyListing.java:193)
    at org.apache.hadoop.tools.SimpleCopyListing.addToFileListing(SimpleCopyListing.java:202)
    at org.apache.hadoop.tools.SimpleCopyListing.doBuildListingWithSnapshotDiff(SimpleCopyListing.java:243)
    at org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:172)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.DistCp.createInputFileListingWithDiff(DistCp.java:388)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:164)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:123)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:436)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: hdfs://jzhuge-balancer-1.vpc.cloudera.com:8020./d1/.snapshot/s2
    at java.net.URI.checkPath(URI.java:1804)
    at java.net.URI.<init>(URI.java:752)
    at org.apache.hadoop.fs.Path.initialize(Path.java:203)
    ... 11 more
But these commands worked:
- Absolute path: hadoop distcp -update -diff s1 s2 /user/systest/d1 /user/systest/d2
- No -diff: hadoop distcp -update d1 d2
However, everything was fine when I ran hadoop distcp -update -diff s1 s2 d1 d2 again. I am not sure whether the problem exists only with the -diff option. Trying to reproduce.
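
For reference, the "Relative path in absolute URI" message in the trace comes from java.net.URI itself, and it can be reproduced without Hadoop. The sketch below is only an illustration of that JDK check, not the DistCp code path or the eventual fix. The class name RelativePathDemo, the helper tryBuild, and the host namenode.example.com:8020 are hypothetical. It assumes the failing call combined a scheme and authority with a path that did not start with "/", which is what the URI printed in the exception suggests.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    // Hypothetical helper: tries to build an hdfs URI from a fixed
    // scheme/authority and the given path. Returns null on success,
    // or the URISyntaxException reason on failure.
    static String tryBuild(String path) {
        try {
            // The multi-argument URI constructors reject a non-empty path
            // that does not start with '/' when a scheme is supplied.
            new URI("hdfs", "namenode.example.com:8020", path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // Relative snapshot path, similar in shape to the one in the report.
        System.out.println(tryBuild("./d1/.snapshot/s2"));
        // Absolute snapshot path, similar to the working absolute-path command.
        System.out.println(tryBuild("/user/systest/d1/.snapshot/s2"));
    }
}
```

If this is indeed the mechanism, it would be consistent with the absolute-path command succeeding, since the path component then begins with "/". It does not explain why a later run with relative paths worked.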