Description
HDFS-13176 added support for semicolons in source/target URLs for the distcp use case, and HDFS-13582 followed up with a backward-compatibility fix. An issue remains when distcp is triggered from a 3.x cluster to pull webhdfs data from a 2.x Hadoop cluster. The existing fix sets pathAlreadyEncoded to true whenever decoding succeeds, even when the path contained nothing to decode. We might need to adjust it, as shown below, so that it checks whether the URL was actually encoded. With that change the issue no longer occurs.
diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
index 5936603c34a..dc790286aff 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
@@ -609,7 +609,10 @@ URL toUrl(final HttpOpParam.Op op, final Path fspath,
     boolean pathAlreadyEncoded = false;
     try {
       fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
-      pathAlreadyEncoded = true;
+      if(!fspathUri.getPath().equals(fspathUriDecoded))
+      {
+        pathAlreadyEncoded = true;
+      }
     } catch (IllegalArgumentException ex) {
       LOG.trace("Cannot decode URL encoded file", ex);
     }
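The check introduced by the patch can be tried in isolation. The following is only a minimal sketch, not the actual WebHdfsFileSystem code: the class PathEncodingCheck and the method isAlreadyEncoded are hypothetical names made up for illustration, and only java.net.URLDecoder comes from the patch itself. It treats a path as already encoded only when decoding actually changes it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical standalone illustration of the patched logic; not Hadoop code.
class PathEncodingCheck {

  // Returns true only if URL-decoding changes the path, i.e. the path
  // really contained escapes. Mirrors the equals() comparison in the patch.
  static boolean isAlreadyEncoded(String rawPath) {
    try {
      String decoded = URLDecoder.decode(rawPath, "UTF-8");
      return !rawPath.equals(decoded);
    } catch (IllegalArgumentException | UnsupportedEncodingException ex) {
      // Malformed escapes (for example a trailing '%') cannot be decoded,
      // so the path is treated as not encoded, as in the patched catch block.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isAlreadyEncoded("/user/data/part-0")); // false
    System.out.println(isAlreadyEncoded("/user/data/a%3Bb"));  // true
    System.out.println(isAlreadyEncoded("/user/data/100%"));   // false
  }
}
```

Note that URLDecoder also turns '+' into a space, so under this comparison a path containing a literal plus sign would still be reported as already encoded.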
Attachments
Issue Links
- is duplicated by
  - HDFS-14379 WebHdfsFileSystem.toUrl double encodes characters (Resolved)
- is related to
  - HDFS-14341 Weird handling of plus sign in paths in WebHDFS REST API (Open)
  - HDFS-14379 WebHdfsFileSystem.toUrl double encodes characters (Resolved)
- relates to
  - HDFS-14466 Add a regression test for HDFS-14323 (Resolved)