Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 0.11.0
- None
- None
Description
If a file with spaces in its name (e.g. foo bar.txt) is requested from HDFS through WebHDFS and Knox, Knox rewrites the %20 encoding in the URL sent by the client to + encoding (e.g. foo%20bar.txt -> foo+bar.txt). WebHDFS then returns an HTTP 404, which Knox passes back to the client. Requesting the same file directly from WebHDFS works.
Example
Client request
curl "https://<hostname>:18443/gateway/<cluster>/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" \
  -u <username>:<password> -k -s
Knox response body
{"exception":"FileNotFoundException", "javaClassName":"java.io.FileNotFoundException", "message":"File /docs/filename+with+spaces.pdf not found."}
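The difference between the two encodings can be illustrated outside Knox. The sketch below uses Python's standard urllib.parse only to show the encoding behaviour; it is not Knox's code. It assumes the receiving side decodes the path with plain percent-decoding, under which + stays a literal plus sign, which is consistent with the "filename+with+spaces.pdf not found" message above.

```python
from urllib.parse import quote, quote_plus, unquote

# File name taken from the example request in this report.
name = "filename with spaces.pdf"

# Path-style percent-encoding: a space becomes %20.
path_encoded = quote(name)

# Form-style encoding (application/x-www-form-urlencoded): a space becomes +.
form_encoded = quote_plus(name)

# Plain percent-decoding, as applied to a URL path, leaves + untouched,
# so only the %20 form round-trips to the original file name.
decoded_from_path = unquote(path_encoded)
decoded_from_form = unquote(form_encoded)

print(path_encoded)
print(form_encoded)
print(decoded_from_path == name)
print(decoded_from_form == name)
```

With the + form, the decoded name differs from the stored file name, so a lookup for it fails even though the file exists.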
Knox logs
==> /var/log/hadoop/knox/gateway-audit.log <==
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS||||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|unavailable|Request method: GET
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Groups: []
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authorization|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|unavailable|Request method: GET
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|success|Response status: 404
17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Response status: 404

==> /var/log/hadoop/knox/gateway.log <==
2017-05-24 15:51:05,254 INFO hadoop.gateway (KnoxLdapRealm.java:getUserDn(691)) - Computed userDn: uid=<username>,cn=users,cn=accounts,dc=<cluster> using dnTemplate for principal: <username>
2017-05-24 15:51:05,259 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
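The rewrite is visible in the audit log by comparing the file name in the access entry with the one in the dispatch entry. A minimal sketch of that comparison follows. The field positions (action at index 8, URI at index 10 after splitting on "|") are inferred from the sample lines in this report and are an assumption, not a documented audit format; parse_audit_line and file_name are hypothetical helper names.

```python
def parse_audit_line(line):
    """Split one pipe-delimited audit line into the fields used here.

    Field positions are inferred from the sample lines in this report.
    """
    fields = line.split("|")
    return {"action": fields[8], "uri": fields[10]}


def file_name(uri):
    """Return the last path segment of a URI, without the query string."""
    return uri.split("?", 1)[0].rsplit("/", 1)[-1]


# Two lines copied from the audit log in this report.
access_line = (
    "17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS||||"
    "access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN"
    "|unavailable|Request method: GET"
)
dispatch_line = (
    "17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||"
    "dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/"
    "filename+with+spaces.pdf?op=OPEN&doAs=<username>|success|Response status: 404"
)

access = parse_audit_line(access_line)
dispatch = parse_audit_line(dispatch_line)

requested_name = file_name(access["uri"])
dispatched_name = file_name(dispatch["uri"])

# The two names differ, which is the symptom described in this report.
print(requested_name)
print(dispatched_name)
print(requested_name == dispatched_name)
```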
Direct WebHDFS request for the same file
# curl -si -u: "http://<namenode>:50070/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" --negotiate -L | head -n40
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Wed, 24 May 2017 19:01:41 GMT
Pragma: no-cache
Date: Wed, 24 May 2017 19:01:41 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; HttpOnly
Content-Type: text/html; charset=iso-8859-1
Content-Length: 1533
Server: Jetty(6.1.26.hwx)

HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Wed, 24 May 2017 19:01:42 GMT
Date: Wed, 24 May 2017 19:01:42 GMT
Pragma: no-cache
Expires: Wed, 24 May 2017 19:01:42 GMT
Date: Wed, 24 May 2017 19:01:42 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
WWW-Authenticate: Negotiate YGkGCSqGSIb3EgECAgIAb1owWKADAgEFoQMCAQ+iTDBKoAMCARKiQwRBQM/auuLcl2xey6wMp6EjCPJFSqK3snscxMzW7RvfgxOo7182GzD5N9jf+OWGr+tjpvlRX0c/7iTBfYKSetf4ekU=
Set-Cookie: hadoop.auth="u=admin&p=admin@CYSAFA&t=kerberos&e=1495688502002&s=b7p35TgaxItAUTkKJuSXuynoq9E="; Path=/; HttpOnly
Content-Type: application/octet-stream
Location: http://<datanode3>:1022/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN&delegation=HgAFYWRtaW4FYWRtaW4AigFcO9YJ8ooBXF_ijfJFAxSBYFUnsXY3up11ZNIi4hIi__5RvRJXRUJIREZTIGRlbGVnYXRpb24PMTcyLjE4LjAuOTo4MDIw&namenoderpcaddress=<namenode>:8020&offset=0
Content-Length: 0
Server: Jetty(6.1.26.hwx)

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 13365618

%����1.6
<</Filter/FlateDecode/First 157/Length 5350/N 16/Type/ObjStm>>stream
...
Issue Links
- is related to: KNOX-1005 Special characters in HBase rows while called through Knox (Closed)