Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
WebAppProxyServlet has a bug in how it encodes and decodes URLs. It was found while running Spark on YARN.
When a user accesses "http://example.com:8088/proxy/application_1415344371838_0006/executors/threadDump/?executorId=%3Cdriver%3E", WebAppProxyServlet requests "http://example.com:36429/executors/threadDump/?executorId=%25253Cdriver%25253E". But the Spark web server expects "http://example.com:36429/executors/threadDump/?executorId=%3Cdriver%3E".
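To make the symptom concrete, here is a minimal sketch that counts how many layers of percent-encoding a value carries by decoding it until it stops changing. The class and method names (`EncodingLayers`, `decodeOnce`, `layers`) are made up for illustration and are not part of Hadoop or Spark. It uses only the JDK's `URLDecoder`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration only; not part of WebAppProxyServlet.
class EncodingLayers {

    /** Percent-decodes the value exactly once. */
    static String decodeOnce(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Counts the decode passes needed before the value stops changing. */
    static int layers(String value) {
        int count = 0;
        String current = value;
        String next = decodeOnce(current);
        while (!next.equals(current)) {
            count++;
            current = next;
            next = decodeOnce(current);
        }
        return count;
    }

    public static void main(String[] args) {
        String expected = "%3Cdriver%3E";          // what the Spark web server expects
        String forwarded = "%25253Cdriver%25253E"; // what the proxy actually sends

        System.out.println(layers(expected));   // prints 1
        System.out.println(layers(forwarded));  // prints 3
        // Stripping the two extra layers recovers the expected value.
        System.out.println(decodeOnce(decodeOnce(forwarded)).equals(expected)); // prints true
    }
}
```

The forwarded value carries three layers where one is expected, so the proxy path re-encodes an already-encoded query twice. Each extra pass turns `%` into `%25`.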
Here are the problems I found in WebAppProxyServlet.
1. java.net.URI.toString returns an encoded URL string. So the following code in WebAppProxyServlet should pass `true` instead of `false`, to indicate that the string is already escaped.
org.apache.commons.httpclient.URI uri = new org.apache.commons.httpclient.URI(link.toString(), false);
2. HttpServletRequest.getPathInfo() returns a decoded string. Therefore, if the link is http://example.com:8088/proxy/application_1415344371838_0006/John%2FHunter, pathInfo will be "/application_1415344371838_0006/John/Hunter". The URI created in WebAppProxyServlet will then be something like ".../John/Hunter", but the correct link is ".../John%2FHunter". We can use HttpServletRequest.getRequestURI() to get the raw path.
final String pathInfo = req.getPathInfo();
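The loss of information in a decoded path can be shown without a servlet container. The sketch below is only an analogy: it uses `java.net.URI.getPath()` (decoded) and `getRawPath()` (as written) in place of `getPathInfo()` and `getRequestURI()`. The class name `RawPathDemo` is hypothetical. Unlike `getPathInfo()`, the values here still include the `/proxy` prefix.

```java
import java.net.URI;

// Analogy only: java.net.URI stands in for the servlet request object.
class RawPathDemo {

    /** Returns the decoded path; an encoded slash becomes a real separator. */
    static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    /** Returns the path exactly as it appears in the URL, escapes intact. */
    static String rawPath(String url) {
        return URI.create(url).getRawPath();
    }

    public static void main(String[] args) {
        String link = "http://example.com:8088/proxy/application_1415344371838_0006/John%2FHunter";

        // prints /proxy/application_1415344371838_0006/John/Hunter
        System.out.println(decodedPath(link));
        // prints /proxy/application_1415344371838_0006/John%2FHunter
        System.out.println(rawPath(link));
    }
}
```

Once `%2F` has been decoded to `/`, the single segment `John%2FHunter` cannot be told apart from the two segments `John` and `Hunter`. This is why the raw form has to be carried through to the outgoing request.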
3. The wrong URI constructor is used. URI(String scheme, String authority, String path, String query, String fragment) encodes the path and query, which have already been encoded. URI(String str) should be used directly, since the URL is already encoded.
URI toFetch = new URI(trackingUri.getScheme(), trackingUri.getAuthority(), StringHelper.ujoin(trackingUri.getPath(), rest), req.getQueryString(), null);
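The difference between the two `java.net.URI` constructors can be reproduced with the JDK alone. The following is a small sketch, not a patch for WebAppProxyServlet. The class and method names (`UriConstructorDemo`, `viaComponents`, `viaSingleString`) are invented for illustration, and the host and query are the ones from the description above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative comparison only; not the actual servlet code.
class UriConstructorDemo {

    /** Builds the URI with the multi-argument constructor, which always quotes '%'. */
    static String viaComponents(String scheme, String authority, String path, String query) {
        try {
            return new URI(scheme, authority, path, query, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Builds the URI from one already-encoded string, which is parsed as-is. */
    static String viaSingleString(String scheme, String authority, String rawPath, String rawQuery) {
        return URI.create(scheme + "://" + authority + rawPath + "?" + rawQuery).toString();
    }

    public static void main(String[] args) {
        String query = "executorId=%3Cdriver%3E"; // already encoded, like req.getQueryString()

        // prints http://example.com:36429/executors/threadDump/?executorId=%253Cdriver%253E
        System.out.println(viaComponents("http", "example.com:36429", "/executors/threadDump/", query));
        // prints http://example.com:36429/executors/threadDump/?executorId=%3Cdriver%3E
        System.out.println(viaSingleString("http", "example.com:36429", "/executors/threadDump/", query));
    }
}
```

The multi-argument constructor treats its inputs as unencoded text and quotes every `%`, so it is only safe for decoded components. For components that are already encoded, concatenating them and parsing the result with the single-string form keeps the escapes unchanged.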
Attachments
Issue Links
- relates to SPARK-4313 "Thread Dump" link is broken in yarn-cluster mode (Closed)