Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 1.6.1
- Fix Version/s: None
- Component/s: None
Description
When using the Netty RPC implementation, which is the default in Spark 1.6.x, the executor addresses shown in the Spark application UI (the one on port 4040) are the IP addresses of the machines, even when I start the slaves with the -H option to bind each slave to the machine's hostname.
This is a big deal when using Spark with HDFS, because the executor addresses need to match the hostnames of the DataNodes to achieve data locality.
When setting spark.rpc=akka, everything works as expected: the executor addresses in the Spark UI match the hostnames to which the slaves are bound.
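For reference, the setup and workaround described above can be sketched as follows; the master URL and paths are illustrative, and the -H binding flag is used as reported:

```shell
# Start a worker bound to the machine's hostname rather than its IP
# (spark://master-host:7077 is a placeholder for the actual master URL)
./sbin/start-slave.sh -H "$(hostname)" spark://master-host:7077

# Workaround: fall back to the Akka RPC backend so executor addresses
# in the UI show hostnames instead of IPs (Spark 1.6.x only)
echo "spark.rpc akka" >> conf/spark-defaults.conf
```

Note that spark.rpc was removed in Spark 2.x along with the Akka backend, so this workaround only applies to the 1.6 line.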