The bug is fairly complicated. [~Konstantin Boudnik] spotted most of the problems:
- STANDALONE_SPARK_MASTER_HOST is never set in spark-env.sh
- SPARK_MASTER_IP is set from the port instead of the host: export SPARK_MASTER_IP=<%= @master_port %>
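Taken together, the corrected spark-env.sh template would look roughly like this. This is only a sketch, not the actual patch; it assumes the template exposes @master_host alongside @master_port:

```
# spark-env.sh.erb -- hedged sketch; assumes the Puppet class passes
# @master_host and @master_port into the template
export STANDALONE_SPARK_MASTER_HOST=<%= @master_host %>
export SPARK_MASTER_IP=$STANDALONE_SPARK_MASTER_HOST
export SPARK_MASTER_PORT=<%= @master_port %>
```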
Another bug I found is that the code in Spark's init.pp looks for $common::master_host, but the key in cluster.yaml is $common::spark_master_host instead.
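For illustration only (the value below is hypothetical, and this is not the actual patch), the mismatch means hiera supplies one key while the manifests read another, so one side has to be renamed to match:

```
# cluster.yaml supplies:
#   spark::common::spark_master_host: "spark-master.example.com"  # hypothetical value
# but the manifests read $common::master_host, so either rename the
# hiera key to spark::common::master_host, or make the class parameter
# match the key, e.g.:
class spark::common (
  $spark_master_host = $fqdn,   # now matches the cluster.yaml key
  $master_port       = "7077",
) { }
```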
The following is fine, since master_host falls back to $fqdn only when $spark::common::master_host is missing, and $fqdn is a good enough default:
class common ($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080")
I've uploaded a patch to fix the Spark worker deployment. It works well on a two-node Docker cluster.