When using sbin/start-master.sh to start the Spark master daemon, the daemon sometimes starts successfully but the shell script still prints an error message such as:
failed to launch org.apache.spark.deploy.master.Master...
This is confusing, because the daemon is in fact running.
This happens because the sbin/spark-daemon.sh script uses the bin/spark-class shell script to start the daemon, then sleeps for 2 seconds and checks whether the daemon process exists, using a test like the following:
if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]
The problem is that a slow machine may take longer than 2 seconds to start the daemon, even though the start eventually succeeds. In that case the test ! $(ps -p "$newpid" -o comm=) =~ "java" misfires and the script reports a failure, because at the time of the check the $newpid process is still a shell process. It only becomes a java process once the daemon has actually started.
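
One possible way to make the check tolerant of slow starts is to poll for a while instead of sleeping a fixed 2 seconds. The following is only a minimal sketch, not the actual Spark script: the function name wait_for_comm, its arguments, and the 10-second default timeout are all hypothetical, and it assumes bash and a ps that supports -p and -o comm=.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): wait until the process with the
# given PID reports a command name matching PATTERN, or give up after
# TIMEOUT seconds (default 10, an arbitrary value chosen for illustration).
wait_for_comm() {
  local pid="$1" pattern="$2" timeout="${3:-10}"
  local waited=0 comm
  while (( waited < timeout )); do
    # ps exits non-zero when the PID no longer exists, so stop early.
    comm=$(ps -p "$pid" -o comm= 2>/dev/null) || return 1
    if [[ "$comm" =~ $pattern ]]; then
      return 0
    fi
    sleep 1
    waited=$(( waited + 1 ))
  done
  return 1
}
```

Under these assumptions, the fixed sleep and single check could be replaced by a call such as wait_for_comm "$newpid" java 10, printing the "failed to launch" message only when the function returns non-zero. A process that is still a shell after 2 seconds then gets more time to turn into a java process, while a process that has already exited is reported as failed immediately.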