Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.5.3
Description
Executing user input through bash -c is risky. Does Spark need to guard against this? For example, when a Spark application runs in YARN mode, its input is passed directly to bash.
YarnSparkHadoopUtil.scala
  /**
   * Escapes a string for inclusion in a command line executed by Yarn. Yarn executes commands
   * using either
   *
   * (Unix-based) `bash -c "command arg1 arg2"` and that means plain quoting doesn't really work.
   * The argument is enclosed in single quotes and some key characters are escaped.
   *
   * (Windows-based) part of a .cmd file in which case windows escaping for each argument must be
   * applied. Windows is quite lenient, however it is usually Java that causes trouble, needing to
   * distinguish between arguments starting with '-' and class names. If arguments are surrounded
   * by ' java takes the following string as is, hence an argument is mistakenly taken as a class
   * name which happens to start with a '-'. The way to avoid this, is to surround nothing with
   * a ', but instead with a ".
   *
   * @param arg A single argument.
   * @return Argument quoted for execution via Yarn's generated shell script.
   */
  def escapeForShell(arg: String): String = {
    if (arg != null) {
      if (Utils.isWindows) {
        YarnCommandBuilderUtils.quoteForBatchScript(arg)
      } else {
        val escaped = new StringBuilder("'")
        arg.foreach {
          case '$' => escaped.append("\\$")
          case '"' => escaped.append("\\\"")
          case '\'' => escaped.append("'\\''")
          case c => escaped.append(c)
        }
        escaped.append("'").toString()
      }
    } else {
      arg
    }
  }
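To make the question concrete, here is a minimal sketch of what the Unix branch of escapeForShell produces for a few inputs. The object name EscapeForShellDemo, the helper name escapeUnix, and the sample strings are hypothetical; the matching logic is copied from the method above, with the null check and the Windows branch left out so that it runs without Spark on the classpath.

```scala
object EscapeForShellDemo {
  // Standalone copy of the Unix branch of escapeForShell (hypothetical helper name).
  // The argument is wrapped in single quotes; only $, " and ' get special handling.
  def escapeUnix(arg: String): String = {
    val escaped = new StringBuilder("'")
    arg.foreach {
      case '$' => escaped.append("\\$")
      case '"' => escaped.append("\\\"")
      case '\'' => escaped.append("'\\''")
      case c => escaped.append(c) // every other character is copied unchanged
    }
    escaped.append("'").toString()
  }

  def main(args: Array[String]): Unit = {
    // Illustrative inputs only; they are not taken from the issue.
    val samples = Seq("plain", "$(id)", "say \"hi\"", "it's", "`id`")
    samples.foreach(s => println(s + " -> " + escapeUnix(s)))
  }
}
```

With these inputs, `$(id)` becomes `'\$(id)'` and `it's` becomes `'it'\''s'`, while the backtick string comes back as `` '`id`' `` because only the three matched characters are rewritten. Whether the unescaped characters matter depends on how YARN embeds the result in its generated launch script, which this sketch does not model.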