Spark / SPARK-15165

Codegen can break because toCommentSafeString is not actually safe

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.1, 2.0.0
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

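      The toCommentSafeString method replaces "\u" with "\\u" to keep codegen from breaking.
      But if an even number of "\" characters precedes "u" in a string literal in the query, as in "\\u", codegen can still break. The replacement then leaves an odd number of backslashes before "u", so the Java compiler still reads the sequence as a Unicode escape.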

      The following code causes a compilation error.

      val df = Seq(...).toDF
      df.select("'\\\\\\\\u002A/'").show
      

      The compilation error occurs because "\\\\\\\\u002A/" ends up as "*/" (the end of a comment) in the generated code.
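
The backslash-parity problem can be reproduced without Spark. The sketch below is an illustration only: `naiveCommentSafe` is a hypothetical stand-in for the escaping step described in this issue, not Spark's actual source, and `hasLiveUnicodeEscape` is an assumed helper that applies the Java rule that a "u" preceded by an odd number of consecutive backslashes starts a Unicode escape.

```scala
object BackslashParityDemo {
  // Hypothetical stand-in for the escaping step described in this issue
  // (not Spark's actual source): put one more backslash in front of every
  // backslash-u pair.
  def naiveCommentSafe(s: String): String = s.replace("\\u", "\\\\u")

  // True if some 'u' is preceded by an odd-length run of backslashes, which
  // is the condition under which javac starts a Unicode escape.
  // Simplification: the four hex digits after 'u' are not checked.
  def hasLiveUnicodeEscape(s: String): Boolean =
    s.indices.exists { i =>
      s(i) == 'u' && s.take(i).reverse.takeWhile(_ == '\\').length % 2 == 1
    }

  def main(args: Array[String]): Unit = {
    val literal = "\\\\u002A/"              // two backslashes, then u002A/
    val escaped = naiveCommentSafe(literal) // three backslashes, then u002A/
    println(s"before: live escape = ${hasLiveUnicodeEscape(literal)}")
    println(s"after:  live escape = ${hasLiveUnicodeEscape(escaped)}")
  }
}
```

With a single backslash before "u" the same replacement yields an even run, which is the case the escaping was written for. Only inputs that already carry an even run are turned into a live escape.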

      Because of this, arbitrary code can be injected, as in the following example.

      val df = Seq(...).toDF
      // Inject "System.exit(1)"
      df.select("'\\\\\\\\u002A/{System.exit(1);}/*'").show
      
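
To see why the injected block ends up outside the comment, it can help to model what the Java compiler sees after it translates Unicode escapes. The sketch below is a simplified model under stated assumptions: `translateUnicodeEscapes` is a hypothetical helper, not a Spark or JDK API, and the `/* ... */` wrapper is an assumed shape for the generated comment.

```scala
object UnicodeEscapeViewDemo {
  private def isHex(c: Char): Boolean = Character.digit(c, 16) >= 0

  // Simplified model of javac's Unicode-escape translation: a backslash that
  // follows an even-length run of backslashes, then one or more 'u', then
  // four hex digits, is replaced by the character those digits encode.
  def translateUnicodeEscapes(src: String): String = {
    val out = new StringBuilder
    var i = 0
    var run = 0 // length of the backslash run directly before position i
    while (i < src.length) {
      val c = src(i)
      if (c == '\\') {
        var j = i + 1
        while (j < src.length && src(j) == 'u') j += 1
        val isEscape = run % 2 == 0 && j > i + 1 &&
          j + 4 <= src.length && src.substring(j, j + 4).forall(isHex)
        if (isEscape) {
          out.append(Integer.parseInt(src.substring(j, j + 4), 16).toChar)
          i = j + 4
          run = 0
        } else {
          out.append(c); run += 1; i += 1
        }
      } else {
        out.append(c); run = 0; i += 1
      }
    }
    out.toString
  }

  def main(args: Array[String]): Unit = {
    // Comment text after the escaping step: three backslashes before u002A.
    val commentBody = "\\\\\\u002A/{System.exit(1);}/*"
    // Assumed wrapper for how the text is embedded in generated code.
    println(translateUnicodeEscapes("/* " + commentBody + " */"))
  }
}
```

In the translated text the first comment closes right after the two remaining backslashes, so the `{System.exit(1);}` block sits between two comments and is compiled as ordinary code.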

              People

              • Assignee:
                sarutak Kousuke Saruta
                Reporter:
                sarutak Kousuke Saruta