Spark / SPARK-14443

parse_url() does not escape query parameters


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: SQL
    • Environment: Databricks

    Description

      To reproduce, run the following Spark SQL statement:

      select parse_url('http://1168.xg4ken.com/media/redir.php?prof=457&camp=67116&affcode=kw54&k_inner_url_encoded=1&cid=adwords&kdv=Desktop&url[]=http%3A%2F%2Fwww.landroverusa.com%2Fvehicles%2Frange-rover-sport-off-road-suv%2Findex.html%3Futm_content%3Dcontent%26utm_source%fb%26utm_medium%3Dcpc%26utm_term%3DAdwords_Brand_Range_Rover_Sport%26utm_campaign%3DFB_Land_Rover_Brand', 'QUERY', 'url[]')
      

      The exception is ultimately caused by:

      java.util.regex.PatternSyntaxException: Unclosed character class near index 17
      (&|^)url[]=([^&]*)
                       ^
      

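      It looks like the code builds a regex internally without escaping the passed-in query parameter name. Because [ and ] are regex metacharacters, the [ in url[] opens a character class that is never closed.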
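
      A minimal Java sketch of the suspected cause and one possible remedy follows. It assumes the pattern is assembled by plain string concatenation, as the exception message suggests. The class and method names (QueryParamRegex, unescaped, escaped, extract) are hypothetical and are not taken from Spark's source. Pattern.quote is the standard JDK way to treat a string as a literal inside a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical illustration only; not Spark's actual implementation.
public class QueryParamRegex {

    // Mirrors the pattern shown in the exception message: the key is
    // concatenated into the regex without any escaping.
    public static Pattern unescaped(String key) {
        return Pattern.compile("(&|^)" + key + "=([^&]*)");
    }

    // Possible remedy (an assumption, not the project's confirmed fix):
    // quote the key so regex metacharacters in it are matched literally.
    public static Pattern escaped(String key) {
        return Pattern.compile("(&|^)" + Pattern.quote(key) + "=([^&]*)");
    }

    // Returns the raw value for the key in a query string, or null if absent.
    public static String extract(String query, String key) {
        Matcher m = escaped(key).matcher(query);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Shortened query string for illustration.
        String query = "prof=457&camp=67116&url[]=http%3A%2F%2Fwww.landroverusa.com";

        try {
            unescaped("url[]");
        } catch (PatternSyntaxException e) {
            // Same failure mode as in the report.
            System.out.println("unescaped key failed: " + e.getDescription());
        }

        System.out.println("escaped key value: " + extract(query, "url[]"));
    }
}
```

      With the key quoted, a parameter name such as url[] is matched literally, so the lookup returns the parameter's value instead of failing at pattern compilation.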

          People

            Assignee: Unassigned
            Reporter: Simeon Simeonov (simeons)
            Votes: 0
            Watchers: 3
