  1. Commons Text
  2. TEXT-46

StringEscapeUtils.unescapeHtml: handle HTML escapes without semicolon

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • None
    • 1.x
    • None

    Description

      org.apache.commons.lang.StringEscapeUtils.unescapeHtml is useful for detecting and correcting Cross-Site Scripting (XSS) attempts: it converts escaped characters such as &#60; or &lt; into plain characters such as <, so that patterns like HTML tags can be detected. Many browsers also accept variants without the trailing semicolon, in particular the long zero-padded numeric form such as &#0000060. Please see: http://ha.ckers.org/xss.html

      Since this may not be standard HTML, a boolean bLenient parameter on the method could enable the lenient behaviour while preserving backward compatibility.
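
      To make the request concrete, here is a minimal sketch of what a lenient mode might look like. It is not the Commons Lang or Commons Text implementation: the class LenientHtmlUnescaper and the method unescapeDecimal are hypothetical names, and the sketch covers only decimal numeric references (not named entities such as &lt or hexadecimal forms). The lenient flag plays the role of the proposed bLenient parameter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Commons Lang/Text.
public class LenientHtmlUnescaper {

    // "&#", one or more decimal digits, then an optional ";".
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d+)(;?)");

    /**
     * Unescapes decimal numeric character references.
     * With lenient == false only references ending in ';' are converted
     * (the backward-compatible behaviour); with lenient == true the
     * semicolon is optional.
     */
    public static String unescapeDecimal(String input, boolean lenient) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            boolean hasSemicolon = !m.group(2).isEmpty();
            String replacement = m.group(); // default: leave the text as-is
            if (hasSemicolon || lenient) {
                try {
                    // Leading zeros are accepted: "0000060" parses as 60.
                    int codePoint = Integer.parseInt(m.group(1));
                    if (Character.isValidCodePoint(codePoint)) {
                        replacement = new String(Character.toChars(codePoint));
                    }
                } catch (NumberFormatException e) {
                    // Digit run too large for an int: keep the original text.
                }
            }
            out.append(replacement);
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String attack = "&#0000060script&#62;";
        System.out.println(unescapeDecimal(attack, true));  // <script>
        System.out.println(unescapeDecimal(attack, false)); // &#0000060script>
    }
}
```

      Note that in lenient mode the digit run is consumed greedily, so a reference immediately followed by further digits (for example &#601) is read as one larger number. That ambiguity is one reason to keep the strict behaviour as the default.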

      Attachments

        1. commons-lang3-LANG-757.patch
          11 kB
          Duncan Jones

        Activity

          People

            Assignee: Unassigned
            Reporter: Steve Hale (slslhale)
            Votes: 0
            Watchers: 2
