  1. Xerces2-J
  2. XERCESJ-1389

RegEx matching: ranges not computed correctly in "ignore case" mode

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.1
    • Fix Version/s: 2.10.0
    • Component/s: Other
    • Labels: None

    Description

      There are a couple of problems in interpreting character ranges in "case-insensitive" mode.

      When doing range subtraction (or negation), all the case variants of the subtracted characters need to be considered. For example, in case-insensitive mode "[^Q]" means "any character except 'q' or 'Q'", but the regex engine matches both 'q' and 'Q' against this expression.
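
The expected semantics for a negated class can be sketched as follows. This is only an illustration of the behavior described above, not the Xerces implementation: the class and method names are hypothetical, and case variants are approximated with the simple `Character.toUpperCase` / `Character.toLowerCase` mappings.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the expected result of a negated class such as [^Q]
// in ignore-case mode. Not taken from the Xerces sources.
public class NegatedClassIgnoreCaseSketch {

    // Collect c together with its simple upper- and lower-case variants.
    static Set<Character> caseVariants(char c) {
        Set<Character> variants = new HashSet<>();
        variants.add(c);
        variants.add(Character.toUpperCase(c));
        variants.add(Character.toLowerCase(c));
        return variants;
    }

    // Expand the excluded characters to all case variants first, then negate.
    static boolean matchesNegatedIgnoreCase(String excluded, char candidate) {
        Set<Character> expanded = new HashSet<>();
        for (char c : excluded.toCharArray()) {
            expanded.addAll(caseVariants(c));
        }
        return !expanded.contains(candidate);
    }

    public static void main(String[] args) {
        System.out.println(matchesNegatedIgnoreCase("Q", 'q')); // false
        System.out.println(matchesNegatedIgnoreCase("Q", 'Q')); // false
        System.out.println(matchesNegatedIgnoreCase("Q", 'a')); // true
    }
}
```

The point of the ordering is that case expansion happens before the complement is taken. Complementing first and expanding afterwards would put 'q' back into the set, which is the behavior reported here.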

      Also, in case-insensitive mode, character classes must keep their meaning. For example, "\p{Lu}" should not match a lowercase letter, but the regex engine matches 'q'.
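
The expectation for category escapes can be illustrated in a similar way. The sketch below is an assumption-laden illustration rather than Xerces code: the class and method names are hypothetical, and `Character.getType` stands in for the engine's own category tables. It tests the candidate's own Unicode general category and applies no case folding.

```java
// Hypothetical sketch: \p{Lu} keeps its meaning in ignore-case mode,
// so only the candidate's own general category is consulted.
public class CategoryIgnoreCaseSketch {

    // True only when the character itself is an uppercase letter (Lu).
    static boolean matchesUppercaseLetter(char candidate) {
        return Character.getType(candidate) == Character.UPPERCASE_LETTER;
    }

    public static void main(String[] args) {
        System.out.println(matchesUppercaseLetter('Q')); // true
        System.out.println(matchesUppercaseLetter('q')); // false
    }
}
```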

          People

            knoaman@ca.ibm.com Khaled Noaman
            radup Radu Preotiuc