[XERCESJ-1389] RegEx matching: ranges not computed correctly in "ignore case" mode - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.9.1
Fix Version/s: 2.10.0
Component/s: Other
Labels: None

Description

There are a couple of problems in interpreting character ranges in "case-insensitive" mode.

When doing range subtraction (or negation), all the case-variants of the subtracted characters need to be considered. For example, "[^Q]" means, in case-insensitive mode, "any character except 'q' or 'Q'" but the regex engine matches both 'q' and 'Q' in this example.
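
The expected behavior for negation can be illustrated with a minimal sketch. This is not the Xerces implementation: `NegatedClassSketch` and `matchesNegated` are hypothetical names introduced only for illustration, and simple case mapping via `java.lang.Character` is assumed to be an adequate stand-in for the engine's notion of case variants.

```java
public class NegatedClassSketch {
    /**
     * Returns true if c is matched by the negated class [^excluded].
     * With ignoreCase set, every case variant of each excluded character
     * is treated as excluded too (hypothetical helper, not Xerces code).
     */
    static boolean matchesNegated(String excluded, char c, boolean ignoreCase) {
        for (int i = 0; i < excluded.length(); i++) {
            char e = excluded.charAt(i);
            if (e == c) {
                return false; // exact match is always excluded
            }
            if (ignoreCase
                    && (Character.toLowerCase(e) == Character.toLowerCase(c)
                        || Character.toUpperCase(e) == Character.toUpperCase(c))) {
                return false; // a case variant of an excluded character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesNegated("Q", 'q', true));  // false: 'q' is a case variant of 'Q'
        System.out.println(matchesNegated("Q", 'Q', true));  // false
        System.out.println(matchesNegated("Q", 'q', false)); // true: case-sensitive mode
        System.out.println(matchesNegated("Q", 'x', true));  // true
    }
}
```

The point of the sketch is the order of operations: the excluded set is expanded with its case variants before the complement is taken, rather than complementing first and adding case variants afterwards.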

Also, in case-insensitive mode, all character classes must stay the same, so for example "\p{Lu}" would not match a lowercase letter, but the regex engine matches 'q'.
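
The expectation for the uppercase-letter category can be sketched the same way. Again this is only an illustration under stated assumptions: `UppercaseCategorySketch` and `matchesLu` are hypothetical names, and `Character.getType` is used as a stand-in for the engine's Unicode category tables, which may be based on a different Unicode version.

```java
public class UppercaseCategorySketch {
    /**
     * Returns true if c belongs to the Unicode category Lu.
     * The ignoreCase flag is deliberately unused: the category test
     * should give the same answer in both modes (hypothetical helper).
     */
    static boolean matchesLu(char c, boolean ignoreCase) {
        return Character.getType(c) == Character.UPPERCASE_LETTER;
    }

    public static void main(String[] args) {
        System.out.println(matchesLu('Q', true));  // true
        System.out.println(matchesLu('q', true));  // false: lowercase stays excluded
        System.out.println(matchesLu('q', false)); // false
    }
}
```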

People

Assignee: Khaled Noaman

Reporter: Radu Preotiuc

Votes: 1

Watchers: 1

Dates

Created: 31/Jul/09 03:48

Updated: 22/Nov/09 18:49

Resolved: 02/Nov/09 21:50