Bug 3879 - Expressions using {0,n} match 0 to n+1 times instead of 0 to n times
Status: CLOSED DUPLICATE of bug 19329
Alias: None
Product: Regexp
Classification: Unclassified
Component: Other
Version: unspecified
Hardware: PC All
Importance: P3 normal
Target Milestone: ---
Assignee: Jakarta Notifications Mailing List
Reported: 2001-09-28 15:50 UTC by Chris Scheuble
Modified: 2005-03-20 17:06 UTC
Description Chris Scheuble 2001-09-28 15:50:36 UTC
Expressions using {0,n} match 0 to n+1 times instead of 0 to n times.

Expression "[a-z]{0,3}" against "123abcdefg123" matches "abcd" not "abc".
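
For reference, the intended meaning of the upper bound can be checked with a whole-string match. This is a minimal sketch that uses java.util.regex as a stand-in engine (an assumption; it is not the Jakarta Regexp code this report is about), and the class name BoundedQuantifierCheck is made up for illustration.

```java
import java.util.regex.Pattern;

// Reference check with java.util.regex, used here only as a stand-in engine.
class BoundedQuantifierCheck {
    // True when the whole input consists of 0 to 3 lowercase letters.
    static boolean accepts(String input) {
        return Pattern.matches("[a-z]{0,3}", input);
    }

    public static void main(String[] args) {
        // Strings of length 0 to 3 should be accepted, length 4 rejected.
        for (String s : new String[] {"", "a", "abc", "abcd"}) {
            System.out.println("'" + s + "' -> " + accepts(s));
        }
    }
}
```

An engine that accepts "abcd" here treats {0,3} as {0,4}, which is the behavior described above.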

I fixed the problem in the compiler by changing the method void bracket()...

    /**
     * Match a bracket {m,n} expression and put the results in the bracket member variables.
     * @exception RESyntaxException Thrown if the regular expression has invalid syntax.
     */
    void bracket() throws RESyntaxException
    {
        // Current character must be a '{'
        if (idx >= len || pattern.charAt(idx++) != '{')
        {
            internalError();
        }

        // Next char must be a digit
        if (idx >= len || !Character.isDigit(pattern.charAt(idx)))
        {
            syntaxError("Expected digit");
        }

        // Get min ('m' of {m,n}) number
        StringBuffer number = new StringBuffer();
        while (idx < len && Character.isDigit(pattern.charAt(idx)))
        {
            number.append(pattern.charAt(idx++));
        }
        try
        {
            bracketMin[brackets] = Integer.parseInt(number.toString());
        }
        catch (NumberFormatException e)
        {
            syntaxError("Expected valid number");
        }

        // If out of input, fail
        if (idx >= len)
        {
            syntaxError("Expected comma or right bracket");
        }

        // If end of expr, optional limit is 0
        if (pattern.charAt(idx) == '}')
        {
            if (bracketMin[brackets] < 1)
            {
                syntaxError("Bad zero range");
            }

            idx++;
            bracketOpt[brackets] = 0;
            return;
        }

        // Must have at least {m,} and maybe {m,n}.
        if (idx >= len || pattern.charAt(idx++) != ',')
        {
            syntaxError("Expected comma");
        }

        // If out of input, fail
        if (idx >= len)
        {
            syntaxError("Expected comma or right bracket");
        }

        // If {m,} max is unlimited
        if (pattern.charAt(idx) == '}')
        {
            idx++;
            bracketOpt[brackets] = bracketUnbounded;
            return;
        }

        // Next char must be a digit
        if (idx >= len || !Character.isDigit(pattern.charAt(idx)))
        {
            syntaxError("Expected digit");
        }

        // Get max number
        number.setLength(0);
        while (idx < len && Character.isDigit(pattern.charAt(idx)))
        {
            number.append(pattern.charAt(idx++));
        }
        try
        {
            bracketOpt[brackets] = Integer.parseInt(number.toString()) - bracketMin[brackets];
            // Added by this fix: allow one fewer optional repetition when the minimum is 0
            if (bracketMin[brackets] < 1)
            {
                bracketOpt[brackets]--;
            }
        }
        catch (NumberFormatException e)
        {
            syntaxError("Expected valid number");
        }

        // Optional repetitions must be >= 0 (changed by this fix; the original check was "<= 0")
        if (bracketOpt[brackets] < 0)
        {
            syntaxError("Bad range");
        }

        // Must have close brace
        if (idx >= len || pattern.charAt(idx++) != '}')
        {
            syntaxError("Missing close brace");
        }
    }
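
To make the effect of the change easier to see, the arithmetic on the minimum and optional counts can be isolated from the parsing. The following is only a sketch: the class BracketArithmetic and its two methods are hypothetical and mirror just the two assignments in bracket(), not the rest of the compiler or the matcher.

```java
// Hypothetical helper that mirrors only the arithmetic of bracket() for {m,n}.
class BracketArithmetic {
    // Returns {min, opt} as computed before the change: opt = n - m.
    static int[] original(int m, int n) {
        return new int[] {m, n - m};
    }

    // Returns {min, opt} with the proposed change: one fewer optional
    // repetition when the minimum is below 1.
    static int[] patched(int m, int n) {
        int opt = n - m;
        if (m < 1) {
            opt--;
        }
        return new int[] {m, opt};
    }

    public static void main(String[] args) {
        int[][] ranges = {{0, 3}, {1, 3}, {0, 1}};
        for (int[] r : ranges) {
            int[] before = original(r[0], r[1]);
            int[] after = patched(r[0], r[1]);
            System.out.println("{" + r[0] + "," + r[1] + "}: opt before=" + before[1]
                    + ", opt after=" + after[1]);
        }
    }
}
```

For {0,3} the optional count drops from 3 to 2, while {1,3} is unchanged. For {0,1} the patched count is 0, which appears to be why the range check was relaxed from "<= 0" to "< 0".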
Comment 1 Jon Stevens 2002-12-13 18:42:32 UTC
patches applied and tested
Comment 2 Jon Stevens 2002-12-13 18:42:45 UTC
closed
Comment 3 Vadim Gritsenko 2003-04-24 19:29:30 UTC
I believe this is not the correct solution; the output of test case #174 must be '', and Perl
agrees with me:

#!/usr/bin/perl
print "Matching '123abcdefg123' with regexp '([a-z]{0,3})':\n";
if ("123abcdefg123" =~ /([a-z]{0,3})/) {
    print "Matches. Result: '$1'\n";
}

Output:
Matching '123abcdefg123' with regexp '([a-z]{0,3})':
Matches. Result: ''
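
The same check can be sketched in Java with java.util.regex, which also reports the leftmost match first. This is an assumption-laden stand-in: it is not Jakarta Regexp, and the class name LeftmostMatchCheck and the method firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in check with java.util.regex; not the Jakarta Regexp engine.
class LeftmostMatchCheck {
    // Returns the text of the first match found in the input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "123abcdefg123";
        // {0,3} may match zero characters, so the search succeeds at index 0.
        System.out.println("'" + firstMatch("[a-z]{0,3}", input) + "'");
        // {1,3} needs at least one letter, so the first match starts at 'a'.
        System.out.println("'" + firstMatch("[a-z]{1,3}", input) + "'");
    }
}
```

With a minimum of 0 the first match is the empty string at the start of the input; "abc" is only the first match once the minimum is raised to 1.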

Patch will follow...
Comment 4 Vadim Gritsenko 2003-04-25 17:55:52 UTC

*** This bug has been marked as a duplicate of 19329 ***
Comment 5 Vadim Gritsenko 2003-05-02 01:09:33 UTC
Fixed by Bug #19329