  1. Groovy
  2. GROOVY-8625

Groovy Lexer does not accept UTF-8 characters like ° or § ... and a lot more

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.5.0
    • Fix Version/s: None
    • Component/s: Compiler

    Description

      The grammar uses a specification for LETTERs similar to that of the old Java grammar. By design, most UTF-8 characters should be possible to use in names, to enable localization in languages that use non-Latin characters. This is especially important for DSLs.
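
      As a reference point, the following minimal sketch shows how the JDK itself classifies a few of the characters in question. It does not reproduce the Groovy lexer rules; it only calls the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart predicates. The class name IdentifierCharCheck and its two helper methods are hypothetical and exist only for illustration.

```java
// Hypothetical helper, not part of Groovy: reports how the JDK classifies a code point.
final class IdentifierCharCheck {

    /** True if the JDK allows the code point as the first character of a Java identifier. */
    static boolean canStart(int codePoint) {
        return Character.isJavaIdentifierStart(codePoint);
    }

    /** True if the JDK allows the code point after the first character of a Java identifier. */
    static boolean canContinue(int codePoint) {
        return Character.isJavaIdentifierPart(codePoint);
    }

    public static void main(String[] args) {
        // a, a-umlaut, dollar, underscore, degree sign, section sign, hash.
        // Unicode escapes keep the source independent of the file encoding.
        String samples = "a\u00E4$_\u00B0\u00A7#";
        samples.codePoints().forEach(cp ->
            System.out.printf("U+%04X start=%b part=%b%n", cp, canStart(cp), canContinue(cp)));
    }
}
```

      Under the JDK rules, letters such as ä are accepted, while °, § and # are rejected both as start and as part characters, which matches the behaviour reported here.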

      AST transformations take place after the Lexer. With the Lexer accepting these characters, AST transformations are able to handle more things, such as creating custom operators.

      This is a problem only for ANTLR 2.

      ANTLR 4 is only missing the '#'-sign.

      This may introduce a breaking change, because GStrings like "$first#$second" worked in the past and will no longer work. Before this change, "$first#" is interpreted as the value of the variable first followed by a '#' sign. After it, "$first#" is interpreted as the value of the variable first#.

      This, of course, is a problem for all newly added letters.
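
      The GString ambiguity can be illustrated with a small scanner sketch. This is not the Groovy lexer: the class PlaceholderScanSketch, its methods, and the approximation of name characters by Character.isLetterOrDigit plus '_' are assumptions made for illustration, and dotted paths and ${...} expressions are ignored. It only shows how adding a character to the set of letters changes which placeholder names are found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the Groovy lexer: finds the names that follow '$' in a template.
final class PlaceholderScanSketch {

    /** Collects the longest run of name characters after each '$'. */
    static List<String> placeholderNames(String template, Set<Character> extraLetters) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < template.length()) {
            if (template.charAt(i) != '$') {
                i++;
                continue;
            }
            int start = i + 1;
            int end = start;
            while (end < template.length() && isNameChar(template.charAt(end), extraLetters)) {
                end++;
            }
            if (end > start) {
                names.add(template.substring(start, end));
            }
            i = end; // end >= start > i, so the loop always advances
        }
        return names;
    }

    /** Assumed approximation of a name character, plus any newly admitted letters. */
    static boolean isNameChar(char c, Set<Character> extraLetters) {
        return Character.isLetterOrDigit(c) || c == '_' || extraLetters.contains(c);
    }

    public static void main(String[] args) {
        String template = "$first#$second";
        // '#' is not a letter: the names are first and second.
        System.out.println(placeholderNames(template, Set.of()));
        // '#' is treated as a letter: the first name becomes first#.
        System.out.println(placeholderNames(template, Set.of('#')));
    }
}
```

      With an empty set of extra letters the sketch finds first and second. Once '#' is admitted as a letter it finds first# and second, so a script that only defines first would no longer resolve the first placeholder.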

          People

            Assignee: Unassigned
            Reporter: Alexander Klein (saschaklein)
            Votes: 0
            Watchers: 4
