Daffodil / DAFFODIL-2474

How to deal with control characters and newlines in pattern facets

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.1.0
    • Component/s: Clean Ups, QA
    • Labels: None

    Description

      Major because this issue was raised by a user, and it took me hours to figure it out!

      We need to clean up some code and add tests that show how to do seemingly obvious things with XSD pattern facets. These are in fact quite tricky to do, and we have gotten them wrong before in real schemas.

      E.g., use a pattern facet to restrict the characters of a string to only those with code points of 0x7F or below.

      This turns out to be quite tricky because of XML-illegal characters combined with XML attribute normalization.

      The correct pattern facet definition is this:

      <xs:pattern value="[&#xE000;-&#xE008;\t\n&#xE00B;&#xE00C;\r&#xE00E;-&#xE01F;&#x20;-&#x7F]*"/>

      (The whole value has to be on one line.)
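
      For context, here is a minimal sketch of where this facet might sit in a schema. The type name asciiString is hypothetical, and the comments reflect an assumption the issue implies but does not spell out: that the &#xE0nn; entries stand in for control characters that XML 1.0 cannot carry directly.

```xml
<!-- Hypothetical simple type; only the xs:pattern line comes from this issue. -->
<xs:simpleType name="asciiString" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:string">
    <!-- Assumed reading of the character class:
         E000 to E008    stand-ins for 0x00 to 0x08
         \t \n \r        tab, line feed, carriage return written as regex escapes
         E00B, E00C      stand-ins for 0x0B and 0x0C
         E00E to E01F    stand-ins for 0x0E to 0x1F
         0x20 to 0x7F    the remaining characters, which XML can carry as-is
         Keep the value attribute on a single line. -->
    <xs:pattern value="[&#xE000;-&#xE008;\t\n&#xE00B;&#xE00C;\r&#xE00E;-&#xE01F;&#x20;-&#x7F]*"/>
  </xs:restriction>
</xs:simpleType>
```

      Writing tab, line feed, and carriage return as regex escapes rather than character references keeps them out of reach of attribute normalization, which is the point of the caveats in this issue.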

      Various other combinations do NOT work. E.g., you can't replace the \n with &#xA; because XML attribute normalization will take that out.

      You can't use Daffodil's &#xE00A; either: when Xerces-based full validation runs, the data will contain a 0x0A, not a 0xE00A, so Xerces will fail validation. You have to use \n for this.

          People

            Assignee: Mike Beckerle (mbeckerle)
            Reporter: Mike Beckerle (mbeckerle)