Santuario / SANTUARIO-123

Canonicalization failed with some latin2 characters

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Java
    • Security Level: Public (Public issues, viewable by everyone)
    • Labels:
      None
    • Environment:
      Operating System: Windows XP
      Platform: PC

      Description

      Canonicalization fails for some Latin-2 characters 'čćžšđČĆŽŠĐ' (letters with
      caron, ...).

      Release 1.3.0 does not have this problem.

      Code that demonstrates the bug:

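      import java.io.ByteArrayInputStream;
      import java.util.Arrays;
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import org.w3c.dom.Document;
      import org.w3c.dom.Element;
      import org.apache.xml.security.c14n.implementations.Canonicalizer20010315WithComments;

      // parse document
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setNamespaceAware(true);
      DocumentBuilder db = dbf.newDocumentBuilder();
      // text contains some Latin-2 characters 'čćžšđČĆŽŠĐ'
      String text = "<text>\u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110</text>";
      Document doc = db.parse(new ByteArrayInputStream(text.getBytes("UTF-8")));
      Element e_latin2 = doc.getDocumentElement();

      // canonicalize the root element
      Canonicalizer20010315WithComments c14 = new Canonicalizer20010315WithComments();
      byte[] canon_bin = c14.engineCanonicalizeSubTree(e_latin2);

      // the canonical form should be byte-for-byte identical to the UTF-8 input
      if (Arrays.equals(text.getBytes("UTF-8"), canon_bin)) {
          System.out.println("OK");
      } else {
          System.out.println("Failed");
      }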
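
      The check above only prints "OK" or "Failed". To see where the output
      diverges, a small helper along the following lines may be useful. It is a
      sketch that uses only the JDK: the class name C14nDiff and its methods are
      hypothetical, and the "actual" bytes in main are a made-up stand-in, not
      the output observed in this issue. In a real run, canon_bin from the
      snippet above would take their place.

```java
import java.nio.charset.StandardCharsets;

public class C14nDiff {

    /** Returns the index of the first differing byte, or -1 if both arrays are equal. */
    public static int firstMismatch(byte[] expected, byte[] actual) {
        int n = Math.min(expected.length, actual.length);
        for (int i = 0; i < n; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        // Equal up to the shorter length: either identical or one is a prefix of the other.
        return expected.length == actual.length ? -1 : n;
    }

    /** Formats bytes as space-separated lowercase hex pairs. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Expected canonical form: the input document encoded as UTF-8.
        byte[] expected = "<text>\u010D\u0107</text>".getBytes(StandardCharsets.UTF_8);
        // Hypothetical stand-in for a broken canonicalizer result (assumption for
        // illustration only); replace with canon_bin in a real test.
        byte[] actual = "<text>??</text>".getBytes(StandardCharsets.UTF_8);

        int idx = firstMismatch(expected, actual);
        if (idx < 0) {
            System.out.println("OK");
        } else {
            System.out.println("Failed at byte " + idx);
            System.out.println("expected: " + hex(expected));
            System.out.println("actual:   " + hex(actual));
        }
    }
}
```

      In UTF-8 each of the characters in the test string takes two bytes (for
      example, U+010D is encoded as c4 8d), so a hex dump makes it easy to tell
      whether the canonicalizer dropped, replaced, or re-encoded them.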

            People

            • Assignee: Unassigned
            • Reporter: matej
            • Votes: 0
            • Watchers: 0
