  1. XalanJ2
  2. XALANJ-1034

normalize-space() deletes an inner space


Details

    • Type: Bug
    • Status: Resolved
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: None
    • Component/s: XPath
    • Labels: None
    • Environment: Operating System: Solaris
      Platform: Sun
    • Bugzilla Id: 9441

    Description

      I use Xalan-J 2.3.1 with the CLASSPATH set to include only the xalan.jar and
      xercesImpl.jar binaries downloaded from
      http://xml.apache.org/dist/xalan-j/xalan-j_2_3_1.zip. The ordering of the
      CLASSPATH components does not matter.

      My Java is Sun's; java -version reports:

      java version "1.3.0_02"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0_02)
      Java HotSpot(TM) Client VM (build 1.3.0_02, mixed mode)

      The result with Sun's Java 1.2.2 is the same (it only needs xml-apis.jar added
      to the CLASSPATH).

      My command line is

      java org.apache.xalan.xslt.Process -IN normalize-space.xml -OUT normalize-space.html
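
      The same transformation can also be driven from Java through the standard
      javax.xml.transform API, which may make the case easier to rerun. The
      following is only a minimal sketch: the class name NormalizeSpaceRepro and
      the helper transform() are made up for illustration, the documents are
      inlined as strings instead of being read from the files, and the stylesheet
      is passed explicitly rather than through the xml-stylesheet processing
      instruction. It exercises whichever XSLT processor the JAXP lookup finds,
      so it would only show the defect if Xalan-J 2.3.1 is the active processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of Xalan.
final class NormalizeSpaceRepro {
    // Input document from the report (XML declaration and PI omitted).
    static final String XML =
        "<document>\n"
        + "<p>a b c d e f g h i j k l m n o p q r s t u v w x y z</p>\n"
        + "</document>\n";

    // Stylesheet from the report.
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">\n"
        + "<xsl:template match=\"p\">\n"
        + "<html><body><xsl:value-of select=\"normalize-space()\"/></body></html>\n"
        + "</xsl:template>\n"
        + "</xsl:stylesheet>\n";

    // Runs the stylesheet over the document and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String result = transform(XML, XSL);
        System.out.println(result);
        // The defect shows up as "pq" where "p q" is expected.
        System.out.println(result.contains("o p q r")
            ? "p and q are separated"
            : "p and q are glued together");
    }
}
```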

      I try to transform the 5-line file normalize-space.xml

      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet href="normalize-space.xsl" type="text/xsl"?>
      <document>
      <p>a b c d e f g h i j k l m n o p q r s t u v w x y z</p>
      </document>

      with the 6-line stylesheet normalize-space.xsl

      <?xml version="1.0" encoding="UTF-8"?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="p">
      <html><body><xsl:value-of select="normalize-space()"/></body></html>
      </xsl:template>
      </xsl:stylesheet>

      and get a 3-line file normalize-space.html

      <?xml version="1.0" encoding="UTF-8"?>

      <html><body>a b c d e f g h i j k l m n o pq r s t u v w x y z</body></html>

      As you see, the characters p and q are glued together.
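
      For comparison, XPath 1.0 defines normalize-space() as stripping leading
      and trailing whitespace and replacing each run of whitespace characters
      with a single space, so a single space between p and q must survive. A
      minimal reference sketch of that rule in Java follows; the class
      XPathNormalizeSpace and its methods are hypothetical names for
      illustration and are not Xalan's implementation.

```java
// Hypothetical reference implementation of the XPath 1.0 normalize-space() rule.
final class XPathNormalizeSpace {
    // XML whitespace: space, tab, line feed, carriage return.
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    static String normalizeSpace(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isXmlWhitespace(c)) {
                // Remember a separator only once some text has been emitted,
                // which drops leading whitespace.
                pendingSpace = out.length() > 0;
            } else {
                // Emit exactly one space for the whole preceding run.
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                out.append(c);
            }
        }
        // A separator still pending here was trailing whitespace and is dropped.
        return out.toString();
    }
}
```

      Applied to the text of the "p" element above, this rule returns the input
      unchanged, because it already contains only single inner spaces.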

      I have tried a set of similar transformations and found that the behaviour
      depends on the content of the XML file. For example, if the wrapping element
      "document" is removed, I get the expected text node

      <html>
      <body>a b c d e f g h i j k l m n o p q r s t u v w x y z</body>
      </html>

      Also, when the line with the element "p" appears three times instead of once,
      only the second of the <html>...</html> output lines shows the defect.

      When I ran the transformation on the original file on an NT station with
      MSXML 3.0, it delivered the expected result, i.e. p and q separated.

          People

            Assignee: Unassigned
            Reporter: Bernhard Bodenstorfer (bbodi@web.de)