Maven / MNG-2932

Encoding chaos


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4, 2.0.5, 2.0.6
    • Fix Version/s: 2.0.8
    • Component/s: POM::Encoding
    • Labels: None
    • Environment: windows, linux

    Description

      I tried Maven on a project whose Javadoc, xdocs, and POM comments are written in a native language with many non-ASCII characters.
      This seems to reveal that Maven does not handle different encodings cleanly.

      For instance, the xdocs are XML, and XML allows me to use different encodings if they are properly declared in the XML header. However, it only works if I encode the XML as UTF-8. If I use ISO-8859-1, the produced HTML contains UTF-8 characters from the localized site messages (resource bundles of Maven plugins), and Maven dumps the ISO-8859-1 encoded characters into that, so a single HTML page ends up with mixed encodings.
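To illustrate why such a mixed page cannot be displayed correctly, here is a minimal Java sketch. It is not code from Maven, Doxia, or the site plugin; the class `MixedEncodingDemo` and its methods are hypothetical names made up for this example. It concatenates UTF-8 bytes (standing in for resource bundle text) with ISO-8859-1 bytes (standing in for xdoc content) and checks whether the result is still well-formed UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any Maven component.
class MixedEncodingDemo {

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Simulates the reported symptom: UTF-8 bundle text followed by raw ISO-8859-1 page bytes. */
    static byte[] mixedPage(String bundleText, String pageText) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bundleBytes = bundleText.getBytes(StandardCharsets.UTF_8);
        byte[] pageBytes = pageText.getBytes(StandardCharsets.ISO_8859_1);
        out.write(bundleBytes, 0, bundleBytes.length);
        out.write(pageBytes, 0, pageBytes.length);
        return out.toByteArray();
    }

    /** The consistent alternative: join the text as characters, then encode once. */
    static byte[] consistentPage(String bundleText, String pageText) {
        return (bundleText + pageText).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bundle = "\u00dcbersicht: "; // German word with a U-umlaut
        String page = "Gr\u00f6\u00dfe";    // German word with o-umlaut and sharp s
        System.out.println(isValidUtf8(mixedPage(bundle, page)));      // prints false
        System.out.println(isValidUtf8(consistentPage(bundle, page))); // prints true
    }
}
```

The point of the sketch is that text from different sources has to be combined as characters and encoded exactly once on output; copying already-encoded bytes from differently encoded inputs into one stream produces a page that no single charset can decode.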

      Additionally, the Java files also cause trouble when I use an encoding other than UTF-8. I set the "encoding" parameter of the javadoc plugin to ISO-8859-1 and used Java files in that encoding. The resulting Javadoc HTML was written in ISO-8859-1, but the browser displayed it as UTF-8, and I had to switch explicitly to ISO-8859-1 in Firefox to have the special characters displayed properly.
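The browser symptom described above can be reproduced in a few lines of Java. This is only a sketch under assumptions: `DeclaredCharsetDemo` and `renderHtml` are hypothetical names, and the code does not reflect how the javadoc plugin actually writes its output. It shows that ISO-8859-1 bytes decoded as UTF-8 yield replacement characters, and that declaring the charset used for encoding inside the page keeps the two in sync.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the javadoc plugin.
class DeclaredCharsetDemo {

    /** Renders a tiny HTML page whose meta tag names the same charset used to encode the bytes. */
    static byte[] renderHtml(String body, Charset outputCharset) {
        String html = "<html><head>"
                + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
                + outputCharset.name() + "\">"
                + "</head><body>" + body + "</body></html>";
        return html.getBytes(outputCharset);
    }

    public static void main(String[] args) {
        byte[] page = renderHtml("Gr\u00f6\u00dfe", StandardCharsets.ISO_8859_1);

        // A reader that wrongly assumes UTF-8 sees U+FFFD replacement characters.
        String guessedWrong = new String(page, StandardCharsets.UTF_8);
        // A reader that honors the declared charset recovers the original text.
        String asDeclared = new String(page, StandardCharsets.ISO_8859_1);

        System.out.println(guessedWrong.contains("\uFFFD"));        // prints true
        System.out.println(asDeclared.contains("Gr\u00f6\u00dfe")); // prints true
    }
}
```

Deriving the declared charset from the same `Charset` object that encodes the bytes is a deliberate choice here: it makes it impossible for the declaration and the actual output encoding to drift apart.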

      Furthermore, I run into trouble when I use special characters in pom.xml files that go onto the generated website. In the end I could NOT find a way to get a site without problems, even when I encode everything as UTF-8.

      Maybe too few of the developers involved come from non-English-speaking countries, where people are used to thinking beyond US-ASCII.

      Unfortunately I cannot tell where the problems come from: it may be XPP, Doxia, the site plugin, individual reports, or all of them together.
      You need to properly distinguish between input and output encoding, be extremely careful with things like byte[],
      and never parse XML from strings.
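The advice about byte[] and strings can be sketched with the standard JAXP API. This is an illustration only, not a fix taken from Maven; `XmlFromBytesDemo` and its methods are hypothetical names. When the parser receives the raw bytes, it reads the encoding declaration itself. When the bytes are first turned into a String with a guessed charset, the declaration is no longer consulted and the damage is already done.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of any Maven component.
class XmlFromBytesDemo {

    /** An ISO-8859-1 encoded document that declares its encoding in the XML header. */
    static byte[] sampleDocument() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<title>Gr\u00f6\u00dfe</title>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    private static DocumentBuilder newBuilder() throws ParserConfigurationException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    /** Preferred: hand the raw bytes to the parser so it honors the declared encoding. */
    static String rootTextFromBytes(byte[] xmlBytes) {
        try {
            return newBuilder().parse(new ByteArrayInputStream(xmlBytes))
                    .getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    /** Fragile: the caller has already decoded the bytes, possibly with the wrong charset. */
    static String rootTextFromString(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = sampleDocument();
        // Bytes in: the parser decodes according to the XML declaration.
        System.out.println(rootTextFromBytes(bytes).equals("Gr\u00f6\u00dfe")); // prints true
        // String in: the bytes were decoded as UTF-8 by mistake before parsing.
        String misdecoded = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(rootTextFromString(misdecoded).contains("\uFFFD"));  // prints true
    }
}
```

Keeping input as bytes until the parser has seen the declaration, and choosing the output encoding separately when serializing, is one way to keep input and output encoding cleanly apart.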

      Can you reproduce the problem, or do you need dummy projects as test cases?

          People

            Assignee: Herve Boutemy (hboutemy)
            Reporter: Jörg Hohwiller (joerg@j-hohwiller.de)
            Votes: 3
            Watchers: 2
