Maven / MNG-6205

Non-ascii chars in name element are displayed as question marks in Win CLI output (regression)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.2
    • Component/s: Logging
    • Labels:
      None
    • Environment:
      Windows 7, IBM JDK 7.1, Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426)

      Description

      If non-ASCII characters (such as the Swedish chars å, ä, ö) are used in the pom name element, they are displayed as question marks ('?') in the CLI output when building on Windows. This can be seen e.g. in the Reactor Summary at the end.
      See attached pom for an example.

      This was not an issue in Maven 3.3.9.
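For readers unfamiliar with where such question marks usually come from, here is a minimal Java sketch. It only illustrates the general mechanism (a character that the output charset cannot represent is replaced by `?` during encoding); it is not taken from Maven or JAnsi, and the ticket does not confirm that this exact path is the root cause. The class name and the sample project name are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {

    // Encodes the text with the given charset and decodes it again with the same charset.
    // String.getBytes(Charset) replaces unmappable characters with the charset's
    // replacement byte, which is '?' for US-ASCII.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Blåbär", a hypothetical project name, written with escapes so the
        // source file encoding does not matter.
        String name = "Bl\u00e5b\u00e4r";

        // US-ASCII cannot represent å or ä, so both become '?'.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII)); // Bl?b?r

        // ISO-8859-1 can represent both, so the text survives the round trip.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1).equals(name)); // true
    }
}
```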

      1. pom.xml
        0.6 kB
        Anders Hammar
      2. jansi-1.16-SNAPSHOT.jar
        150 kB
        Hervé Boutemy

          Activity

          hboutemy Hervé Boutemy added a comment -

          uh, could this be related to jansi, which tweaks Windows console output to add color support and may cause this issue as a side effect?

          Can you test after removing jansi.jar in /lib, please, to confirm this guess?

          afloom Anders Hammar added a comment -

          Yes, it's related to jansi. The chars are displayed correctly if I remove jansi.jar.

          hboutemy Hervé Boutemy added a comment -

          ok, thank you: I'll continue with the JAnsi team to fix the root cause (and possibly find a workaround)

          hboutemy Hervé Boutemy added a comment -

          demo program pushed: https://github.com/fusesource/jansi/pull/79

          hboutemy Hervé Boutemy added a comment -

          Anders Hammar: Guillaume Nodet just fixed the issue in JAnsi 1.16-SNAPSHOT: it now works on my French setup (with CP-1252 platform encoding)
          Can you check that it's ok for you also, please?

          afloom Anders Hammar added a comment -

          Hervé Boutemy: Could you please provide the jansi snapshot jar? I couldn't find it anywhere. Please attach it to this ticket.

          hboutemy Hervé Boutemy added a comment -

          nothing complex to build, but if it can help, here it is

          afloom Anders Hammar added a comment -

          Ok, now it works with Oracle JDK (tested with 1.8.0_112). However, I have the same character issue with IBM JDK (JDK 7.1 for WAS 8.5.5).

          hboutemy Hervé Boutemy added a comment -

          Great. Yes, I was using Oracle JDK, like the vast majority of people IMHO, so I think we're safe in the vast majority of cases: that's already a good improvement.

          Now the not so great part, the IBM JDK case: you're building with this JDK? I don't know how many people do so... But ok, that's a valid use case.
          How can I get this JDK, please, to test for myself and report back to JAnsi?

          afloom Anders Hammar added a comment -

          IBM JDK is not uncommon in larger companies.
          IBM JDK 1.8 is available here: https://www.ibm.com/developerworks/java/jdk/
          I tested and the char problem exists with that JDK as well. (Tested on attached pom.xml.)

          hboutemy Hervé Boutemy added a comment -

          I made tests, here are the results:

          • with Oracle JDK, file.encoding is Windows-1252 and System.out encoding is CP850 (see chcp result): this is the fix/hack Guillaume coded
          • with IBM JDK, file.encoding is Windows-1252 and System.out encoding is also Windows-1252: then the previous fix for Oracle JDK breaks IBM JDK

          Anders, can you confirm that Jansi 1.15 gives a good result with IBM JDK, please?

          If that's the case, we'll have to update the fix/hack so that it is executed only with Oracle JDK, and not with IBM JDK.
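The two encodings compared above can be inspected with a small probe like the following. This is a hedged sketch, not the test program used in this ticket: the class and method names are hypothetical, and `sun.stdout.encoding` is a JDK-internal, vendor-specific property that may be absent (it is reported as `null` in that case). The `misdecode` helper only simulates what happens when text is written in one charset and shown by a console that assumes another.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class ConsoleEncodingProbe {

    // Collects the encoding-related settings visible to the running JVM.
    public static Map<String, String> probe() {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        result.put("defaultCharset", Charset.defaultCharset().name());
        // Assumption: JDK-internal property, not guaranteed to exist on every vendor or version.
        result.put("sun.stdout.encoding", String.valueOf(System.getProperty("sun.stdout.encoding")));
        return result;
    }

    // Encodes text with writerCharset and decodes the bytes with consoleCharset,
    // simulating a writer and a console that disagree about the encoding.
    public static String misdecode(String text, String writerCharset, String consoleCharset) {
        byte[] bytes = text.getBytes(Charset.forName(writerCharset));
        return new String(bytes, Charset.forName(consoleCharset));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : probe().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
        // "\u00e5" is å; written as Windows-1252 but interpreted as CP850 (IBM850),
        // it no longer comes back as å.
        System.out.println(misdecode("\u00e5", "windows-1252", "IBM850").equals("\u00e5")); // false
    }
}
```

Running it under each JDK from the same cmd.exe window, next to the output of `chcp`, gives a side-by-side view of the values discussed in this comment.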

          afloom Anders Hammar added a comment -

          It doesn't work with Jansi 1.15 either. Tested with IBM JDK 1.7 and IBM JDK 1.8.

          hboutemy Hervé Boutemy added a comment -

          uh
          I really don't understand how each JVM vendor chooses the System.out encoding on Windows, given the mess Windows makes between the GUI encoding (for example Windows-1252 on a French system) and the CLI encoding (for example CP850, see the chcp command).
          Perhaps the solution will be to change the whole JAnsi logic in the AnsiConsole class: avoid creating a PrintWriter that does the encoding conversion itself (see https://github.com/fusesource/jansi/blob/master/jansi/src/main/java/org/fusesource/jansi/AnsiConsole.java#L45 ), because it then has to reproduce the same voodoo as System.out, and instead delegate the conversion to System.out: whatever the voodoo is, it will be used.
          I'll try this.
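The delegation idea can be sketched as follows. This is an illustration under stated assumptions, not JAnsi's actual code: `DelegatingAnsiFilter` and its methods are hypothetical names, the regular expression covers only simple CSI sequences, and escape sequences are simply stripped here instead of being translated into Windows console calls. The point is that the filter works on characters and hands the resulting `String` to the target `PrintStream`, so the only charset conversion is the one that stream already performs.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.util.regex.Pattern;

class DelegatingAnsiFilter {

    // Matches simple CSI sequences such as ESC[1m or ESC[0;31m (a simplification).
    private static final Pattern ANSI_CSI = Pattern.compile("\\x1B\\[[0-9;]*[A-Za-z]");

    private final PrintStream target;

    DelegatingAnsiFilter(PrintStream target) {
        this.target = target;
    }

    // Removes escape sequences at the character level; no bytes or charsets involved.
    public static String stripAnsi(String text) {
        return ANSI_CSI.matcher(text).replaceAll("");
    }

    // Delegates to the target stream, which encodes with whatever charset it was created with.
    public void print(String text) {
        target.print(stripAnsi(text));
    }

    // Demo helper: prints through a PrintStream with a known charset and returns the raw bytes.
    public static byte[] demo(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream stream = new PrintStream(sink, true, charsetName);
            new DelegatingAnsiFilter(stream).print(text);
            stream.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // In real use the target would be System.out, whose encoding the JVM has already chosen.
        new DelegatingAnsiFilter(System.out).print("\u001B[1mBl\u00e5b\u00e4r\u001B[0m\n");
    }
}
```

With this shape the filter never needs to guess an encoding, which is the property the comment above is after; whether it resolves the IBM JDK case is not established in this ticket.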

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build maven-3.x #1648 (See https://builds.apache.org/job/maven-3.x/1648/)
          MNG-6205 upgraded JAnsi to 1.16 for console encoding fix (hboutemy: http://git-wip-us.apache.org/repos/asf/?p=maven.git&a=commit&h=2a79d1e71edc0ddd0c0ba1612ce520f43961eef2)

          • (edit) pom.xml
          hboutemy Hervé Boutemy added a comment -

          JAnsi upgraded to 1.16, which does a better job at detecting cmd.exe encoding

          afloom Anders Hammar added a comment -

          Hervé Boutemy: I've tested jansi 1.16 and it still doesn't handle non-ASCII chars correctly with IBM JDK. Tested by patching Maven 3.5.0 and with IBM JDK 1.7.
          So, still a regression for us using IBM JDK.

          hboutemy Hervé Boutemy added a comment (edited) -

          Yes, I didn't have time to work on the big change to Jansi that would avoid having to guess which encoding to use.
          Honestly, the issue has to be tested and tracked at the Jansi level, not in Maven: I just proposed a new PR https://github.com/fusesource/jansi/pull/88 in Jansi to add a basic test inside Jansi that checks how it behaves in any condition.
          From my first tests, Windows cmd.exe encoding is really strange, even without Jansi...
          With that Jansi-only test, everybody should be able to test Jansi and propose better algorithms: for the moment, Jansi 1.16 seems to be better than 1.13 when used on Oracle JDK, which is the most common case. With IBM JDK, people will need to test and propose improvements to Jansi.

          afloom Anders Hammar added a comment -

          I've filed a ticket on the jansi project to track the IBM JDK issue: https://github.com/fusesource/jansi/issues/93


            People

            • Assignee:
              hboutemy Hervé Boutemy
              Reporter:
              afloom Anders Hammar
            • Votes:
              1
              Watchers:
              4
