[NETBEANS-6233] Netbeans console input has unknown encoding - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 11.0, 12.5
Fix Version/s: 14
Component/s: None
Labels: None
Environment:

I'm using Netbeans 12.5 on Linux (Ubuntu 20.04).

Maven is 3.6.3 (bundled); I also tried Maven 3.8.4.

Java is OpenJDK 11; I also tried OpenJDK 15.0.2 (added as a Java platform from within Netbeans).


Language:
- java

Description

I'm trying to understand which charset is used when typing into the Netbeans console. I expected it to obey my current locale (which is UTF8 on my system), but it obviously doesn't. I then set out to determine which charset it actually uses, and the answer seems to be: none. Here are my findings:

Using Netbeans 12.5 + Maven and the following code:

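        // Needs java.io.InputStream; the enclosing method must declare IOException.
        final InputStream IN = System.in;
        do {
            // read() returns the next byte as 0..255, or -1 at end of stream
            System.out.println("Byte: " + IN.read());
        } while (IN.available() > 0);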

When I enter "€" in the Netbeans console, I get the following unexpected output ("10" is just the newline char):

Byte: 172
Byte: 10

It is unexpected because € (U+20AC) is never encoded as 172 (0xac) alone. In UTF8 it is three bytes (0xe2 0x82 0xac), and in UTF16 it is two (0x20 0xac, in big-endian order).
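The expected byte values quoted above can be checked with a few lines of Java. This is only a sketch: the class name `ExpectedBytes` and the helper `unsignedBytes` are made up for illustration and are not part of the original report. Bytes are printed as unsigned decimals, the same form in which `InputStream.read()` reports them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class ExpectedBytes {
    // Encodes s with cs and returns each byte as an unsigned value (0..255),
    // the same form in which InputStream.read() reports bytes.
    static int[] unsignedBytes(String s, Charset cs) {
        byte[] raw = s.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // the euro sign
        // prints "UTF-8:    [226, 130, 172]"  (0xe2 0x82 0xac)
        System.out.println("UTF-8:    "
                + Arrays.toString(unsignedBytes(euro, StandardCharsets.UTF_8)));
        // prints "UTF-16BE: [32, 172]"  (0x20 0xac)
        System.out.println("UTF-16BE: "
                + Arrays.toString(unsignedBytes(euro, StandardCharsets.UTF_16BE)));
    }
}
```

Neither encoding yields the single byte 172 for €, which is what makes the console output surprising.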

Similarly, entering 𐐷 (DESERET SMALL LETTER YEE), I get something unexpected:

Byte: 1
Byte: 55
Byte: 10

In other words, these are 0x01 0x37. In UTF8 it should be 0xf0 0x90 0x90 0xb7; in UTF16 it should be 0xd8 0x01 0xdc 0x37.

I'm on Linux and my locale (as reported by the command locale) is UTF8, but these results look like the encoding is "half UTF16": it is like UTF16, but only the low byte of each 16-bit code unit comes through and the high byte is missing.
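The "half UTF16" reading can be checked arithmetically: keep only the low 8 bits of each UTF-16 code unit of the typed character and compare with the bytes observed in the console. The sketch below does just that. The class name `LowByteCheck` and the method `lowBytes` are hypothetical, and the code only shows that the numbers are consistent with this hypothesis; it says nothing about what Netbeans actually does internally.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class LowByteCheck {
    // Keeps only the low 8 bits of each UTF-16 code unit (Java char) in s.
    static int[] lowBytes(String s) {
        int[] out = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = s.charAt(i) & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        String euro = "\u20AC";      // € is the single code unit 0x20AC
        String yee = "\uD801\uDC37"; // U+10437 is the surrogate pair 0xD801 0xDC37
        System.out.println(Arrays.toString(lowBytes(euro))); // prints [172]
        System.out.println(Arrays.toString(lowBytes(yee)));  // prints [1, 55]
    }
}
```

Both results match the bytes reported above (172 for €, then 1 and 55 for 𐐷), not counting the trailing newline byte 10.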

If I run the same code within an Ant project or a Gradle project, it works fine for the symbol €, and the bytes reported are consistent with my UTF8 locale. But if I enter 𐐷 (DESERET SMALL LETTER YEE), then:

- with the Gradle project, it outputs "Byte: -1" (no other bytes reported);
- with the Ant project, it outputs nothing and the program does not seem to stop; I have to abort the Run manually.

People

Assignee: Unassigned

Reporter: Nicolas Richard

Votes: 0

Watchers: 1

Dates

Created: 25/Nov/21 17:08

Updated: 16/Jun/22 15:51

Resolved: 16/Jun/22 15:51