Issue Details

Key: IBATIS-349
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Unassigned
Reporter: Daigo Kobayashi
Votes: 0
Watchers: 0

iBatis for Java

NodeletParser ignores XML encoding.

Created: 26/Sep/06 01:06 AM   Updated: 30/Nov/06 01:28 AM
Component/s: Core
Affects Version/s: 2.2.0
Fix Version/s: 2.3.0

Time Tracking:
Not Specified

File Attachments (all licensed for inclusion in ASF works):
Resources.java (Java source file, 11 kB), 2006-11-06 11:46 AM, Daigo Kobayashi
Resources.patch (text file, 3 kB), 2006-11-06 12:36 AM, Daigo Kobayashi
Resources.patch (text file, 3 kB), 2006-09-26 01:08 AM, Daigo Kobayashi
Environment:
Windows XP(Japanese)
jdk 1.5.0_06

Resolution Date: 29/Nov/06 06:41 PM


Description
NodeletParser ignores the XML encoding. NodeletParser uses Resources, and in some cases Resources reads the stream returned by ClassLoader#getResourceAsStream using the OS default encoding rather than the encoding declared in the XML.

In some environments, for example when the OS uses Shift_JIS and the XML uses UTF-8, iBatis doesn't work correctly because characters are corrupted.
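The corruption described above can be reproduced without iBATIS. The following is a minimal sketch, not code from the attached patch: the class and method names are hypothetical, and ISO-8859-1 stands in for the platform default encoding because it is guaranteed to be available on every JVM (the report itself involves Shift_JIS).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when UTF-8 bytes are decoded
// with a charset other than the one the file was written in.
final class EncodingMismatchDemo {

    // Encodes the text as UTF-8 (the file's real encoding), then decodes the
    // bytes with the given charset, as a Reader opened with the wrong
    // encoding would.
    static String decodeWith(String text, Charset decodingCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodingCharset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // contains one non-ASCII character

        // ISO-8859-1 is an assumption standing in for the OS default encoding.
        String garbled = decodeWith(original, StandardCharsets.ISO_8859_1);
        String intact = decodeWith(original, StandardCharsets.UTF_8);

        System.out.println("wrong charset preserves text:   " + original.equals(garbled));
        System.out.println("correct charset preserves text: " + original.equals(intact));
    }
}
```

The non-ASCII character occupies two bytes in UTF-8, so decoding with a single-byte charset turns it into two unrelated characters, while the ASCII characters survive unchanged.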

Comments
Daigo Kobayashi added a comment - 26/Sep/06 01:08 AM
I created a patch for this encoding problem.

Daigo Kobayashi added a comment - 06/Nov/06 12:36 AM
I found a problem with the previous patch: if the InputStream is null, it causes a NullPointerException. So I added a null check.

Daigo Kobayashi added a comment - 06/Nov/06 12:38 AM
Is it possible to merge the attached code before the next release, 2.2.1 (or 2.3)?

Clinton Begin added a comment - 06/Nov/06 03:06 AM
Can you upload the whole file? Not just the patch. Believe it or not, it's much easier to deal with.

Daigo Kobayashi added a comment - 06/Nov/06 11:46 AM
Attached the whole Resources.java file.

Daigo Kobayashi added a comment - 08/Nov/06 03:15 PM
Is there any progress? Is the attachment not enough?

Jeff Butler added a comment - 27/Nov/06 06:14 PM
Would it work for you if we added methods to Resources like this:

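// Proposed additions to Resources (needs java.io.* and java.nio.charset.Charset)
public static Reader getResourceAsReader(String resource, String encoding) throws IOException {
    // Decode with the caller-supplied encoding instead of the platform default
    return new InputStreamReader(getResourceAsStream(resource), Charset.forName(encoding));
}


public static Reader getResourceAsReader(ClassLoader loader, String resource, String encoding) throws IOException {
    return new InputStreamReader(getResourceAsStream(loader, resource), Charset.forName(encoding));
}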


This would be a lot simpler than opening the XML file twice - the first time just to get the encoding.


Jeff Butler added a comment - 28/Nov/06 01:34 AM
Well, my first comment works for the SqlMapConfig file, but doesn't provide any help for the SqlMap files.

Take a look at IBATIS-340 - a similar problem. Of the two solutions, I like the solution proposed in IBATIS-340 better. That's what I'd like to do unless you have some compelling reason that it won't work.

Daigo Kobayashi added a comment - 28/Nov/06 03:44 PM
An XML file has its own encoding, so we should not ignore it. If we follow this solution, we have to set the encoding ourselves and ignore the encoding declared in the XML.

Also, the XML specification says that if no encoding is specified the parser must use UTF-8 or UTF-16, and if one is specified the parser must follow it. So this solution violates the XML specification's encoding rules.

Jeff Butler added a comment - 28/Nov/06 06:48 PM
This is a complex issue. The real problem is that iBATIS ignores the encoding issue altogether in parsing - because we only accept a character reader.

I had a quick look at Xerces source to see how they autodetect the encoding and their method is quite different than what you propose. Your method still uses the system default encoding to open the file, a potential source of errors. And it ignores the fact that the real encoding of the file may be different than what's declared in the processing instruction. Xerces figures out the encoding by looking at the first 4 bytes of the file. This is a complex algorithm that we don't want to replicate in iBATIS.
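To give a rough idea of what detection from the first bytes involves, here is a heavily simplified sketch in the spirit of the autodetection appendix of the XML specification. It is not the Xerces implementation: the class and method names are hypothetical, and it deliberately omits UTF-32, EBCDIC, and the follow-up step of reading the encoding declaration, which is where most of the real complexity lies.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified sniffer: guesses an encoding family from the
// first bytes of an XML document. Real parsers handle many more cases.
final class XmlEncodingSniffer {

    static String sniff(byte[] b) {
        // Byte order marks
        if (b.length >= 3 && u(b[0]) == 0xEF && u(b[1]) == 0xBB && u(b[2]) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && u(b[0]) == 0xFE && u(b[1]) == 0xFF) {
            return "UTF-16BE";
        }
        if (b.length >= 2 && u(b[0]) == 0xFF && u(b[1]) == 0xFE) {
            return "UTF-16LE"; // simplification: FF FE 00 00 would be UTF-32LE
        }
        // No BOM: look for "<?" encoded as two-byte code units
        if (b.length >= 4 && b[0] == 0x00 && b[1] == 0x3C && b[2] == 0x00 && b[3] == 0x3F) {
            return "UTF-16BE";
        }
        if (b.length >= 4 && b[0] == 0x3C && b[1] == 0x00 && b[2] == 0x3F && b[3] == 0x00) {
            return "UTF-16LE";
        }
        // Otherwise assume an ASCII-compatible encoding. This is only a
        // provisional guess: a real parser now reads the XML declaration in
        // ASCII and switches to whatever encoding it names.
        return "UTF-8";
    }

    // Converts a signed byte to its unsigned value (0..255).
    private static int u(byte value) {
        return value & 0xFF;
    }

    public static void main(String[] args) {
        byte[] plain = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.US_ASCII);
        byte[] utf16 = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(sniff(plain));
        System.out.println(sniff(utf16));
    }
}
```

Even this reduced version shows why the detection has to work on raw bytes: once the stream has been wrapped in a Reader, the information needed to make the decision is already gone.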

I think the best way to resolve this is to allow iBATIS to accept a byte stream for parsing - then we could leverage the parser's built in support for dealing with different encodings. I'll have a look at doing that.
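The difference between handing the parser bytes and handing it characters can be sketched with the JDK's own JAXP parser. This is an illustration only, not iBATIS code: the class and method names are hypothetical, and the sample document and the deliberately wrong UTF-8 Reader are assumptions chosen to make the effect visible.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class comparing byte-stream and character-stream parsing.
final class ByteStreamParsingDemo {

    // A small document that declares ISO-8859-1 and really is encoded that way.
    static byte[] sampleXml() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Byte stream: the parser reads the declaration and decodes accordingly.
    static String parseFromStream(byte[] bytes) {
        return rootText(new InputSource(new ByteArrayInputStream(bytes)));
    }

    // Character stream: the bytes are decoded before the parser sees them,
    // here with a charset (UTF-8) that does not match the file.
    static String parseFromReader(byte[] bytes) {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        return rootText(new InputSource(reader));
    }

    private static String rootText(InputSource source) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrapped so the sketch has no checked exceptions in its signatures.
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml();
        System.out.println("stream intact: " + "caf\u00e9".equals(parseFromStream(xml)));
        System.out.println("reader intact: " + "caf\u00e9".equals(parseFromReader(xml)));
    }
}
```

With a byte stream the declared encoding is honored; with a Reader the decoding has already happened, so the parser cannot correct a wrong choice made upstream.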

In the meantime, please try manually setting the encoding to UTF-8 using the fix for IBATIS-340. I'd like to know if this addresses your immediate issue.

Jeff Butler added a comment - 28/Nov/06 11:02 PM
I've committed some changes for IBATIS-373 that allow iBATIS to build the SqlMapClient from an InputStream rather than a Reader. This delegates the encoding issue to the parser - where it should be IMHO. You can use it like this:

String resource = "myconfig/SqlMapConfig.xml";
InputStream inputStream = Resources.getResourceAsStream(resource);
SqlMapClient client = SqlMapClientBuilder.buildSqlMapClient(inputStream);

Please give this a try (or you can wait for 2.3.0 later this week). If one of these alternatives resolves the issue, then I'd like to close this ticket.

Thanks for all the feedback!

Daigo Kobayashi added a comment - 29/Nov/06 07:50 AM
This works fine, and it is the ideal solution.
I really appreciate your great work.

BTW, I'm using iBatis with the Spring Framework, so I had to modify Spring's SqlMapClientFactoryBean.java (a trivial change).
So I think some action from the Spring team is necessary.

Could you report this problem to the Spring team, or should I report it?

Jeff Butler added a comment - 29/Nov/06 06:39 PM
Great! I'm glad this fix worked.

I think you should open the enhancement request with the Spring team. You could probably provide your modification directly to them, and you're in a better position to test than I am.

Jeff Butler added a comment - 29/Nov/06 06:41 PM
This issue is resolved with the changes for IBATIS-373.

See comments for more details.

Daigo Kobayashi added a comment - 30/Nov/06 01:28 AM