34874 – XMLLayout and HTMLLayout do not detect use of incompatible encoding

Summary: XMLLayout and HTMLLayout do not detect use of incompatible encoding

Status:	RESOLVED WONTFIX

Alias:	None

Product:	Log4j - Now in Jira
Classification:	Unclassified
Component:	Layout
Version:	1.3alpha
Hardware:	Other other

Importance:	P2 normal
Target Milestone:	---
Assignee:	log4j-dev

URL:
Keywords:

Depends on:
Blocks:

Reported:	2005-05-11 22:30 UTC by Curt Arnold
Modified:	2007-08-10 15:19 UTC
CC List:	0 users

Description Curt Arnold 2005-05-11 22:30:51 UTC

XMLLayout and HTMLLayout assume that the encoding of any associated writer is either UTF-8 or 
UTF-16.  If no encoding is explicitly specified in the appender, the platform default encoding is used, 
which is highly unlikely to be UTF-8 or UTF-16 on Windows.  An encoding mismatch results in XML 
documents that are not well-formed whenever a non-US-ASCII character is emitted in the log.
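
To illustrate the failure mode, here is a minimal sketch that does not use log4j at all. It assumes the appender's writer ends up using ISO-8859-1 (standing in for a typical non-UTF platform default) while the consumer reads the output as UTF-8. The class name EncodingMismatchDemo and its helper methods are hypothetical and exist only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of log4j.
class EncodingMismatchDemo {

    /** Encodes text the way a writer configured with the given charset would. */
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    /** Returns true if the bytes form a valid UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A log message containing one non-US-ASCII character (e with acute accent).
        String event = "<message>caf\u00e9</message>";

        byte[] asUtf8 = encode(event, StandardCharsets.UTF_8);
        byte[] asLatin1 = encode(event, StandardCharsets.ISO_8859_1);

        // Written as UTF-8, the bytes decode cleanly.
        System.out.println("UTF-8 bytes valid as UTF-8:      " + isValidUtf8(asUtf8));
        // Written as ISO-8859-1, the single byte 0xE9 is not a complete
        // UTF-8 sequence, so a UTF-8 reader rejects the document.
        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(asLatin1));
    }
}
```

A pure US-ASCII message encodes to the same bytes under both charsets, which is why the problem only surfaces once a non-US-ASCII character is logged.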

The proposed resolution is to add a new interface

interface EncodingSensitiveLayout {
    /**
     * @param proposedEncoding encoding the appender intends to use
     * @return encoding selected by layout
     */
    String setEncoding(final String proposedEncoding);
}

to be implemented by XMLLayout and HTMLLayout.  In WriterAppender.activateOptions, if the 
layout implements EncodingSensitiveLayout, it would be passed the proposed encoding and would have a 
chance either to modify its behavior to be consistent with that encoding or to override the choice of 
encoding.
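
The negotiation described above could look roughly like the following sketch. Only the EncodingSensitiveLayout interface comes from this report; SketchXmlLayout and SketchWriterAppender are hypothetical stand-ins for XMLLayout and WriterAppender, and the policy of falling back to UTF-8 is an assumption chosen for illustration, not the behavior of any log4j release.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

/** Interface as proposed in this report. */
interface EncodingSensitiveLayout {
    /** @return encoding selected by layout */
    String setEncoding(final String proposedEncoding);
}

/** Hypothetical stand-in for XMLLayout: overrides anything but UTF-8 or UTF-16. */
class SketchXmlLayout implements EncodingSensitiveLayout {
    private String encoding = "UTF-8";

    @Override
    public String setEncoding(final String proposedEncoding) {
        String selected = "UTF-8"; // assumed fallback policy
        if (proposedEncoding != null) {
            // Normalize aliases such as "utf8" to the canonical charset name.
            String name = Charset.forName(proposedEncoding).name();
            if (name.equals("UTF-8") || name.equals("UTF-16")) {
                selected = name;
            }
        }
        encoding = selected;
        return encoding;
    }
}

/** Hypothetical stand-in for the relevant part of WriterAppender. */
class SketchWriterAppender {

    /** Asks the layout, if it is encoding sensitive, to confirm or override the encoding. */
    static String negotiate(Object layout, String configuredEncoding) {
        String proposed = (configuredEncoding != null)
                ? configuredEncoding
                : Charset.defaultCharset().name(); // platform default
        if (layout instanceof EncodingSensitiveLayout) {
            return ((EncodingSensitiveLayout) layout).setEncoding(proposed);
        }
        return proposed;
    }

    /** Mirrors the step proposed for activateOptions: build the writer after negotiating. */
    static Writer createWriter(Object layout, String configuredEncoding, OutputStream target) {
        String selected = negotiate(layout, configuredEncoding);
        return new OutputStreamWriter(target, Charset.forName(selected));
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The appender proposes ISO-8859-1; the layout overrides the choice.
        Writer writer = createWriter(new SketchXmlLayout(), "ISO-8859-1", out);
        writer.write("caf\u00e9");
        writer.flush();
        System.out.println("selected: " + negotiate(new SketchXmlLayout(), "ISO-8859-1"));
        System.out.println("bytes written: " + out.size());
    }
}
```

In this sketch the layout takes the second option from the description and overrides the encoding. A layout taking the first option would instead keep the proposed encoding and adapt its output, for example by escaping characters the encoding cannot represent.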

Comment 1 Curt Arnold 2007-08-10 15:19:50 UTC

Added a notice to the javadoc for XMLLayout and HTMLLayout (for log4j 1.2) to use UTF-8 or UTF-16 
encoding or risk corrupted documents.  Will not fix in log4j 1.3.  log4j 2.0 will have distinct support for 
byte (as opposed to character) layouts, so this should not be a problem there.  The proposed solution was an 
attempt to work around the lack of a direct byte layout mechanism.