Bug 22894 - Single backslash not accepted in File param value
Summary: Single backslash not accepted in File param value
Status: RESOLVED WONTFIX
Alias: None
Product: Log4j - Now in Jira
Classification: Unclassified
Component: Configurator (show other bugs)
Version: 1.2
Hardware: PC All
: P3 normal
Target Milestone: ---
Assignee: log4j-dev
URL:
Keywords:
: 30838 (view as bug list)
Depends on:
Blocks:
 
Reported: 2003-09-02 17:43 UTC by Tom Caulkins
Modified: 2008-08-05 16:43 UTC (History)
2 users (show)



Attachments
Patch and supporting test files (2.55 KB, application/octet-stream)
2005-02-11 03:55 UTC, Curt Arnold
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tom Caulkins 2003-09-02 17:43:31 UTC
If you enter a filepath with single backslashes, the backslashes will be 
ignored ( or converted to special characters such as newline or tab, depending 
on the character after the backslash ).  So, for example, a filepath of:

C:\MyDirectory\MyLog.txt

will end up as the file:

MyDirectoryMyLog.txt

in the current folder.  I looked at the log4j code and discovered that *all* 
parameter values are processed via the method 
org.apache.log4j.helpers.OptionConverter.convertSpecialChars().  This method 
treats backslashes as an escape character, to support such things as \t for tab 
and \n for newline.  A single backslash without an appropriate following 
character is quietly consumed.

Since tabs, newlines, and other special characters are not valid in filepaths, 
it would be nice if parameter values of type "File" were not processed via the 
convertSpecialChars() method.  Since double backslashes should be supported, 
too, for backward compatibility, I would suggest replacing the invocation of 
convertSpecialChars() with an invocation of a new method that simply replaces 
double backslashes with single backslashes, in the case where the param is of 
type "File".  In this way, both double backslashes and single backslashes would 
be supported in filenames.

Thanks,
Tom Caulkins
Sr. Systems Developer
SAS Institute Inc.
Cary, NC 27513
Comment 1 Ceki Gulcu 2003-09-02 19:18:10 UTC
Tom,

Many thanks for this precise report. Interestingly enough, the problem you 
mention is fairly well known. See for example http://marc.theaimsgroup.com/?
t=103228919500003&r=1&w=2

While what you suggest makes some sense, another solution is to use forward 
slashes instead of back slash or two backslahes instead of a single one when 
specifying the file name.

Please do not hesitate to reopen this report if the above suggestion does not 
work for you.

I hope this helps,
Comment 2 Tom Caulkins 2003-09-02 19:52:41 UTC
I am disappointed that you do not feel fixing this bug is worthwhile.

I will not re-open this bug, but I'll add these additional comments:

Yes, I was aware of the ability to use forward slashes or doubling up the 
backslashes to make this work.  In my case, I am supporting a GUI which accepts 
user input for a file path which eventually becomes a log4j log file.  So the 
user of my application becomes confused when they enter a valid Windows 
filepath and then do not see the results they expected.  In order to work 
around this deficiency in log4j, I now have to search file path strings, 
looking for single backslashes, and then convert them to double 
backslashes...so that log4j can convert them back into single backslashes!  All 
this, when it's not really necessary or useful to be utilizing the 
convertSpecialChars() method on file path strings.

Sincerely,
Tom Caulkins
Sr. Systems Developer
SAS Institute Inc.
Cary, NC 27513
Comment 3 Curt Arnold 2005-02-11 00:00:54 UTC
This has also been reported in log4cxx as
http://issues.apache.org/jira/browse/LOGCXX-22.  In properties files, a
backslash is a special character and using double backslashes to represent a
literal backslash is appropriate and expected.  However, backslashes have no
special significane in XML files where you would use an ampersand to create an
"escape" sequence.  It should also not affect setFile() where you should be not
be required to escape a valid filename.  
Comment 4 Curt Arnold 2005-02-11 03:55:31 UTC
Created attachment 14248 [details]
Patch and supporting test files

I've committed the equivalent changes to log4cxx, but we have a much less
susceptibility to compatability issues.

With the patch, JoranDocument no longer calls
OptionConverter.convertSpecialChars.  FileAppender.setFile will replace double
backslashes with single and PatternLayout will process the pattern through
OptionConverter.convertSpecialChars.

I think this is a very desirable change and the incremental compatibility risk
is slight.  I can commit the changes if so desired.
Comment 5 Ceki Gulcu 2005-02-11 11:48:54 UTC
  
Please do, by all means! 
Comment 6 Curt Arnold 2005-02-11 19:19:18 UTC
Changes committed 11 Feb 2005
Comment 7 Tom Caulkins 2005-02-11 20:21:09 UTC
Curt,

Thanks for fixing this bug in log4j.

I followed the link to the log4cxx bug you also fixed, and noted that you 
report "to preserve compatibility with existing configurations, 
FileAppender::setOption will eliminate double backslashes from any specified 
file name".

This seems reasonable.  I see that you applied this compensation to log4j, too.

One special case will be impacted negatively by this: when a UNC-type file name 
is used, i.e. \\mymachine.apache.org\mydir\myfile.log.  My work-around for the 
exiting log4j bug includes replacing 2 leading backslashes with four ( so log4j 
can turn them back into two ).

Perhaps you could consider handling this special case, too, by not turning 2 
leading backslashes into one.  Note that a file name that begins with a single 
backslash is not a valid file name, so backward compatibility would not be 
compromised by adding this additional fix ( i.e. no one was using double 
backslashes in that position as a workaround to the bug ).  Without the 
additional fix, the above valid UNC file name will be rendered invalid.

I am marking this bug re-opened while you consider my request.

Thanks,
Tom

Comment 8 Tom Caulkins 2005-02-11 20:37:22 UTC
I should add:

If you do decide to incorporate my suggestion, for compatibility you would need 
to handle the ( special-special? ) case of four leading backslashes, too, since 
they would be present when someone or some code ( like mine! ) was employing a 
workaround in trying to construct a UNC file name that log4j wouldn't corrupt.

Thanks again,
Tom
Comment 9 Curt Arnold 2005-02-11 22:27:16 UTC
Good catch, I hadn't thought about UNC's.

I've refined the implementation of OptionConverter.stripDuplicateBackslashes so that it should properly 
distinguish handle quad-backslashes and distinguish between double backslashes intended to be 
doubles and those intended to be single.  If the path contains any single slashes (actually any odd 
number of slashes), it is returned unchanged, otherwise all double slashes are reduced to single slashes 
(quads are reduced to doubles).  Reopen it if I still missed something.  Check the 
OptionSubstitutionTest.testStripDuplicateBackslashes for the scenarios I tried to cover.
Comment 10 Curt Arnold 2005-07-21 05:09:30 UTC
*** Bug 30838 has been marked as a duplicate of this bug. ***
Comment 11 Curt Arnold 2005-07-21 05:10:14 UTC
Candidate for porting back to log4j 1.2 branch.
Comment 12 Curt Arnold 2005-07-21 22:37:48 UTC
After reviewing the 1.3 fix, I've concluded that I don't particularly like the way that I approached it and 
will likely redo it and any fix is not going to be perfect enough for the 1.2 branch.  I expect that the 
ultimate solution will be to preserve the existing backslash behavior for XML files with a "http://
jakarta.apache.org/log4j/" namespace and perform no backslash expansion when using an "http://
logging.apache.org/" namespace.

The problem is that the undesired backslash expansion occurs in the generic XML processing.  By the 
time the value gets to the FileAppender, it has lost too much information to be able to recreate the 
expected file name.  However, if the backslash expansion is removed from the generic processing, then 
any configuration parameter that expected or compensated for the backslash expansion would have a 
different value.  This most likely parameter that would be tripped up is the PatternLayout.pattern where 
the tests (and possibly other external users) used patterns like value="%m\n".  However, since the 
processing affects all parameters, including those in custom appenders and layouts, there is no way of 
always doing the "right" thing.
Comment 13 Thorbjørn Ravn Andersen 2008-08-02 09:38:55 UTC
If this behaviour will not be backported to log4j 1.2, would it be reasonable to close this issue?