This problem can manifest itself in a number of ways, but the root issue is
that httpd runs with a locale of C, the default POSIX locale, which uses a
7-bit ASCII character set. In various places inside Subversion we attempt to
convert our internal UTF-8 strings into natively encoded strings. Under DAV
that conversion fails spectacularly when the UTF-8 string includes multibyte
characters, because those characters cannot be expressed in 7-bit ASCII.
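
As a minimal sketch of the failure described above (written in Python rather
than Subversion's own C code, with made-up paths and a hypothetical helper
named to_native), encoding a path that contains a multibyte character as 7-bit
ASCII fails, while a plain ASCII path converts without trouble:

```python
def to_native(path, native_encoding="ascii"):
    """Convert an internal (Unicode) path to the given native encoding.

    Returns the encoded bytes, or None if the path cannot be represented.
    "ascii" stands in for the encoding implied by the C locale.
    """
    try:
        return path.encode(native_encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical paths, for illustration only.
ascii_path = "/repos/trunk/readme.txt"
multibyte_path = "/repos/trunk/r\u00e9sum\u00e9.txt"  # e-acute is two bytes in UTF-8

print(to_native(ascii_path))               # converts fine
print(to_native(multibyte_path))           # None: not expressible in ASCII
print(to_native(multibyte_path, "utf-8"))  # fine when the native encoding is UTF-8
```

The same path that cannot be converted under an ASCII "native" encoding
converts cleanly when the native encoding is UTF-8, which is why the locale
httpd runs in matters so much here.
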
I brought this problem up on the httpd dev list. The consensus is that httpd
runs in the C locale because using the system locale (i.e. respecting the LANG
or LC_ALL environment variables via a call to setlocale(LC_ALL, ""), as svn
does for all its command line programs) results in unpredictable behavior for
various functions that depend on the locale. Apparently some modules take
matters into their own hands and call setlocale to set the locale manually, but
this is very much not a recommended practice: the locale is a process-wide
setting, so the change affects every other module, and the results are not
easily predicted.
What is the end result of this?
Well, you can't use 'svn lock' on a file that has multibyte characters in its
path, because when mod_dav_svn tries to call the pre-lock hook script it needs
to pass the filename, which it first tries to translate into native encoding,
and that translation fails. This case is even weirder, since the hook script
actually runs in an empty environment, so it has no way to know what its locale
is: it can't access the relevant environment variables from the parent httpd
process.
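
The empty-environment effect can be reproduced outside httpd. The sketch below
is an illustration only, not how mod_dav_svn actually launches hooks, and it
assumes a POSIX system where the Python interpreter can start without any
environment variables. It starts a child process with an empty environment and
has the child report the locale variables it can see. The helper name
child_locale_vars is made up for this example.

```python
import json
import subprocess
import sys

# Code run by the child: report the locale variables it can see.
CHILD_CODE = (
    "import json, os; "
    "print(json.dumps({k: os.environ.get(k) for k in ('LANG', 'LC_ALL')}))"
)


def child_locale_vars(env):
    """Run a child Python process with exactly `env` and return what it sees."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(out.stdout)


# Even if the parent has LANG set, a child started with an empty
# environment sees neither variable.
print(child_locale_vars({}))
# A child that is handed the variable explicitly does see it.
print(child_locale_vars({"LANG": "en_US.UTF-8"}))
```

Whatever LANG or LC_ALL the parent process has, the first child sees neither,
which is the position a hook script is in when it is handed a natively encoded
path.
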
Perhaps more disturbing is that if a repository has multibyte characters in its
path (i.e. it's just in a directory that's got multibyte characters in its
name) you can't do anything at all with it. Even browsing the repository
results in errors about not being able to open the repository. The underlying
errors that show up in error_log are, predictably, about translating from
utf8 -> native.
What's the fix? I have no idea. We can't just trust that converting to
"native" encoding is the correct thing to do in all cases, but unfortunately
we've got an awful lot of code in svn that assumes that's what it should be
doing when paths are passed from svn's internals into the outside world.
Additionally, in the case of hook scripts it's not clear that converting to
native is even desirable, since the script doesn't have any way to tell what
the native encoding actually is.