The bugs with default encodings and Zookeeper found by the forbidden-apis checker are very serious. If we respin 4.6, we should backport this patch!
Solr copies files from the filesystem to Zookeeper by reading them as a String with the default charset, then converting that String to UTF-8 and writing the resulting byte array. This corrupts the data. The copy should simply read bytes from disk and store the same bytes in ZK. Fortunately, it already does this in the other direction.
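To illustrate the problem, here is a minimal standalone sketch, not the actual Solr code. The class and method names (EncodingRoundTrip, viaString, viaBytes) are made up for this example, and ISO-8859-1 stands in for a platform default charset that is not UTF-8. It shows how decoding and re-encoding changes the bytes, while a plain byte copy leaves them intact.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingRoundTrip {

    // Problematic pattern: decode the file bytes with some charset (here a
    // stand-in for the platform default), then re-encode the String as UTF-8.
    static byte[] viaString(byte[] fileBytes, Charset assumedDefault) {
        String text = new String(fileBytes, assumedDefault);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Safe pattern: hand the bytes through unchanged, no decoding at all.
    static byte[] viaBytes(byte[] fileBytes) {
        return fileBytes.clone();
    }

    public static void main(String[] args) {
        // A UTF-8 encoded file containing U+00E9 (two bytes: 0xC3 0xA9).
        byte[] onDisk = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Assumption for the demo: the JVM default charset is ISO-8859-1.
        byte[] corrupted = viaString(onDisk, StandardCharsets.ISO_8859_1);
        byte[] intact = viaBytes(onDisk);

        System.out.println("on disk:    " + Arrays.toString(onDisk));
        System.out.println("via String: " + Arrays.toString(corrupted));
        System.out.println("via bytes:  " + Arrays.toString(intact));
    }
}
```

In this sketch the two bytes on disk become four bytes after the String round trip, because each original byte is decoded as its own character and then re-encoded separately. In real code, something like java.nio.file.Files.readAllBytes would supply the bytes to store.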