Description
As part of preprocessing, Joshua tokenizes input sentences, for example by splitting punctuation off from words. This is currently done with a set of Perl preprocessing scripts [1], but it would be nice to move this step into the decoder itself.
[1]: https://github.com/apache/incubator-joshua/tree/master/scripts/preparation
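
To make the idea concrete, here is a minimal sketch of what an in-decoder tokenization step might look like in Java. It is not a port of the Perl scripts and does not reflect any existing Joshua API: the class name `SimpleTokenizer`, the method `tokenize`, and the single rule it applies (every Unicode punctuation character becomes its own token) are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Joshua: a deliberately naive tokenizer.
final class SimpleTokenizer {
    // Assumed rule: each Unicode punctuation character (category P) is a separate token.
    private static final Pattern PUNCT = Pattern.compile("(\\p{P})");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Returns the sentence with punctuation split off and tokens separated by single spaces.
    static String tokenize(String sentence) {
        // Surround every punctuation character with spaces.
        String spaced = PUNCT.matcher(sentence).replaceAll(" $1 ");
        // Collapse the resulting runs of whitespace and trim the ends.
        return SPACES.matcher(spaced).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world!")); // prints: Hello , world !
    }
}
```

A rule this simple also splits contractions, decimal numbers, and URLs (for example, "don't" becomes "don ' t"), so a real replacement would likely need language-specific exceptions comparable to whatever the existing scripts handle.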