Description
In its current state, the Leipzig corpus reader can only read from a single language file. To create a model that can detect many languages, all the input files must first be converted and merged together.
It would be much easier to train a language identification model if the corpus reader could read multiple sentence files from a directory.
This issue will change the Leipzig reader to read from all sentence files in a specified directory. The language category should be extracted from the file name itself.
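
The proposed behavior could be sketched roughly as follows. This is only an illustration in Python, not the reader's actual implementation. It assumes that sentence files are named like `eng_news_2010_300K-sentences.txt` (language code before the first underscore) and that each line has the form `<id><TAB><sentence>`. The function names `language_from_filename` and `read_sentences` are hypothetical.

```python
from pathlib import Path
from typing import Iterator, Tuple


def language_from_filename(path: Path) -> str:
    """Return the language category encoded in a sentence file name.

    Assumption: the file name starts with the language code followed by
    an underscore, e.g. "eng_news_2010_300K-sentences.txt" -> "eng".
    """
    return path.name.split("_", 1)[0]


def read_sentences(directory: Path) -> Iterator[Tuple[str, str]]:
    """Yield (language, sentence) pairs from every sentence file in a directory.

    Assumption: sentence files end in "sentences.txt" and each line is
    "<id>\t<sentence>". Files are visited in sorted order so the output
    is deterministic.
    """
    for path in sorted(directory.glob("*sentences.txt")):
        language = language_from_filename(path)
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                # Drop the numeric id in front of the tab; keep the sentence.
                _, _, sentence = line.rstrip("\n").partition("\t")
                if sentence:  # skip blank or malformed lines
                    yield language, sentence
```

With this approach no conversion or merge step is needed: dropping an additional sentence file into the directory is enough to add another language to the training data.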