Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.2
-
None
-
None
Description
The Wikipedia XML splitter inner loop erronously detects end of the line-iterator which causes it to create chunks with just one line worth of page content instead of respecting the --chunkSize cli option.
Simple patch to fix this will follow.