Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
The class `CommonCrawlFormatJackson` keeps appending documents to its output when a single instance is used to format more than one document.
The class should be modified to manage its internal state so that the same instance can be reused, instead of creating a new instance for each document being dumped.
This was suggested in the previous fix for the format issue: https://github.com/apache/nutch/pull/103
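The requested behaviour can be illustrated with a minimal sketch. This is not the actual `CommonCrawlFormatJackson` code: the class `ReusableFormatterSketch`, its `reset()` and `format()` methods, and the example URLs are all hypothetical, and a plain `StringBuilder` stands in for whatever generator the real class holds. The point is only that state left over from the previous document is cleared before the next one is formatted.

```java
// Hypothetical stand-in for a stateful formatter; NOT the actual Nutch class.
class ReusableFormatterSketch {
    // State that survives between calls; this is what accumulates in the bug.
    private final StringBuilder buffer = new StringBuilder();

    // Assumed helper: discards anything left over from a previous document.
    private void reset() {
        buffer.setLength(0);
    }

    // Formats one document. Calling reset() first is what makes the
    // instance safe to reuse; without it, output would keep growing.
    // Note: no JSON escaping is done here, to keep the sketch short.
    public String format(String url, String content) {
        reset();
        buffer.append("{\"url\":\"").append(url).append("\",");
        buffer.append("\"content\":\"").append(content).append("\"}");
        return buffer.toString();
    }

    public static void main(String[] args) {
        ReusableFormatterSketch formatter = new ReusableFormatterSketch();
        // The same instance is used for both documents.
        System.out.println(formatter.format("http://a.example/", "first"));
        System.out.println(formatter.format("http://b.example/", "second"));
    }
}
```

With the `reset()` call in place, the second output contains only the second document. Removing that call reproduces the symptom described above, where each new document is appended to the earlier ones.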
Issue Links
- is duplicated by: NUTCH-2250 CommonCrawlDumper : Invalid format + skipped parts (Closed)