Nutch / NUTCH-1949 Dump out the Nutch data into the Common Crawl format / NUTCH-2251

Make CommonCrawlFormatJackson instance reusable by properly handling object state


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: commoncrawl
    • Labels:
      None

      Description

      The class `CommonCrawlFormatJackson` keeps appending documents when a single instance is used to format more than one document.
      The class should be modified to manage its state so that the same instance can be reused instead of creating a new one for each document being dumped.

      This was suggested in the earlier fix for the format issue: https://github.com/apache/nutch/pull/103
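
The reuse problem described above can be illustrated with a minimal sketch. It assumes the appending comes from an internal buffer that is never cleared between documents; that cause is an assumption, not something confirmed by the issue. `ReusableFormatter` and its `format` method are hypothetical stand-ins, not the real `CommonCrawlFormatJackson` API, and the JSON is built by hand without escaping purely for illustration.

```java
// Hypothetical stand-in for a formatter that holds per-document state.
// This is NOT the Nutch class; names and signatures are assumptions.
class ReusableFormatter {
    // Internal state shared across calls; the source of the appending problem
    // if it is never cleared.
    private final StringBuilder buffer = new StringBuilder();

    // Formats one document. Clearing the buffer first makes the instance
    // safe to reuse for the next document.
    String format(String url, String content) {
        buffer.setLength(0); // reset state left over from the previous document
        buffer.append("{\"url\":\"").append(url)
              .append("\",\"content\":\"").append(content)
              .append("\"}");
        return buffer.toString();
    }
}

class ReusableFormatterDemo {
    public static void main(String[] args) {
        ReusableFormatter formatter = new ReusableFormatter();
        // The same instance formats two documents in a row.
        System.out.println(formatter.format("http://a.example", "first"));
        System.out.println(formatter.format("http://b.example", "second"));
    }
}
```

Without the `setLength(0)` call, the second result would also contain the first document, which is the behaviour reported here. Resetting at the start of `format` rather than at the end keeps the instance consistent even if an earlier call failed partway through.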

              People

              • Assignee:
                Unassigned
              • Reporter:
                Thamme Gowda (thammegowda)
              • Votes:
                0
              • Watchers:
                2
