  1. Nutch
  2. NUTCH-1551

Improve WebTableReader field order and display batchId

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: crawldb
    • Labels: None
    • Patch Info: Patch Available

    Description

      I've made slight modifications to WebTableReader so that fields are written in a more appropriate order when dumping the webdb. The output structure now more closely reflects the field layout of the webpage.avsc file.
      Additionally, I've added the batchId. For backwards compatibility with existing webdbs, it is only appended to the string buffer if it is not null.
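
      The null-guarded append described above can be sketched roughly as follows. This is not the code from NUTCH-1551.patch: the class `WebPageDumpSketch`, the method `dump`, the output layout, and the sample field names and values are all hypothetical, chosen only to illustrate the idea that rows written before batchId existed dump exactly as before.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not the actual WebTableReader implementation.
public class WebPageDumpSketch {

    // Builds a text dump of one row. `fields` is assumed to already be in
    // the desired (schema-like) order; a LinkedHashMap preserves that order.
    static String dump(String key, Map<String, String> fields, String batchId) {
        StringBuilder sb = new StringBuilder();
        sb.append("key:\t").append(key).append("\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(e.getKey()).append(":\t").append(e.getValue()).append("\n");
        }
        // Only append batchId when present, so rows from older webdbs
        // (which have no batchId) produce unchanged output.
        if (batchId != null) {
            sb.append("batchId:\t").append(batchId).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("baseUrl", "http://example.org/"); // hypothetical sample values
        fields.put("status", "2");

        System.out.print(dump("org.example:http/", fields, null));
        System.out.print(dump("org.example:http/", fields, "batch-1"));
    }
}
```

      With a null batchId the output ends after the last regular field; with a non-null batchId one extra line is appended, so the second dump starts with exactly the text of the first.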

      Attachments

        1. NUTCH-1551.patch
          2 kB
          Lewis John McGibbney

          People

            Assignee: Lewis John McGibbney (lewismc)
            Reporter: Lewis John McGibbney (lewismc)
            Votes: 0
            Watchers: 2