  Nutch / NUTCH-1533

Implement getPrevModifiedTime(), setPrevModifiedTime(), getBatchId() and setBatchId() accessors in o.a.n.storage.WebPage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: storage
    • Labels: None

      Description

      NUTCH-1532 needs to obtain a batchId to add to the NutchDocument prior to indexing. This is currently not available because we do not store that information in the WebPage. Additionally, we do not store the other modified times, yet we set them incorrectly in o.a.n.crawl.FetchSchedule#setFetchSchedule.
      All of the accessors named in the title (getPrevModifiedTime(), setPrevModifiedTime(), getBatchId() and setBatchId()) should be implemented.
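
      As a rough illustration of the requested accessors, the sketch below uses a hypothetical, simplified stand-in class named WebPageSketch. It is not the actual o.a.n.storage.WebPage, and the attached patches may differ; the field types (a long timestamp and a String batch id) and the default values are assumptions made only for this example.

```java
// Hypothetical, simplified stand-in for o.a.n.storage.WebPage.
// Field types and defaults are assumptions for illustration only.
final class WebPageSketch {
    // Assumed: previous modification time in epoch milliseconds, 0 if unknown.
    private long prevModifiedTime;
    // Assumed: identifier of the batch that processed this page, null until set.
    private String batchId;

    public long getPrevModifiedTime() {
        return prevModifiedTime;
    }

    public void setPrevModifiedTime(long prevModifiedTime) {
        this.prevModifiedTime = prevModifiedTime;
    }

    public String getBatchId() {
        return batchId;
    }

    public void setBatchId(String batchId) {
        this.batchId = batchId;
    }

    public static void main(String[] args) {
        WebPageSketch page = new WebPageSketch();
        page.setPrevModifiedTime(1000L);
        page.setBatchId("batch-001");
        // An indexer could read the stored batch id back before indexing.
        System.out.println(page.getBatchId() + " " + page.getPrevModifiedTime());
    }
}
```

      With both values stored on the page object, a consumer such as the indexing step described in NUTCH-1532 could read the batch id back through getBatchId() rather than recomputing it.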

        Attachments

        1. NUTCH-1533.patch
          23 kB
          lufeng
        2. NUTCH-1533v2.patch
          23 kB
          Lewis John McGibbney
        3. NUTCH-1533-v3.patch
          24 kB
          lufeng

              People

              • Assignee: amuseme.lu lufeng
              • Reporter: lewismc Lewis John McGibbney
              • Votes: 0
              • Watchers: 3
