Nutch / NUTCH-1533

Implement getPrevModifiedTime(), setPrevModifiedTime(), getBatchId() and setBatchId() accessors in o.a.n.storage.WebPage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: storage
    • Labels:
      None

      Description

      NUTCH-1532 needs to obtain a batchId to add to the NutchDocument prior to indexing. This is currently not available because we do not store that information in the WebPage. Additionally, we do not store the other modified times, yet we incorrectly set them in o.a.n.crawl.FetchSchedule#setFetchSchedule.
      All the above accessors should be implemented.
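
      To illustrate the shape of the accessors requested above, here is a minimal sketch. It is not the code from the attached patches: WebPageSketch is a hypothetical stand-in for o.a.n.storage.WebPage, the String type for batchId and the epoch-millisecond meaning of prevModifiedTime are assumptions, and persistence through the real storage layer is left out.

```java
// Hypothetical, simplified stand-in for o.a.n.storage.WebPage.
// The real class is backed by the storage layer; this sketch only
// shows the four accessors named in the issue title.
final class WebPageSketch {

    // Assumption: the previous modified time is kept as epoch milliseconds.
    private long prevModifiedTime;

    // Assumption: the batch id is a plain String; null until a batch is assigned.
    private String batchId;

    public long getPrevModifiedTime() {
        return prevModifiedTime;
    }

    public void setPrevModifiedTime(long prevModifiedTime) {
        this.prevModifiedTime = prevModifiedTime;
    }

    public String getBatchId() {
        return batchId;
    }

    public void setBatchId(String batchId) {
        this.batchId = batchId;
    }

    public static void main(String[] args) {
        WebPageSketch page = new WebPageSketch();
        page.setBatchId("batch-1");      // illustrative value only
        page.setPrevModifiedTime(1000L); // illustrative value only
        System.out.println(page.getBatchId() + " " + page.getPrevModifiedTime());
    }
}
```

      With accessors like these in place, an indexing step such as the one described in NUTCH-1532 could read the batchId from the page instead of needing it passed in separately.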

      1. NUTCH-1533.patch
        23 kB
        lufeng
      2. NUTCH-1533v2.patch
        23 kB
        Lewis John McGibbney
      3. NUTCH-1533-v3.patch
        24 kB
        lufeng


            People

            • Assignee:
              lufeng
              Reporter:
              Lewis John McGibbney
            • Votes:
              0
              Watchers:
              3
