Nutch / NUTCH-1533

Implement getPrevModifiedTime(), setPrevModifiedTime(), getBatchId() and setBatchId() accessors in o.a.n.storage.WebPage

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: storage
    • Labels: None

    Description

      NUTCH-1532 needs to obtain a batchId to add to the NutchDocument prior to indexing. This is currently not available because we do not store that information in the WebPage. Additionally, we do not store the previous modified time, yet we incorrectly set it in o.a.n.crawl.FetchSchedule#setFetchSchedule.
      All of the accessors named in the title should be implemented.
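
      As a rough illustration of the accessors requested above, here is a minimal sketch. It assumes a simplified, hypothetical class named WebPageSketch with plain fields; the real o.a.n.storage.WebPage is a generated, Gora-backed class and the attached patches may differ substantially. The recordModification helper is also hypothetical and only shows why the previous modified time has to be stored separately.

```java
// Hypothetical, simplified stand-in for o.a.n.storage.WebPage.
// The real class is generated and Gora-backed; field types here are assumptions.
class WebPageSketch {
    private long modifiedTime;      // last known modification time (ms since epoch)
    private long prevModifiedTime;  // modification time seen before the last update
    private String batchId;         // id of the batch that last processed the page

    long getModifiedTime() {
        return modifiedTime;
    }

    void setModifiedTime(long modifiedTime) {
        this.modifiedTime = modifiedTime;
    }

    long getPrevModifiedTime() {
        return prevModifiedTime;
    }

    void setPrevModifiedTime(long prevModifiedTime) {
        this.prevModifiedTime = prevModifiedTime;
    }

    String getBatchId() {
        return batchId;
    }

    void setBatchId(String batchId) {
        this.batchId = batchId;
    }

    // Hypothetical helper: keep the old timestamp before overwriting it,
    // so that both values remain available to callers such as a fetch schedule.
    void recordModification(long newModifiedTime) {
        this.prevModifiedTime = this.modifiedTime;
        this.modifiedTime = newModifiedTime;
    }
}
```

      With both values stored on the page, an indexer could read getBatchId() when building a document, and a schedule could compare getPrevModifiedTime() with getModifiedTime() instead of setting a value that is never persisted.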

      Attachments

        1. NUTCH-1533.patch
          23 kB
          lufeng
        2. NUTCH-1533v2.patch
          23 kB
          Lewis John McGibbney
        3. NUTCH-1533-v3.patch
          24 kB
          lufeng

            People

              Assignee: lufeng (amuseme.lu)
              Reporter: Lewis John McGibbney (lewismc)