  1. Nutch
  2. NUTCH-1784

modifiedTime and prevModifiedTime never set

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.2.1
    • Fix Version/s: 2.3
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

    Description

      modifiedTime is never set properly. With DefaultFetchSchedule, modifiedTime always keeps its default value of zero. With AdaptiveFetchSchedule, modifiedTime is set only once, at the beginning, by the zero check in AdaptiveFetchSchedule.
      That is not sufficient, because modifiedTime needs to be updated whenever a last-modified time is available. We corrected this with a patch.

      We also noticed that prevModifiedTime is not written to the database, and the patch corrects that too.

      With this patch, whenever lastModifiedTime is available, we do two things. First, we copy the current modifiedTime of the Page object into prevModifiedTime. Then we set modifiedTime to lastModifiedTime.
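
      The two-step update described above can be sketched roughly as follows. This is only an illustration, not the attached patch: the Page class here is a hypothetical stand-in for the Nutch page object, updateModifiedTime is an invented helper name, and treating a non-positive lastModifiedTime as "not available" is an assumption.

```java
// Illustrative sketch only; not the code from the attached patch.
public class ModifiedTimeSketch {

    // Hypothetical stand-in for the Nutch page object; only the two
    // fields discussed in this issue are modelled.
    static final class Page {
        long modifiedTime;      // defaults to 0, i.e. "never set"
        long prevModifiedTime;  // defaults to 0
    }

    // Assumption: lastModifiedTime <= 0 means no last-modified time
    // was available for this fetch, so nothing is changed.
    static void updateModifiedTime(Page page, long lastModifiedTime) {
        if (lastModifiedTime <= 0) {
            return;
        }
        // Step 1: remember the old value before it is overwritten.
        page.prevModifiedTime = page.modifiedTime;
        // Step 2: record the newly observed last-modified time.
        page.modifiedTime = lastModifiedTime;
    }

    public static void main(String[] args) {
        Page page = new Page();
        updateModifiedTime(page, 1_000L);  // first fetch
        updateModifiedTime(page, 2_000L);  // later fetch, page changed
        System.out.println(page.prevModifiedTime + " -> " + page.modifiedTime);
        // prints "1000 -> 2000"
    }
}
```

      The order of the two assignments matters: if modifiedTime were overwritten first, the previous value would be lost before it could be copied into prevModifiedTime.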

      Attachments

        1. NUTCH-1651.patch
          2 kB
          hanchi

            People

              Assignee: Unassigned
              Reporter: hanchi
              Votes: 0
              Watchers: 2