ManifoldCF / CONNECTORS-764

Hopcount logic fails to notice when the max number of hops is increased between crawls


Details

    Description

      When you do something like the following:

      (1) Set the max hops for a job relatively low
      (2) Crawl
      (3) Increase the max hops
      (4) Crawl again

      ... the documents that were left in the state "Hop count exceeded" at the end of the first crawl are never touched again. This happens because no additional links are added to the intrinsiclink table during the second crawl, so the method reactivateHopcountRemovedRecords() is never called and the documents remain in an incorrect state.
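
      The failure mode described above can be illustrated with a small toy model. This is only a sketch, not ManifoldCF code: the class HopcountSketch, its fields, and the checkLimitChange flag are hypothetical names invented for the illustration. It assumes that reactivation is triggered only when new links are recorded, which is how the description characterizes the bug, and it shows one possible remedy (also reactivating when the limit has been raised), which is not necessarily what the attached patch does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the reported behavior; all names here are hypothetical.
class HopcountSketch {
    enum State { COMPLETE, HOPCOUNT_REMOVED }

    // Document id -> hop distance from the seed (stand-in for link data).
    private final Map<String, Integer> hopDistance = new LinkedHashMap<>();
    // Document id -> state left behind by the most recent crawl.
    private final Map<String, State> state = new LinkedHashMap<>();
    // Limit used by the previous crawl; -1 means no crawl has run yet.
    private int previousMaxHops = -1;

    void addDocument(String id, int distance) {
        hopDistance.put(id, distance);
    }

    State stateOf(String id) {
        return state.get(id);
    }

    // checkLimitChange = false models the reported bug;
    // checkLimitChange = true models one possible fix.
    void crawl(int maxHops, boolean checkLimitChange) {
        int newLinks = 0;
        for (Map.Entry<String, Integer> e : hopDistance.entrySet()) {
            if (!state.containsKey(e.getKey())) {
                // Only documents not seen before count as new links.
                newLinks++;
                state.put(e.getKey(), e.getValue() <= maxHops
                        ? State.COMPLETE : State.HOPCOUNT_REMOVED);
            }
        }
        boolean limitRaised = checkLimitChange
                && previousMaxHops >= 0 && maxHops > previousMaxHops;
        // In the buggy variant, reactivation runs only when links were added.
        if (newLinks > 0 || limitRaised) {
            reactivate(maxHops);
        }
        previousMaxHops = maxHops;
    }

    // Stand-in for the role the description assigns to
    // reactivateHopcountRemovedRecords(): revisit documents that were
    // excluded for exceeding the hop limit and now fall within it.
    private void reactivate(int maxHops) {
        for (Map.Entry<String, State> e : state.entrySet()) {
            if (e.getValue() == State.HOPCOUNT_REMOVED
                    && hopDistance.get(e.getKey()) <= maxHops) {
                e.setValue(State.COMPLETE);
            }
        }
    }
}
```

      With a document three hops from the seed, crawling with a limit of 1 and then a limit of 5 leaves it in HOPCOUNT_REMOVED when checkLimitChange is false, because the second crawl records no new links. With checkLimitChange set to true, the same sequence moves it to COMPLETE.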

      Attachments

        1. CONNECTORS-764.patch
          9 kB
          Karl Wright


          People

            kwright@metacarta.com Karl Wright
