
Re-index seeded modified documents when the re-crawl interval is infinity and connector model is MODEL_ADD_CHANGE

Details

    Description

      I am trying to avoid a full scan of all documents, for better efficiency with a large number of documents. I tried many different job settings but could not accomplish that. In particular, when the repository connector model is MODEL_ADD_CHANGE, I expected modified documents that are seeded to be re-indexed immediately, like new seeds. Instead, I found that the re-crawl time is used as their scheduled time, so they wait for the full scan before being re-indexed. I avoided the full scan by setting the re-crawl interval to infinity, but my modified seeded documents were still not getting indexed. After digging into the code for quite some time, I made some modifications to the JobManager, and it worked for me. I would like to share the change with you for review, so I opened this ticket.
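
      To make the behavior described above concrete, here is a minimal sketch of the scheduling decision. It is not the JobManager code and not the attached patch: the class, the enum, the method names, and the INFINITE and NEVER sentinels are all hypothetical, chosen only to illustrate the difference between what I observed and what I expected.

```java
// Hypothetical illustration only. None of these names are ManifoldCF APIs,
// and this is not the code from the attached patches.
public class SeedScheduling {
    /** Assumed sentinel meaning "re-crawl interval is infinity". */
    static final long INFINITE = -1L;
    /** Assumed sentinel meaning "never scheduled again". */
    static final long NEVER = Long.MAX_VALUE;

    /** Simplified stand-in for the connector model. */
    enum Model { ADD, ADD_CHANGE }

    /**
     * Behavior as described in this ticket: a seed that is already known
     * is scheduled at its re-crawl time, so with an infinite interval it
     * is never picked up again. New seeds are processed right away.
     */
    static long observedCheckTime(boolean alreadyKnown, long now,
                                  long lastCrawled, long recrawlInterval) {
        if (!alreadyKnown) {
            return now;
        }
        if (recrawlInterval == INFINITE) {
            return NEVER;
        }
        return lastCrawled + recrawlInterval;
    }

    /**
     * Behavior I expected: under ADD_CHANGE, a seeded document that is
     * already known is treated as changed and scheduled immediately,
     * regardless of the re-crawl interval.
     */
    static long expectedCheckTime(Model model, boolean alreadyKnown, long now,
                                  long lastCrawled, long recrawlInterval) {
        if (alreadyKnown && model == Model.ADD_CHANGE) {
            return now;
        }
        return observedCheckTime(alreadyKnown, now, lastCrawled, recrawlInterval);
    }

    public static void main(String[] args) {
        long now = 1_000L;
        long lastCrawled = 500L;
        // A known, modified seed with an infinite re-crawl interval.
        System.out.println("observed: "
            + observedCheckTime(true, now, lastCrawled, INFINITE));
        System.out.println("expected: "
            + expectedCheckTime(Model.ADD_CHANGE, true, now, lastCrawled, INFINITE));
    }
}
```

      In this sketch the only difference between the two methods is the early return for known seeds under ADD_CHANGE; every other case falls through to the observed behavior.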

      Attachments

        1. CONNECTORS-1497.patch3
          8 kB
          Ahmed Mahfouz
        2. CONNECTORS-1497.patch2
          2 kB
          Ahmed Mahfouz
        3. CONNECTORS-1497.patch
          2 kB
          Ahmed Mahfouz

          People

            kwright@metacarta.com Karl Wright
            ahmedsafwat Ahmed Mahfouz
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: