  1. ManifoldCF
  2. CONNECTORS-290

Stuffer query will perform poorly under some conditions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3, ManifoldCF 0.4
    • Fix Version/s: ManifoldCF 0.4
    • None

    Description

      The stuffer query, which returns documents for processing in index order by docpriority, performs poorly when many queued documents have a good priority but cannot be taken because of their job's state. The query has to scan past all of those rows before it reaches any it can use. This can happen when:
      (1) a large job is aborted, leaving lots of jobqueue records with docpriority values around;
      (2) a job is paused for an extended period of time, while others are running.
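
      The scan cost described above can be illustrated with a small simulation. This is only a sketch, not ManifoldCF code: the `QueueRow` type, the `stuff` function, the job names, and the convention that a lower docpriority value means a better priority are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class QueueRow:
    doc_id: int
    job_id: str
    docpriority: float  # assumption: lower value = better priority


def stuff(rows, active_jobs, limit):
    """Walk rows in docpriority order; return (picked rows, rows examined)."""
    picked = []
    examined = 0
    for row in sorted(rows, key=lambda r: r.docpriority):
        if len(picked) >= limit:
            break
        examined += 1
        # Rows whose job is not active still cost a visit, but cannot be taken.
        if row.job_id in active_jobs:
            picked.append(row)
    return picked, examined


# Hypothetical data: 10,000 rows left behind by an aborted job, all with
# better priority than the 100 rows of the job that is actually running.
rows = [QueueRow(i, "aborted", float(i)) for i in range(10_000)]
rows += [QueueRow(10_000 + i, "running", 10_000.0 + i) for i in range(100)]

picked, examined = stuff(rows, {"running"}, limit=10)
print(len(picked), examined)  # 10 10010
```

      In this toy setup, fetching 10 usable documents requires visiting 10,010 rows, because every leftover row sorts ahead of the usable ones.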

      In the second case, when the paused job is resumed, there's an added problem because, for a while, only documents from the paused job will be processed.

      The answer to (1) may well be to clean out all docpriority values on job abort. Right now there is no logic that sets docpriority values to null, but there clearly needs to be. Otherwise the docpriority index will remain polluted, for an extended period of time, with rows that must be scanned but cannot be used.
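
      A minimal sketch of what nulling docpriority on abort could look like, using an in-memory SQLite database. Only the `jobqueue` and `docpriority` names come from this issue; the `id` and `jobid` columns, the `abort_job` and `candidates` helpers, and the partial index are assumptions for illustration and do not reflect the actual ManifoldCF schema or fix.

```python
import sqlite3


def abort_job(conn, job_id):
    """Hypothetical abort hook: null out docpriority for every row of the job."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE jobqueue SET docpriority = NULL WHERE jobid = ?", (job_id,)
        )


def candidates(conn, limit):
    """Simplified stand-in for the stuffer query: best-priority rows first."""
    return conn.execute(
        "SELECT id, jobid FROM jobqueue "
        "WHERE docpriority IS NOT NULL ORDER BY docpriority LIMIT ?",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobqueue (id INTEGER PRIMARY KEY, jobid TEXT, docpriority REAL)"
)
# Assumed design choice: a partial index, so rows with a null docpriority
# take up no space in the index at all.
conn.execute(
    "CREATE INDEX jobqueue_docpriority ON jobqueue (docpriority) "
    "WHERE docpriority IS NOT NULL"
)
seed = [("aborted", float(i)) for i in range(1000)]
seed += [("running", 1000.0 + i) for i in range(5)]
conn.executemany("INSERT INTO jobqueue (jobid, docpriority) VALUES (?, ?)", seed)
conn.commit()

before = candidates(conn, 3)  # all rows belong to the job about to be aborted
abort_job(conn, "aborted")
after = candidates(conn, 3)   # only rows from the running job remain
```

      Whether null rows are excluded by the query predicate, by a partial index, or by some other mechanism depends on the database in use; the sketch only shows the general shape of the cleanup.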

      The "correct" answer to (2) is to clear out docpriority values when a job is paused, and then redo them all when the job is resumed. Similarly, docpriority values should be set for all of a job's documents when a job is started, and should be nulled out when documents enter non-active states. The former currently occurs, but not the latter.
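
      The lifecycle proposed for (2) can be sketched as a small state-driven model. Everything here is hypothetical: the `PriorityQueueModel` class, its method names, and especially the priority source, which is a plain counter standing in for whatever priority calculation the real system uses (lower value again assumed to mean better priority).

```python
import itertools


class PriorityQueueModel:
    """Toy model: docpriority is set only while a job's documents are active."""

    def __init__(self):
        self.priority = {}  # (job_id, doc_id) -> float, or None when cleared
        self._clock = itertools.count()  # placeholder priority source

    def _next_priority(self):
        return float(next(self._clock))

    def start(self, job_id, doc_ids):
        # Job start: assign a priority to every document of the job.
        for doc_id in doc_ids:
            self.priority[(job_id, doc_id)] = self._next_priority()

    def pause(self, job_id):
        # Pause: clear priorities so the rows drop out of the ordered scan.
        for key in self.priority:
            if key[0] == job_id:
                self.priority[key] = None

    def resume(self, job_id):
        # Resume: recompute priorities instead of reusing the stale ones.
        for key in self.priority:
            if key[0] == job_id:
                self.priority[key] = self._next_priority()

    def stuffable(self):
        # Documents visible to the stuffer, best priority first.
        live = [(p, k) for k, p in self.priority.items() if p is not None]
        return [k for _, k in sorted(live)]


q = PriorityQueueModel()
q.start("A", [1, 2])
q.pause("A")
q.start("B", [1, 2])
during_pause = q.stuffable()  # only job B's documents are visible
q.resume("A")
after_resume = q.stuffable()  # job A re-enters behind B with fresh priorities
```

      In this model the resumed job does not jump ahead of the running one, because its old priorities were discarded at pause time rather than kept in the index.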

          People

            kwright@metacarta.com Karl Wright
            kwright@metacarta.com Karl Wright