ManifoldCF / CONNECTORS-999

We need a model type for handling chains with one-level versioned directories

Details

    Description

      The framework currently handles CHAINED models by requiring reference extraction on every document that has references, whether or not the document has changed. A far more common situation, however, is one in which a document's version information accurately captures its references. In that case it should be possible to perform incremental crawls without any additional reference extraction.

      Such a scan would require requeuing all existing documents, so minimal scans would have to process more documents. This could still be a worthwhile tradeoff if versioning a document is cheap compared to processing it.

      It may make sense to add a new class of models, e.g. VERSIONED_CHAINED.
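
      To make the proposed tradeoff concrete, here is a minimal standalone sketch of how such a pass might behave. It is not ManifoldCF code: the class VersionedChainedSketch, the method incrementalPass, and the version-string formats are all hypothetical, and the sketch assumes that a directory's version string fully captures its list of children.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names are ManifoldCF APIs. */
final class VersionedChainedSketch {

    /** Last version string seen per document identifier (stands in for stored crawl state). */
    private final Map<String, String> storedVersions = new HashMap<>();

    /**
     * One incremental pass. Every known document is requeued and versioned,
     * but only documents whose version string differs from the stored one
     * are processed (fetched, with references extracted).
     *
     * @param currentVersions document identifier to current version string
     * @return identifiers that needed full processing in this pass
     */
    List<String> incrementalPass(Map<String, String> currentVersions) {
        List<String> processed = new ArrayList<>();
        for (Map.Entry<String, String> entry : currentVersions.entrySet()) {
            String id = entry.getKey();
            String version = entry.getValue();
            if (version.equals(storedVersions.get(id))) {
                // Unchanged version: by assumption the references are unchanged
                // too, so reference extraction is skipped for this document.
                continue;
            }
            // New or changed document: process it and remember its version.
            processed.add(id);
            storedVersions.put(id, version);
        }
        return processed;
    }

    public static void main(String[] args) {
        VersionedChainedSketch crawler = new VersionedChainedSketch();

        // Assumption: a directory's version string is its child listing.
        Map<String, String> snapshot = new LinkedHashMap<>();
        snapshot.put("dir/", "a.txt,b.txt");
        snapshot.put("dir/a.txt", "1");
        snapshot.put("dir/b.txt", "1");
        System.out.println("initial pass processed: " + crawler.incrementalPass(snapshot));

        // Only one file changes; the directory listing stays the same.
        snapshot.put("dir/a.txt", "2");
        System.out.println("second pass processed: " + crawler.incrementalPass(snapshot));
    }
}
```

      In this sketch every document is still versioned on every pass, which is the extra cost noted above; the saving comes from skipping processing and reference extraction whenever the version string is unchanged.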

          People

            kwright@metacarta.com Karl Wright
            Votes: 0
            Watchers: 1
