NUTCH-684: Dedup support for Solr

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: indexer
    • Labels: None

    Description

      After NUTCH-442, Nutch can index to both Solr and Lucene. However, the duplicate deletion feature (based on digests) is only available for Lucene. It should also be available for Solr.
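
      To illustrate what digest-based duplicate deletion means, here is a minimal sketch in plain Java. It is not taken from the attached patches and does not use the Solr or Nutch APIs. The class DigestDedupSketch, the Doc record, and the "keep the newest document per digest" policy are all assumptions made for illustration; the actual patch may choose the surviving document differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DigestDedupSketch {

    /** Hypothetical stand-in for an indexed document (not a Nutch or Solr type). */
    public static final class Doc {
        final String id;       // unique key of the document in the index
        final String digest;   // content digest; equal digests mean equal content
        final long timestamp;  // fetch time, used here to pick the survivor

        public Doc(String id, String digest, long timestamp) {
            this.id = id;
            this.digest = digest;
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns the ids of documents to delete. For each digest, exactly one
     * document is kept (assumed policy: the one with the latest timestamp,
     * first seen wins on ties) and all others are reported as duplicates.
     */
    public static List<String> findDuplicateIds(List<Doc> docs) {
        Map<String, Doc> keptByDigest = new HashMap<>();
        List<String> toDelete = new ArrayList<>();
        for (Doc doc : docs) {
            Doc kept = keptByDigest.get(doc.digest);
            if (kept == null) {
                // First document seen with this digest: keep it for now.
                keptByDigest.put(doc.digest, doc);
            } else if (doc.timestamp > kept.timestamp) {
                // Newer copy found: the previously kept one becomes a duplicate.
                toDelete.add(kept.id);
                keptByDigest.put(doc.digest, doc);
            } else {
                // Older or same-age copy: delete the current document.
                toDelete.add(doc.id);
            }
        }
        return toDelete;
    }
}
```

      In a real Solr setup the returned ids would then be sent to the index as delete requests; that step is omitted here because its details depend on the client library in use.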

      Attachments

        1. solrdedup_v2.patch
          13 kB
          Dogacan Guney
        2. NUTCH-684_solrdedup_v2.patch
          9 kB
          Dmitry Lihachev
        3. NUTCH-684_bin_nutch.patch
          1.0 kB
          Dmitry Lihachev
        4. solrdedup.patch
          9 kB
          Dogacan Guney

          People

            Assignee: Dogacan Guney (dogacan)
            Reporter: Dogacan Guney (dogacan)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: