Dedup support for Solr

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: indexer
    • Labels: None

    Description

      Since NUTCH-442, Nutch can index to both Solr and Lucene. However, the duplicate deletion feature (based on digests) is available only for Lucene. It should also be available for Solr.
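
      To illustrate what digest-based duplicate deletion means, here is a minimal Python sketch. It is not the code from the attached patches, and it does not talk to Solr. It only shows the grouping idea: documents that share a digest are treated as duplicates, one is kept, and the rest are marked for deletion. The field names (`id`, `digest`, `boost`, `tstamp`) and the keep rule (highest boost, then newest timestamp) are assumptions made for the example.

```python
from collections import defaultdict


def find_duplicates(docs):
    """Return the ids of documents to delete, keeping one document per digest.

    Each doc is a dict with hypothetical fields: id, digest, boost, tstamp.
    """
    # Group documents by digest; equal digests are treated as duplicates.
    by_digest = defaultdict(list)
    for doc in docs:
        by_digest[doc["digest"]].append(doc)

    to_delete = []
    for group in by_digest.values():
        # Assumed keep rule: highest boost wins, ties go to the newest tstamp.
        keeper = max(group, key=lambda d: (d["boost"], d["tstamp"]))
        to_delete.extend(d["id"] for d in group if d is not keeper)
    return sorted(to_delete)


if __name__ == "__main__":
    docs = [
        {"id": "http://example.com/a", "digest": "d1", "boost": 1.0, "tstamp": 100},
        {"id": "http://example.com/a?ref=1", "digest": "d1", "boost": 0.5, "tstamp": 200},
        {"id": "http://example.com/b", "digest": "d2", "boost": 1.0, "tstamp": 100},
        {"id": "http://example.com/b-copy", "digest": "d2", "boost": 1.0, "tstamp": 300},
    ]
    print(find_duplicates(docs))
```

      In a real setup the returned ids would then be passed to the index's delete operation. How the actual patch selects the surviving document may differ from the rule assumed here.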

      Attachments

        1. NUTCH-684_bin_nutch.patch
          1.0 kB
          Dmitry Lihachev
        2. NUTCH-684_solrdedup_v2.patch
          9 kB
          Dmitry Lihachev
        3. solrdedup_v2.patch
          13 kB
          Dogacan Guney
        4. solrdedup.patch
          9 kB
          Dogacan Guney

          People

            Assignee: Dogacan Guney
            Reporter: Dogacan Guney
