NUTCH-684: Dedup support for Solr

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: indexer
    • Labels:
      None

      Description

      After NUTCH-442, Nutch can index to both Solr and Lucene. However, the duplicate deletion feature (based on digests) is only available for Lucene. It should also be available for Solr.
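
      As a rough illustration of what digest-based duplicate deletion means, the sketch below groups documents by content digest and marks all but one per digest for deletion. It is not the code from the attached patches: the `Doc` class, the `findDuplicateIds` method, and the rule of keeping the document with the newest timestamp are all assumptions made for this example, and the step that would send the delete requests to Solr is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DigestDedupSketch {

    /** Hypothetical minimal view of an indexed document (not a Nutch or Solr class). */
    public static final class Doc {
        final String id;       // unique key of the document, e.g. its URL
        final String digest;   // content digest; equal digests mean equal content
        final long timestamp;  // fetch or index time, used here to pick a survivor

        public Doc(String id, String digest, long timestamp) {
            this.id = id;
            this.digest = digest;
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns the ids of documents to delete. For each digest, the document
     * with the newest timestamp is kept (an assumed rule) and the rest are
     * reported as duplicates. On a timestamp tie, the first one seen is kept.
     */
    public static List<String> findDuplicateIds(List<Doc> docs) {
        Map<String, Doc> keptByDigest = new HashMap<>();
        List<String> toDelete = new ArrayList<>();
        for (Doc doc : docs) {
            Doc kept = keptByDigest.get(doc.digest);
            if (kept == null) {
                // First document seen with this digest.
                keptByDigest.put(doc.digest, doc);
            } else if (doc.timestamp > kept.timestamp) {
                // Newer duplicate: it replaces the previously kept document.
                toDelete.add(kept.id);
                keptByDigest.put(doc.digest, doc);
            } else {
                // Older or same-age duplicate: delete it.
                toDelete.add(doc.id);
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("http://example.com/a", "digest-1", 1L),
            new Doc("http://example.com/b", "digest-1", 2L),
            new Doc("http://example.com/c", "digest-2", 1L));
        System.out.println(findDuplicateIds(docs));
    }
}
```

      In a real deployment the returned ids would be passed to the index's delete operation. The survivor rule is a design choice: a score or boost could be compared instead of, or before, the timestamp.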

        Attachments

        1. solrdedup_v2.patch
          13 kB
          Dogacan Guney
        2. NUTCH-684_solrdedup_v2.patch
          9 kB
          Dmitry Lihachev
        3. NUTCH-684_bin_nutch.patch
          1.0 kB
          Dmitry Lihachev
        4. solrdedup.patch
          9 kB
          Dogacan Guney

            People

            • Assignee:
              dogacan Dogacan Guney
              Reporter:
              dogacan Dogacan Guney
            • Votes:
              0
              Watchers:
              2
