[NUTCH-646] New Indexing Framework for Nutch

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.0, 1.0.0
    • Component/s: indexer
    • Labels: None
    • Environment: All
    • Patch Info: Patch Available

      Description

      A new indexing framework for Nutch that provides a more generic field abstraction consistent with Lucene index semantics. It allows separate MapReduce jobs to be created for different fields, with those fields aggregated and indexed at the end. This overcomes a limitation of the current indexer, which restricts which databases can be passed into it. It also creates a new extension point for field filters, which manipulate fields during the indexing process.
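
      The flow described above (per-field jobs, aggregation by URL, then field filters before indexing) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the names `Field`, `FieldFilter`, `aggregate`, and `applyFilters` are hypothetical, and plain in-memory maps stand in for MapReduce job outputs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class FieldIndexingSketch {

    /** Hypothetical generic field: a name/value pair with Lucene-style flags. */
    static final class Field {
        final String name;
        final String value;
        final boolean stored;
        final boolean tokenized;

        Field(String name, String value, boolean stored, boolean tokenized) {
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.tokenized = tokenized;
        }
    }

    /** Hypothetical extension point: a filter may rewrite, add, or drop fields. */
    interface FieldFilter {
        List<Field> filter(String url, List<Field> fields);
    }

    /** Merges the outputs of several per-field jobs into one field list per URL. */
    static Map<String, List<Field>> aggregate(List<Map<String, List<Field>>> jobOutputs) {
        Map<String, List<Field>> merged = new LinkedHashMap<>();
        for (Map<String, List<Field>> output : jobOutputs) {
            for (Map.Entry<String, List<Field>> entry : output.entrySet()) {
                merged.computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                      .addAll(entry.getValue());
            }
        }
        return merged;
    }

    /** Runs every filter in order over the aggregated fields of one URL. */
    static List<Field> applyFilters(String url, List<Field> fields, List<FieldFilter> filters) {
        List<Field> current = fields;
        for (FieldFilter filter : filters) {
            current = filter.filter(url, current);
        }
        return current;
    }

    /** Small end-to-end demo: two "jobs", one filter, fields rendered as name=value. */
    static List<String> demo() {
        String url = "http://example.com/";
        Map<String, List<Field>> titleJob =
            Map.of(url, List.of(new Field("title", "Example", true, true)));
        Map<String, List<Field>> linkJob =
            Map.of(url, List.of(new Field("inlinks", "3", true, false)));

        // Example filter: lowercase the value of every "title" field.
        FieldFilter lowercaseTitle = (u, fields) -> {
            List<Field> out = new ArrayList<>();
            for (Field f : fields) {
                out.add(f.name.equals("title")
                    ? new Field(f.name, f.value.toLowerCase(Locale.ROOT), f.stored, f.tokenized)
                    : f);
            }
            return out;
        };

        Map<String, List<Field>> merged = aggregate(List.of(titleJob, linkJob));
        List<String> rendered = new ArrayList<>();
        for (Field f : applyFilters(url, merged.get(url), List.of(lowercaseTitle))) {
            rendered.add(f.name + "=" + f.value);
        }
        return rendered;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      In this sketch each job contributes fields independently, so a new data source only needs a new job that emits fields keyed by URL; the final indexing step would then consume the filtered field list.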

        Attachments

        1. NUTCH-646-2-20081126.patch (97 kB, Dennis Kubes)
        2. NUTCH-646-1-20080818.patch (148 kB, Dennis Kubes)
        3. arity-1.3.2.jar (46 kB, Dennis Kubes)

              People

              • Assignee: Dennis Kubes (musepwizard)
              • Reporter: Dennis Kubes (musepwizard)
              • Votes: 1
              • Watchers: 1
