Nutch / NUTCH-646

New Indexing Framework for Nutch

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.0, 1.0.0
    • Component/s: indexer
    • Labels: None
    • Environment: All

    • Patch Info: Patch Available

    Description

      A new indexing framework for Nutch that provides a more generic field abstraction consistent with Lucene index semantics. It allows multiple MapReduce jobs to be created for different fields, with those fields aggregated and indexed at the end. This overcomes a limitation of the current indexer, which restricts which databases can be passed into it. It also creates a new extension point for field filters, which manipulate fields during the indexing process.
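
      The flow described above (independent jobs emit fields per URL, the fields are aggregated, then field filters run before indexing) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FieldsSketch`, `Field`, `FieldFilter`, `aggregate`, and `applyFilters` are hypothetical names, not classes from the attached patches, and plain in-memory maps stand in for MapReduce job output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the NUTCH-646 patches.
class FieldsSketch {

    // A generic field carrying Lucene-style flags (stored, indexed, tokenized) and a boost.
    static final class Field {
        final String name;
        final String value;
        final boolean stored;
        final boolean indexed;
        final boolean tokenized;
        final float boost;

        Field(String name, String value, boolean stored,
              boolean indexed, boolean tokenized, float boost) {
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.indexed = indexed;
            this.tokenized = tokenized;
            this.boost = boost;
        }
    }

    // Hypothetical field-filter hook: may add, change, or drop the fields of one URL.
    interface FieldFilter {
        List<Field> filter(String url, List<Field> fields);
    }

    // Merge the per-URL field lists produced by independent jobs into one list per URL.
    static Map<String, List<Field>> aggregate(List<Map<String, List<Field>>> jobOutputs) {
        Map<String, List<Field>> merged = new LinkedHashMap<>();
        for (Map<String, List<Field>> output : jobOutputs) {
            for (Map.Entry<String, List<Field>> e : output.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .addAll(e.getValue());
            }
        }
        return merged;
    }

    // Run every filter, in order, over the aggregated fields of one URL.
    static List<Field> applyFilters(String url, List<Field> fields, List<FieldFilter> filters) {
        List<Field> current = fields;
        for (FieldFilter f : filters) {
            current = f.filter(url, current);
        }
        return current;
    }
}
```

      In this sketch each job contributes one map keyed by URL, so adding a new kind of field means adding a new job rather than changing the indexer's inputs. The filtered list for each URL is what would then be written to the index.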

      Attachments

        1. NUTCH-646-1-20080818.patch
          148 kB
          Dennis Kubes
        2. arity-1.3.2.jar
          46 kB
          Dennis Kubes
        3. NUTCH-646-2-20081126.patch
          97 kB
          Dennis Kubes

            People

              Assignee: Dennis Kubes (musepwizard)
              Reporter: Dennis Kubes (musepwizard)
              Votes: 1
              Watchers: 1
