Solr / SOLR-2129

Provide a Solr module for dynamic metadata extraction/indexing with Apache UIMA

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: None
    • Labels: None

    Description

      Provide components that make Apache UIMA's automatic metadata extraction available when indexing documents.
      The goal is to take the unstructured information inside a document and turn it into structured metadata (as fields) that enriches each document.

      Basically, this can be done with a custom UpdateRequestProcessor that triggers UIMA while documents are being indexed.
      The basic UIMA implementation of UpdateRequestProcessor extracts sentences (with a tokenizer and a hidden Markov model tagger), named entities, language, suggested category, keywords and concepts (exploiting external services from OpenCalais and AlchemyAPI). Such an implementation can be easily extended by adding or selecting different UIMA analysis engines, either taken from UIMA repositories on the web or created from scratch.
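
The enrichment step described above can be illustrated with a minimal, self-contained sketch. It does not use the Solr or UIMA APIs: `Annotator`, `DocumentProcessor`, `EnrichingProcessor`, the `text` and `sentence` field names, and the naive regex sentence splitter are all hypothetical stand-ins, assumed here only to show the shape of a processor that runs an analysis step on one field and writes the results back as new fields before handing the document to the next stage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EnrichingProcessorSketch {

    /** Hypothetical stand-in for an analysis engine: text in, field name -> values out. */
    interface Annotator {
        Map<String, List<String>> annotate(String text);
    }

    /** Hypothetical stand-in for one stage of an update processor chain. */
    interface DocumentProcessor {
        void processAdd(Map<String, Object> doc);
    }

    /** Reads one source field, runs the annotator, and adds the results as new fields. */
    static final class EnrichingProcessor implements DocumentProcessor {
        private final String sourceField;
        private final Annotator annotator;
        private final DocumentProcessor next; // may be null at the end of the chain

        EnrichingProcessor(String sourceField, Annotator annotator, DocumentProcessor next) {
            this.sourceField = sourceField;
            this.annotator = annotator;
            this.next = next;
        }

        @Override
        public void processAdd(Map<String, Object> doc) {
            Object value = doc.get(sourceField);
            if (value != null) {
                annotator.annotate(value.toString()).forEach((field, values) -> {
                    if (!values.isEmpty()) {
                        doc.put(field, new ArrayList<>(values));
                    }
                });
            }
            if (next != null) {
                next.processAdd(doc); // pass the enriched document down the chain
            }
        }
    }

    /** Toy annotator (an assumption, not a real tagger): splits after '.', '!' or '?'. */
    static Map<String, List<String>> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : text.split("(?<=[.!?])\\s+")) {
            String sentence = part.trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        Map<String, List<String>> annotations = new LinkedHashMap<>();
        annotations.put("sentence", sentences);
        return annotations;
    }

    /** Builds a one-field document and runs it through a single enriching stage. */
    static Map<String, Object> enrich(String text) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("text", text);
        new EnrichingProcessor("text", EnrichingProcessorSketch::splitSentences, null).processAdd(doc);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(enrich("Solr indexes documents. UIMA extracts metadata."));
    }
}
```

In this sketch, swapping in a different analysis step only means passing a different `Annotator`, which mirrors the extensibility point mentioned in the description; the real module's classes and configuration are documented on the wiki page linked below.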

      More information can be found on the dedicated wiki page: http://wiki.apache.org/solr/SolrUIMA

      Attachments

        1. SOLR-2129.patch
          209 kB
          Tommaso Teofili
        2. SOLR-2129-asf-headers.patch
          225 kB
          Tommaso Teofili
        3. lib-jars.zip
          6.80 MB
          Tommaso Teofili
        4. SOLR-2129-version2.patch
          212 kB
          Tommaso Teofili
        5. SOLR-2129-version3.patch
          208 kB
          Tommaso Teofili
        6. SOLR-2129.patch
          200 kB
          Robert Muir
        7. SOLR-2129-version-5.patch
          211 kB
          Tommaso Teofili
        8. SOLR-2129-version-6.patch
          211 kB
          Tommaso Teofili

          People

            Assignee: Robert Muir (rcmuir)
            Reporter: Tommaso Teofili (teofili)
            Votes: 6
            Watchers: 8