Solr / SOLR-1535

Pre-analyzed field type

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 4.0-ALPHA

    Description

      PreAnalyzedFieldType provides a way to index (and optionally store) content that has already been processed and split into tokens by an external processing chain. The implementation defines a serialization format for sending tokens together with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is deserialized into a regular TokenStream, which is returned from Field.tokenStreamValue() and so added to the index as index terms. An optional stored part is returned from Field.stringValue() and added as the stored value of the field.

      This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems.
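
      As a rough illustration of what an external pipeline might send, the sketch below tokenizes text outside Solr and serializes the tokens, with their attributes and an optional stored value, into a single JSON string. The key names ("v", "str", "tokens", "t", "s", "e", "i", "y", "p") are an assumption modeled on the JSON layout used by later Solr releases for pre-analyzed fields; the format defined by the patches on this issue may differ. The helper names whitespace_tokens and serialize_preanalyzed are hypothetical.

```python
import json
import re


def whitespace_tokens(text):
    """Stand-in for an external analysis chain: lowercase whitespace tokens
    with character offsets and a position increment of 1."""
    tokens = []
    for match in re.finditer(r"\S+", text):
        tokens.append({
            "term": match.group().lower(),
            "start": match.start(),
            "end": match.end(),
            "pos_incr": 1,
        })
    return tokens


def serialize_preanalyzed(stored, tokens):
    """Serialize an optional stored value plus a token list to JSON.

    Key names are assumed, not taken from this issue:
    t=term, s=start offset, e=end offset, i=position increment,
    y=type, p=payload.
    """
    doc = {"v": "1"}
    if stored is not None:
        doc["str"] = stored  # the part that would become the stored value
    mapping = (("start", "s"), ("end", "e"), ("pos_incr", "i"),
               ("type", "y"), ("payload", "p"))
    entries = []
    for tok in tokens:
        entry = {"t": tok["term"]}
        for src, dst in mapping:
            if src in tok:  # only emit attributes the pipeline supplied
                entry[dst] = tok[src]
        entries.append(entry)
    doc["tokens"] = entries  # the part that would become index terms
    return json.dumps(doc)


text = "Hello World"
print(serialize_preanalyzed(text, whitespace_tokens(text)))
```

      The resulting string would be sent as the field value in an ordinary update request. Passing None as the stored value omits the stored part, which corresponds to the index-only case described above.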

      Attachments

        1. SOLR-1535.patch
          27 kB
          Andrzej Bialecki
        2. SOLR-1535.patch
          28 kB
          Andrzej Bialecki
        3. SOLR-1535.patch
          45 kB
          Andrzej Bialecki
        4. preanalyzed.patch
          25 kB
          Andrzej Bialecki
        5. preanalyzed.patch
          25 kB
          Andrzej Bialecki

            People

              Assignee: Andrzej Bialecki
              Reporter: Andrzej Bialecki
              Votes: 9
              Watchers: 9