Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels: None

      Description

      PreAnalyzedFieldType provides functionality to index (and optionally store) content that has already been processed and split into tokens by an external processing chain. The implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is deserialized into a regular TokenStream, which is returned by Field.tokenStreamValue() and added to the index as index terms. An optional stored part is returned by Field.stringValue() and added as the stored value of the field.

      This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems.
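
      To make the idea concrete, here is a minimal sketch of how a pre-analyzed value might be split into a stored part and a list of tokens with attributes. The wire format used here (an optional =stored text= prefix followed by whitespace-separated tokens with ,key=value attributes) is an assumption made for illustration only and is not necessarily the format defined by the patch. The class and method names (PreAnalyzedSketch, Token, Parsed, parse) are hypothetical, and the sketch uses plain Java collections rather than Lucene's TokenStream and Attribute classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy parser for a HYPOTHETICAL pre-analyzed format.
// It does not use Lucene/Solr classes and is not the patch's implementation.
final class PreAnalyzedSketch {

    /** One pre-analyzed token: its term text plus any attributes sent with it. */
    static final class Token {
        final String term;
        final Map<String, String> attributes;

        Token(String term, Map<String, String> attributes) {
            this.term = term;
            this.attributes = attributes;
        }
    }

    /** Result of parsing: an optional stored part (null if absent) and the tokens. */
    static final class Parsed {
        final String stored;
        final List<Token> tokens;

        Parsed(String stored, List<Token> tokens) {
            this.stored = stored;
            this.tokens = tokens;
        }
    }

    /**
     * Parses input of the assumed form:
     *   [=stored text=] term[,key=value]* term[,key=value]* ...
     * No escaping is supported in this sketch.
     */
    static Parsed parse(String input) {
        String rest = input.trim();
        String stored = null;

        // Optional stored part, delimited by '=' on both sides.
        if (rest.startsWith("=")) {
            int end = rest.indexOf('=', 1);
            if (end < 0) {
                throw new IllegalArgumentException("unterminated stored part");
            }
            stored = rest.substring(1, end);
            rest = rest.substring(end + 1).trim();
        }

        // Remaining text: whitespace-separated tokens, attributes after commas.
        List<Token> tokens = new ArrayList<>();
        if (!rest.isEmpty()) {
            for (String chunk : rest.split("\\s+")) {
                String[] parts = chunk.split(",");
                Map<String, String> attrs = new LinkedHashMap<>();
                for (int i = 1; i < parts.length; i++) {
                    int eq = parts[i].indexOf('=');
                    if (eq <= 0) {
                        throw new IllegalArgumentException("bad attribute: " + parts[i]);
                    }
                    attrs.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
                }
                tokens.add(new Token(parts[0], attrs));
            }
        }
        return new Parsed(stored, tokens);
    }

    public static void main(String[] args) {
        Parsed p = parse("=Hello World= hello,posIncr=1,type=word world,posIncr=1");
        System.out.println("stored: " + p.stored);
        for (Token t : p.tokens) {
            System.out.println(t.term + " " + t.attributes);
        }
    }
}
```

      In a real field type, the parsed tokens would be replayed through a TokenStream that sets the corresponding Attributes on each increment, and the stored part would become the field's string value.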

        Attachments

        1. preanalyzed.patch
          25 kB
          Andrzej Bialecki
        2. preanalyzed.patch
          25 kB
          Andrzej Bialecki
        3. SOLR-1535.patch
          45 kB
          Andrzej Bialecki
        4. SOLR-1535.patch
          28 kB
          Andrzej Bialecki
        5. SOLR-1535.patch
          27 kB
          Andrzej Bialecki

              People

              • Assignee: Andrzej Bialecki
              • Reporter: Andrzej Bialecki
              • Votes: 9
              • Watchers: 10
