SOLR-1690: JSONKeyValueTokenizerFactory -- JSON Tokenizer

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      Sometimes it is useful to group structured data into a single field.

      This (rough) patch takes JSON input and indexes one token per key/value pair in the JSON.

      schema.xml
      <!-- JSON Field Type -->
          <fieldtype name="json" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
            <analyzer type="index">
              <tokenizer class="solr.JSONKeyValueTokenizerFactory" keepArray="true" hierarchicalKey="false"/>
              <filter class="solr.TrimFilterFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
            <analyzer type="query">
              <tokenizer class="solr.KeywordTokenizerFactory"/>
              <filter class="solr.TrimFilterFactory" />
              <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
          </fieldtype>
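
      The query analyzer above keeps the whole query string as a single token, then trims and lowercases it, so a query has to be written in the same key:value form as the indexed tokens. A minimal Python sketch of that behavior follows. The function names are hypothetical, and Python's strip() and lower() only approximate TrimFilterFactory and LowerCaseFilterFactory.

```python
def analyze_query(text):
    """Approximate the query chain: keyword tokenizer, then trim, then lowercase."""
    # KeywordTokenizerFactory emits the entire input as one token,
    # so there is nothing to split; only the two filters apply.
    return text.strip().lower()


def analyze_indexed_token(token):
    """Approximate the two filters applied after the JSON tokenizer at index time."""
    return token.strip().lower()


# Assumed output of the JSON tokenizer for { "hello": "world", "rank":5 }.
indexed_terms = {analyze_indexed_token(t) for t in ["hello:world", "rank:5"]}

print(analyze_query("  Hello:World ") in indexed_terms)  # True
print(analyze_query("hello") in indexed_terms)           # False
```

      Under these assumptions a bare key such as hello does not match, because the indexed term is the full hello:world pair.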
      

      Given text:

       { "hello": "world", "rank":5 }
      

      This is indexed as two tokens:

      term position    | 1           | 2
      term text        | hello:world | rank:5
      term type        | word        | word
      source start,end | 12,17       | 27,28
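
      The token layout above can be reproduced with a short Python sketch. This is not the patch itself (which is Java and uses noggit). It is an illustration that assumes a flat JSON object with string or bare scalar values. The names PAIR and key_value_tokens are hypothetical, escape sequences are left as written, and the keepArray and hierarchicalKey options are not modeled.

```python
import json
import re

# One `"key": value` pair in a flat object; the value is either a quoted
# string or a bare scalar (number, true, false, null).
PAIR = re.compile(
    r'"(?P<key>(?:[^"\\]|\\.)*)"\s*:\s*'
    r'(?:"(?P<str>(?:[^"\\]|\\.)*)"|(?P<bare>[^\s,{}\[\]"]+))'
)


def key_value_tokens(text):
    """Return (position, term, start, end) tuples, one per key/value pair."""
    json.loads(text)  # fail fast if the input is not valid JSON
    tokens = []
    for position, match in enumerate(PAIR.finditer(text), start=1):
        group = "str" if match.group("str") is not None else "bare"
        # Offsets cover the value only, matching the start,end row above.
        start, end = match.span(group)
        term = f'{match.group("key")}:{match.group(group)}'
        tokens.append((position, term, start, end))
    return tokens


tokens = key_value_tokens('{ "hello": "world", "rank":5 }')
for token in tokens:
    print(token)
# (1, 'hello:world', 12, 17)
# (2, 'rank:5', 27, 28)
```

      The offsets point at the value rather than the whole pair, which is what the 12,17 and 27,28 entries correspond to in the sample text.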

        Attachments

        1. noggit-1.0-A1.jar
          21 kB
          Ryan McKinley
        2. SOLR-1690-JSONKeyValueTokenizerFactory.patch
          7 kB
          Ryan McKinley

            People

            • Assignee: Unassigned
            • Reporter: Ryan McKinley (ryantxu)
            • Votes: 2
            • Watchers: 2
