Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      An Analyzer that produces a TokenStream from XML input containing a marshalled TokenStream. The patch also contains a static TokenStream-to-XML marshaller.

      I more or less pulled this out of my pocket without testing it in a real environment, in order to get some comments on the solution before I add it to my project. So consider it a beta patch.

      It uses the JSR 173 XMLStream API (StAX), which is included in Java 1.6. The API is compatible with Java 1.5, for which it can be downloaded from https://sjsxp.dev.java.net/
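      For readers unfamiliar with the XMLStream API, the writing side (the marshaller) could be sketched roughly as follows. This is not the code from the attached patch: the class name TokenXmlWriterSketch, the marshal signature and the plain parameters standing in for a Lucene Token are assumptions made for illustration. Only the element names and their order are taken from the XSD in this issue.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch, not the patch: marshals one token with javax.xml.stream.
public class TokenXmlWriterSketch {

    /** Writes <name>text</name>; writeCharacters escapes markup characters in text. */
    private static void leaf(XMLStreamWriter w, String name, String text)
            throws XMLStreamException {
        w.writeStartElement(name);
        w.writeCharacters(text);
        w.writeEndElement();
    }

    /** Marshals a single token, emitting child elements in the order the XSD prescribes. */
    public static String marshal(int positionIncrement, String term, String type,
                                 int startOffset, int endOffset, int flags,
                                 String payloadHex) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("tokens");
            w.writeStartElement("token");
            leaf(w, "positionIncrement", Integer.toString(positionIncrement));
            leaf(w, "term", term);
            leaf(w, "type", type);
            leaf(w, "startOffset", Integer.toString(startOffset));
            leaf(w, "endOffset", Integer.toString(endOffset));
            leaf(w, "flags", Integer.toString(flags));
            w.writeStartElement("payload");
            leaf(w, "hex", payloadHex); // only the <hex> payload variant is written
            w.writeEndElement(); // </payload>
            w.writeEndElement(); // </token>
            w.writeEndElement(); // </tokens>
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write token XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(marshal(1, "term", "type", 0, 3, 65535, "fffefd"));
    }
}
```

      The sketch writes no XML declaration and no indentation, so the output is a single line.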

      XSD:

      <?xml version="1.0" encoding="UTF-8"?>
      <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
                 xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="tokens" type="tokensType"/>
          <xs:complexType name="tokensType">
              <xs:sequence>
                  <xs:element type="tokenType" name="token" maxOccurs="unbounded"/>
              </xs:sequence>
          </xs:complexType>
          <xs:complexType name="tokenType">
              <xs:sequence>
                  <xs:element type="xs:int" name="positionIncrement" maxOccurs="1"/>
                  <xs:element type="xs:string" name="term" minOccurs="1" maxOccurs="1"/>
                  <xs:element type="xs:string" name="type" maxOccurs="1"/>
                  <xs:element type="xs:int" name="startOffset" maxOccurs="1"/>
                  <xs:element type="xs:int" name="endOffset" maxOccurs="1"/>
                  <xs:element type="xs:int" name="flags" maxOccurs="1"/>
                  <xs:element type="payloadType" name="payload" maxOccurs="1"/>
              </xs:sequence>
          </xs:complexType>
          <xs:complexType name="payloadType">
              <xs:choice maxOccurs="1" minOccurs="1">
                  <xs:element type="bytesType" name="bytes"/>
                  <xs:element type="xs:string" name="hex"/>
                  <xs:element type="xs:string" name="base64"/>
              </xs:choice>
          </xs:complexType>
          <xs:complexType name="bytesType">
              <xs:sequence>
                  <xs:element type="xs:byte" name="byte" maxOccurs="unbounded" minOccurs="1"/>
              </xs:sequence>
          </xs:complexType>
      </xs:schema>
      

      The XSD defines several ways to encode a Payload (<bytes>, <hex> and <base64>), but only <hex> is currently supported.
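      To illustrate what the <hex> variant carries, here is a minimal sketch of turning the hex text into payload bytes. The class and method names are hypothetical, and the patch may decode the value differently. The sketch assumes two hex digits per byte with no separators.

```java
import java.util.Arrays;

// Hypothetical helper, not the patch: decodes the text of a <hex> element.
public class HexPayloadSketch {

    /** Converts a string such as "fffefd" into the bytes {0xff, 0xfe, 0xfd}. */
    public static byte[] decodeHex(String hex) {
        String s = hex.trim();
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits: " + s.length());
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo); // Java bytes are signed, so 0xff becomes -1
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decodeHex("fffefd"))); // prints [-1, -2, -3]
    }
}
```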

      Example XML:

      <tokens>
        <token>
          <positionIncrement>1</positionIncrement>
          <term>term</term>
          <type>type</type>
          <startOffset>0</startOffset>
          <endOffset>3</endOffset>
          <flags>65535</flags>
          <payload><hex>fffefd</hex></payload>
        </token>
      </tokens>
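
      A document like the example above could be read back with an XMLStreamReader along these lines. This is an illustrative sketch under assumptions, not the Analyzer from the attached patch: ParsedToken is a hypothetical holder standing in for a Lucene Token, the payload is kept as its raw hex string, and the <bytes> and <base64> variants are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not the patch: pulls tokens out of the XML with StAX.
public class TokenXmlReaderSketch {

    /** Hypothetical holder; a real Analyzer would populate a Lucene Token instead. */
    public static class ParsedToken {
        public int positionIncrement = 1;
        public String term;
        public String type;
        public int startOffset;
        public int endOffset;
        public int flags;
        public String payloadHex;
    }

    public static List<ParsedToken> parse(String xml) {
        List<ParsedToken> tokens = new ArrayList<ParsedToken>();
        try {
            XMLStreamReader r =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            ParsedToken current = null;
            while (r.hasNext()) {
                if (r.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // skip whitespace, end tags and other events
                }
                String name = r.getLocalName();
                if (name.equals("token")) {
                    current = new ParsedToken();
                    tokens.add(current);
                } else if (current == null) {
                    continue; // the <tokens> root element
                } else if (name.equals("positionIncrement")) {
                    current.positionIncrement = Integer.parseInt(r.getElementText().trim());
                } else if (name.equals("term")) {
                    current.term = r.getElementText();
                } else if (name.equals("type")) {
                    current.type = r.getElementText();
                } else if (name.equals("startOffset")) {
                    current.startOffset = Integer.parseInt(r.getElementText().trim());
                } else if (name.equals("endOffset")) {
                    current.endOffset = Integer.parseInt(r.getElementText().trim());
                } else if (name.equals("flags")) {
                    current.flags = Integer.parseInt(r.getElementText().trim());
                } else if (name.equals("hex")) {
                    current.payloadHex = r.getElementText().trim();
                }
                // <payload>, <bytes> and <base64> start tags fall through unhandled
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed token XML", e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String xml = "<tokens><token>"
            + "<positionIncrement>1</positionIncrement><term>term</term><type>type</type>"
            + "<startOffset>0</startOffset><endOffset>3</endOffset><flags>65535</flags>"
            + "<payload><hex>fffefd</hex></payload>"
            + "</token></tokens>";
        for (ParsedToken t : parse(xml)) {
            System.out.println(t.term + " " + t.startOffset + "-" + t.endOffset
                + " payload=" + t.payloadHex); // prints: term 0-3 payload=fffefd
        }
    }
}
```

      A pull parser fits here because tokens can be handed to the consumer one at a time, without building the whole document in memory first. The sketch collects them in a list only to keep the example short.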
      
      1. SOLR-1020.txt
        17 kB
        Karl Wettin

        Activity

        Karl Wettin created issue
        Karl Wettin made changes:
          Field        Original Value    New Value
          Attachment                     SOLR-1020.txt [ 12400234 ]
        Erick Erickson made changes:
          Field        Original Value    New Value
          Status       Open [ 1 ]        Resolved [ 5 ]
          Resolution                     Won't Fix [ 2 ]
        Erick Erickson made changes:
          Field        Original Value    New Value
          Status       Resolved [ 5 ]    Closed [ 6 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Karl Wettin
          • Votes:
            0
            Watchers:
            1
