Details
New Feature
Status: Closed
Minor
Resolution: Won't Fix
1.3
None
None
Description
An Analyzer that produces a TokenStream from XML input containing a marshalled TokenStream. It also contains a static TokenStream XML marshaller.
I kind of pulled this out of my pocket without testing it in a real environment, in order to get some comments on the solution before I add it to my project. So consider it a beta-patch.
It uses the JSR 173 XML streaming API (StAX), which is included in Java 1.6 and is available for Java 1.5 as a download from https://sjsxp.dev.java.net/
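As a rough illustration of the marshalling side, here is a minimal sketch that writes a single token with the StAX `XMLStreamWriter`. It is not the code from the patch: the class `TokenXmlWriterSketch` and its `marshal` method are hypothetical names, the token is passed as plain values rather than a Lucene `Token`, and the element names and their order follow the schema given in this description.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical helper, not part of the patch: writes one token as XML with StAX. */
public class TokenXmlWriterSketch {

    /** Writes <name>text</name>; writeCharacters escapes markup characters in text. */
    private static void element(XMLStreamWriter out, String name, String text)
            throws XMLStreamException {
        out.writeStartElement(name);
        out.writeCharacters(text);
        out.writeEndElement();
    }

    /** Marshals a single token into a <tokens> document; child order follows the XSD sequence. */
    public static String marshal(int positionIncrement, String term, String type,
                                 int startOffset, int endOffset, int flags,
                                 String payloadHex) {
        StringWriter buffer = new StringWriter();
        try {
            XMLStreamWriter out =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            out.writeStartElement("tokens");
            out.writeStartElement("token");
            element(out, "positionIncrement", Integer.toString(positionIncrement));
            element(out, "term", term);
            element(out, "type", type);
            element(out, "startOffset", Integer.toString(startOffset));
            element(out, "endOffset", Integer.toString(endOffset));
            element(out, "flags", Integer.toString(flags));
            out.writeStartElement("payload");
            element(out, "hex", payloadHex); // only the <hex> variant is written here
            out.writeEndElement(); // payload
            out.writeEndElement(); // token
            out.writeEndElement(); // tokens
            out.flush();
            out.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write token XML", e);
        }
        return buffer.toString();
    }
}
```

Calling `TokenXmlWriterSketch.marshal(1, "term", "type", 0, 3, 65535, "fffefd")` returns the document as one unindented string, since the stream writer adds no whitespace of its own.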
XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="tokens" type="tokensType"/>
  <xs:complexType name="tokensType">
    <xs:sequence>
      <xs:element type="tokenType" name="token"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="tokenType">
    <xs:sequence>
      <xs:element type="xs:int" name="positionIncrement" maxOccurs="1"/>
      <xs:element type="xs:string" name="term" minOccurs="1" maxOccurs="1"/>
      <xs:element type="xs:string" name="type" maxOccurs="1"/>
      <xs:element type="xs:int" name="startOffset" maxOccurs="1"/>
      <xs:element type="xs:int" name="endOffset" maxOccurs="1"/>
      <xs:element type="xs:int" name="flags" maxOccurs="1"/>
      <xs:element type="payloadType" name="payload" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="payloadType">
    <xs:choice maxOccurs="1" minOccurs="1">
      <xs:element type="bytesType" name="bytes"/>
      <xs:element type="xs:string" name="hex"/>
      <xs:element type="xs:string" name="base64"/>
    </xs:choice>
  </xs:complexType>
  <xs:complexType name="bytesType">
    <xs:sequence>
      <xs:element type="xs:byte" name="byte" maxOccurs="unbounded" minOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
Although the XSD defines several ways to encode a Payload, only <hex> is currently supported.
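For readers unfamiliar with the <hex> form, the following sketch shows one way such a payload string could be turned into bytes and back. It is an assumption about the encoding (two hex digits per byte, no separators), not code from the patch, and `HexPayloadSketch` is a hypothetical name.

```java
/** Hypothetical helper, not taken from the patch: converts <hex> payload text to and from bytes. */
public class HexPayloadSketch {

    /** Converts a hex string such as "fffefd" into the bytes it encodes. */
    public static byte[] decode(String hex) {
        String trimmed = hex.trim();
        if (trimmed.length() % 2 != 0) {
            throw new IllegalArgumentException(
                    "hex payload needs an even number of digits: " + hex);
        }
        byte[] bytes = new byte[trimmed.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int high = Character.digit(trimmed.charAt(2 * i), 16);
            int low = Character.digit(trimmed.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("not a hex digit in payload: " + hex);
            }
            bytes[i] = (byte) ((high << 4) | low);
        }
        return bytes;
    }

    /** Inverse of decode: renders bytes as lowercase hex, two digits per byte. */
    public static String encode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

Under this assumption, the payload text "fffefd" corresponds to the three bytes 0xff, 0xfe and 0xfd.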
Example XML:
<tokens>
  <token>
    <positionIncrement>1</positionIncrement>
    <term>term</term>
    <type>type</type>
    <startOffset>0</startOffset>
    <endOffset>3</endOffset>
    <flags>65535</flags>
    <payload><hex>fffefd</hex></payload>
  </token>
</tokens>
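To show how a document like the example above might be consumed, here is a minimal sketch that walks it with the StAX `XMLStreamReader`. It is not the Analyzer from the patch: `TokenXmlReaderSketch` and `ParsedToken` are hypothetical names, `ParsedToken` is a stand-in for Lucene's `Token`, and only the term, offsets and position increment are read, so the type, flags and payload elements are skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical reader, not part of the patch: pulls tokens out of a <tokens> document. */
public class TokenXmlReaderSketch {

    /** Plain holder for the fields read here; a stand-in for Lucene's Token. */
    public static class ParsedToken {
        public int positionIncrement = 1;
        public String term;
        public int startOffset;
        public int endOffset;
    }

    /** Parses the XML and returns one ParsedToken per <token> element, in document order. */
    public static List<ParsedToken> parse(String xml) {
        List<ParsedToken> tokens = new ArrayList<ParsedToken>();
        try {
            XMLStreamReader in =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            ParsedToken current = null;
            while (in.hasNext()) {
                int event = in.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = in.getLocalName();
                    if ("token".equals(name)) {
                        current = new ParsedToken();
                    } else if (current != null) {
                        // getElementText is only called on text-only elements
                        if ("positionIncrement".equals(name)) {
                            current.positionIncrement = Integer.parseInt(in.getElementText().trim());
                        } else if ("term".equals(name)) {
                            current.term = in.getElementText();
                        } else if ("startOffset".equals(name)) {
                            current.startOffset = Integer.parseInt(in.getElementText().trim());
                        } else if ("endOffset".equals(name)) {
                            current.endOffset = Integer.parseInt(in.getElementText().trim());
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "token".equals(in.getLocalName()) && current != null) {
                    tokens.add(current); // token is complete once its end tag is seen
                    current = null;
                }
            }
            in.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse token XML", e);
        }
        return tokens;
    }
}
```

A pull parser fits here because tokens can be handed on one at a time as their end tags are reached, which matches how a TokenStream is consumed; this sketch collects them into a list only to keep the example short.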