Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3
    • Fix Version/s: 5.0
    • Component/s: query parsers
    • Labels:
      None

      Description

      Lucene contrib includes a query parser that can create the full spectrum of Lucene queries from an XML data structure.

      This patch adds "xml" query parser support to Solr.

      1. lucene-xml-query-parser-2.4-dev.jar
        44 kB
        Erik Hatcher
      2. SOLR-839.patch
        3 kB
        Erik Hatcher
      3. SOLR-839-object-parser.patch
        13 kB
        Erik Hatcher

        Activity

        Ramkumar Aiyengar added a comment -

        We have since improved on this, adding support for the Solr schema and analysis chain as well as a few new builder classes. Ideally we would like to move to the object parser as well, but this was an interim solution. We haven't yet got it into a shape fit to contribute, but should be able to soon.

        Karl Wettin added a comment -

        Personally, I haven't used this in anything new since my 2009 patch and comments. Actually, I haven't used Solr at all since then.
        If my vote counts for anything, it would be for the JSON parser.

        Jan Høydahl added a comment -

        Any plans on continuing on this or should effort go into the JSON parser in SOLR-4351 instead?

        Erik Hatcher added a comment -

        With the latest patch, these queries work (borrowed from SOLR-4351's tests):

          <term f="id">11</term>
        
          <field f="text">Now Cow</field>
        
          <prefix f="text">brow</prefix>
        
          <frange l="20" u="24">mul(foo_i,2)</frange>
        
          <join from="qqq_s" to="www_s">id:10</join>
        
          <join from="qqq_s" to="www_s"><term f="id">10</term></join>
        
          <lucene>text:Cow -id:1</lucene>
        

        The "object parser" path worked easily, but it's not as powerful as it needs to be. There needs to be a way to build BooleanQuery instances (without having to use the lucene query parser) and then, like the XMLQueryParser, to work with span queries and so on.

        Maybe it's not worthwhile to have both JSON and XML query parsing, as both should probably use the same infrastructure. But I would hate to see a JSON form of XSLT used here. Ideally, the query "tree" would be defined server-side and lean/clean parameters would be passed in to fill in the blanks, possibly with some logic based on the parameter values (for example, &in_stock=true, if specified, would add a filter for inStock:true).
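The "server-side tree plus lean parameters" idea can be pictured with the rough sketch below. It is plain Java with no Solr dependency. The class FilterRules and its rule table are hypothetical, invented here for illustration; only the in_stock=true to inStock:true pairing comes from the comment itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: maps request parameters to the filter
// queries a server-side query definition would add when the parameter is present.
class FilterRules {
    // key: "param=value" as sent by the client; value: filter query to add.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("in_stock=true", "inStock:true"); // the example from the comment
    }

    // Returns the filters triggered by the given request parameters, in rule order.
    static List<String> filtersFor(Map<String, String> params) {
        List<String> filters = new ArrayList<>();
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            String[] kv = rule.getKey().split("=", 2);
            if (kv[1].equals(params.get(kv[0]))) {
                filters.add(rule.getValue());
            }
        }
        return filters;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "ipod");
        params.put("in_stock", "true");
        System.out.println(filtersFor(params)); // prints [inStock:true]
    }
}
```

A request without in_stock, or with in_stock=false, yields no extra filter, so the client only ever sends flat parameters and the query structure stays on the server.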

        The XMLQParser in the last patch has xsl capability as well, so that you could define id.xsl to be:

        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/Document">
            <term f="id"><xsl:value-of select="id"/></term>
          </xsl:template>
        </xsl:stylesheet>
        

        Then, using &defType=xml&xsl=id&id=SOLR1000, a term query will be generated. (This is too simple an example, as there are leaner, cleaner ways to do this exact one.)
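To see what the id.xsl stylesheet produces, the sketch below applies it with the JDK's built-in XSLT support. It is not the patch's code. In particular, it assumes that request parameters reach the stylesheet as child elements of a Document root, which is inferred from the match="/Document" and select="id" expressions; the class name XslQueryDemo is made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslQueryDemo {
    // The id.xsl stylesheet from the comment, written on one line.
    static final String ID_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/Document\">"
      + "<term f=\"id\"><xsl:value-of select=\"id\"/></term>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the input XML and returns the generated query XML.
    static String transform(String xsl, String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input shape: &id=SOLR1000 exposed as <Document><id>SOLR1000</id></Document>.
        System.out.println(transform(ID_XSL, "<Document><id>SOLR1000</id></Document>"));
    }
}
```

The result is a term element on field id carrying the value SOLR1000, which is the same shape as the first query in the list above.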

        Erik Hatcher added a comment -

        This patch depends on the "object parsing" approach in SOLR-4351.

        This is a different approach from using Lucene's XML Query Parser. The XMLQueryParser is neat and all, but the builders aren't going to work well with Solr's schema.

        I tinkered with a SolrQueryBuilder, and that mostly works, but nested XML queries weren't working, so I revamped using the object parser.

        Daniel Collins added a comment -

        Another issue we are considering: should things like BoostingQuery really be in the extensions? Why are they not part of core?

        Additionally, we've noticed that CoreParser is missing some queries:

        1) PhraseQuery
        2) PayloadTermQuery (it has it under the "old" name of BoostingTermQuery, should there be an alias?)
        3) FunctionQuery (not sure if this is even possible, presumably would require a lot of configuration about the function to call)

        Might look at some of those if I get bored of relatives over Xmas

        Daniel Collins added a comment - - edited

        We have built a version of this against Solr 4.0. It is still WIP, but this is what we have:

        import org.apache.solr.common.params.CommonParams;
        import org.apache.solr.common.params.SolrParams;
        import org.apache.solr.common.util.NamedList;
        import org.apache.solr.request.SolrQueryRequest;
        import org.apache.solr.search.*;
        import org.apache.lucene.search.Query;
        import org.apache.lucene.queryparser.classic.ParseException;
        import org.apache.lucene.queryparser.xml.*;
        
        import java.io.ByteArrayInputStream;
        import java.io.UnsupportedEncodingException;
        
        public class XmlQParserPlugin extends QParserPlugin {
        
            private String contentEncoding = "UTF8";
        
            public void init(NamedList args) {
            }
        
            public QParser createParser(String qstr, SolrParams localParams,
                    SolrParams params, SolrQueryRequest req) {
                return new XmlQParser(qstr, localParams, params, req);
            }
        
            class XmlQParser extends QParser {
                public XmlQParser(String qstr, SolrParams localParams,
                        SolrParams params, SolrQueryRequest req) {
                    super(qstr, localParams, params, req);
                }
        
                public Query parse() throws ParseException {
                    SolrQueryParser lparser;
        
                    String qstr = getString();
                    if (qstr == null || qstr.length() == 0)
                        return null;
        
                    String defaultField = getParam(CommonParams.DF);
                    if (defaultField == null) {
                        defaultField = getReq().getSchema().getDefaultSearchFieldName();
                    }
                    lparser = new SolrQueryParser(this, defaultField);
        
                    lparser.setDefaultOperator(QueryParsing
                            .getQueryParserDefaultOperator(getReq().getSchema(),
                                    getParam(QueryParsing.OP)));
        
                    CoreParser parser = new CoreParser(getReq().getSchema().getQueryAnalyzer(), lparser);
                    // CorePlusExtensionsParser requires the Lucene sandbox, which isn't bundled with Solr (yet).
                    // CorePlusExtensionsParser parser = new CorePlusExtensionsParser(
                    //         getReq().getSchema().getQueryAnalyzer(), lparser);
                    try {
                        // qstr was already checked for null/empty above, so reuse it here.
                        return parser.parse(new ByteArrayInputStream(qstr
                                .getBytes(contentEncoding)));
                    } catch (UnsupportedEncodingException e) {
                        throw new ParseException(e.getMessage());
                    } catch (ParserException e) {
                        throw new ParseException(e.getMessage());
                    }
                }
            }
        }
        

        As the code comment mentions, we can't use CorePlusExtensionsParser because it requires lucene-sandbox.jar, which doesn't appear to be bundled with Solr 4.0.

        Erik Hatcher added a comment -

        I'll aim this for 5.0, and once it's done will consider backporting to 4.x. Might not be for 4.1, but will aim that way at least. Sorry for letting this one collect dust for so long.

        Robert Muir added a comment -

        3.4 -> 3.5

        Robert Muir added a comment -

        Bulk move 3.2 -> 3.3

        Hoss Man added a comment -

        Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

        http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

        Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

        A unique token for finding these 240 issues in the future: hossversioncleanup20100527

        Erik Hatcher added a comment -

        Marking for 1.5 and unassigning. I'll come back to this eventually, and integrate it fully. Or someone else can take the lead.

        Shalin Shekhar Mangar added a comment -

        As Yonik mentioned, we should use a SolrQueryParser with the XML QParser. Currently, queries on numeric fields (both legacy and trie) and date fields do not work. The current patch just enables one to use the Lucene XML QParser with Solr. It is not integrated with Solr as well as other qparsers are.

        Karl Wettin added a comment -

        Any progress on this or should we mark for 1.5?

        Yonik had a comment there about the UTF8; is that what you are referring to? Perhaps Solr isn't always using UTF8 as the encoding for the XML? Let me know where I can pick up the preferred content encoding and I'll put together a new patch. Other than that, I don't know what else might be required. I'd be happy to see this become part of Solr, as I'm using it in production, so let me know what I can do to help out.

        Grant Ingersoll added a comment -

        Any progress on this or should we mark for 1.5?

        Yonik Seeley added a comment -

        It's a shame that the String needs to be re-encoded in UTF8 just for the XML parser to make Strings again... but that's just implementation and can always be sped up in the future.

        Instead of using getSchema().getSolrQueryParser(null), we should construct a SolrQueryParser with the current XML QParser.
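On the re-encoding point, the JDK's own XML parser can read straight from a character stream, so a String does not have to pass through bytes at all. The sketch below shows only that JAXP capability; whether the Lucene XML parser can be fed a Reader or a pre-parsed DOM tree instead of an InputStream is not established in this thread. The class name StringXmlDemo is made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringXmlDemo {
    // Parses XML held in a String through a character stream, with no getBytes() step.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse query XML", e);
        }
    }

    // Name of the root element, e.g. the query type.
    static String rootName(String xml) {
        return parse(xml).getDocumentElement().getNodeName();
    }

    // Text content of the root element, e.g. the term text.
    static String rootText(String xml) {
        return parse(xml).getDocumentElement().getTextContent();
    }

    public static void main(String[] args) {
        // \u00f8 is the letter o with stroke; no charset is involved at any point.
        String q = "<TermQuery fieldName=\"text\">H\u00f8ydahl</TermQuery>";
        System.out.println(rootName(q));
        System.out.println(rootText(q));
    }
}
```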

        Shalin Shekhar Mangar added a comment -

        I use this in a live environment and would very much like to see it committed.

        Ok sure, let's keep this for 1.4.

        Karl Wettin added a comment -

        Erik/Karl, are you still interested in this issue? or should we defer it to 1.5?

        I use this in a live environment and would very much like to see it committed.

        Shalin Shekhar Mangar added a comment -

        Erik/Karl, are you still interested in this issue? or should we defer it to 1.5?

        Karl Wettin added a comment -

        No, that was not a bug; I just don't know Solr that well. Sorry. I suppose I hit the upper limit on HTTP GET request length. Using POST instead:

        .query(solrQuery, SolrRequest.METHOD.POST);
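The sizes involved can be reproduced without a server. The sketch below rebuilds the same XML as the test loop and reports both the raw length and the length after percent-encoding, which is what actually travels in a GET URL. The field name "shingles" is an assumption (the value of FieldNames.shingles is not shown in the thread), chosen because it reproduces the printed lengths; the class name QuerySizeDemo is made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QuerySizeDemo {
    // Builds the same XML as the test loop, with the given number of clauses.
    static String buildQuery(String fieldName, int clauses) {
        StringBuilder xml = new StringBuilder();
        xml.append("<BooleanQuery fieldName=\"").append(fieldName).append("\">");
        for (int i = 0; i < clauses; i++) {
            xml.append("<Clause occurs=\"should\">")
               .append("<TermQuery boost=\"1.0\">foo</TermQuery>")
               .append("</Clause>");
        }
        return xml.append("</BooleanQuery>").toString();
    }

    // Length of the q parameter value once percent-encoded for a GET URL.
    static int encodedLength(String query) {
        try {
            return URLEncoder.encode(query, "UTF-8").length();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Loop index i in the test corresponds to i + 1 clauses.
        for (int clauses : new int[] {1, 77, 78}) {
            String q = buildQuery("shingles", clauses); // assumed field name
            System.out.println(clauses + "\t" + q.length() + "\t" + encodedLength(q));
        }
    }
}
```

Each clause adds 71 raw characters, so 77 clauses give the 5517 characters printed just before the first retry. The encoded form is noticeably longer because every angle bracket, quote, equals sign and slash becomes a three-character escape, and it is the encoded request line that any server-side limit applies to.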
        
        Karl Wettin added a comment -

        There seems to be a bug here somewhere. Once the XML query reaches 6-7 kB of data, or 60-70 clauses, I start getting connection resets. Switching to BoostingTermQuery suggests that the cause is the amount of XML data rather than the number of clauses. I get nothing in my Solr log about the failing request.

            SolrQuery solrQuery = new SolrQuery();
            solrQuery.add("fl", "score");
            solrQuery.add("defType", "xml");
        
            StringBuilder xml = new StringBuilder(10000);
            xml.append("<BooleanQuery fieldName=\"").append(FieldNames.shingles).append("\">");
        
            for (int i = 0; i < 10000; i++) {
              xml.append("<Clause occurs=\"should\">");
              xml.append("<TermQuery boost=\"1.0\">");
              xml.append("foo");
              xml.append("</TermQuery>");
              xml.append("</Clause>");
        
              solrQuery.setQuery(xml.toString() + "</BooleanQuery>");
        
              System.out.println(i + "\t" + solrQuery.getQuery().length());
              
              try {
                SearchService.getInstance().getSolr().query(solrQuery);
              } catch (SolrServerException e) {
                if (e.getCause() != null && e.getCause().getCause() != null && e.getCause().getCause() instanceof java.net.SocketException) {
                  throw e;
                }
              }
            }
        
        
        0	121
        1	192
        2	263
        3	334
        4	405
        5	476
        6	547
        7	618
        8	689
        9	760
        10	831
        11	902
        12	973
        13	1044
        14	1115
        15	1186
        16	1257
        17	1328
        18	1399
        19	1470
        20	1541
        21	1612
        22	1683
        23	1754
        24	1825
        25	1896
        26	1967
        27	2038
        28	2109
        29	2180
        30	2251
        31	2322
        32	2393
        33	2464
        34	2535
        35	2606
        36	2677
        37	2748
        38	2819
        39	2890
        40	2961
        41	3032
        42	3103
        43	3174
        44	3245
        45	3316
        46	3387
        47	3458
        48	3529
        49	3600
        50	3671
        51	3742
        52	3813
        53	3884
        54	3955
        55	4026
        56	4097
        57	4168
        58	4239
        59	4310
        60	4381
        61	4452
        62	4523
        63	4594
        64	4665
        65	4736
        66	4807
        67	4878
        68	4949
        69	5020
        70	5091
        71	5162
        72	5233
        73	5304
        74	5375
        75	5446
        76	5517
        2097 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
        2098 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - Retrying request
        2100 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
        2100 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - Retrying request
        2102 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
        2102 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - Retrying request
        77	5588
        2108 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server localhost failed to respond
        2109 [main] INFO  org.apache.commons.httpclient.HttpMethodDirector  - Retrying request
        
        org.apache.solr.client.solrj.SolrServerException: Error executing query
        	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:96)
        	at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:109)
        	at se.hundraartonetthundra.TestSearch.test(TestSearch.java:68)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
        	at com.intellij.rt.junit4.Junit4ClassSuite.run(Junit4ClassSuite.java:99)
        	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:40)
        Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset
        	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:391)
        	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
        	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
        	... 22 more
        Caused by: java.net.SocketException: Connection reset
        	at java.net.SocketInputStream.read(SocketInputStream.java:168)
        	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        	at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
        	at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
        	at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
        	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
        	at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
        	at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
        	at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
        	at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
        	at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
        	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
        	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
        	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:335)
        	... 24 more
        
        
        
        Karl Wettin added a comment -

        The patch does not parse UTF8. Not sure if it is supposed to do that by default? Below is my version of the class. It needs to pick up the contentEncoding from the properties, but I'm not sure where and how.

        import org.apache.solr.common.params.SolrParams;
        import org.apache.solr.common.util.NamedList;
        import org.apache.solr.request.SolrQueryRequest;
        import org.apache.solr.search.QParserPlugin;
        import org.apache.solr.search.QParser;
        import org.apache.lucene.search.Query;
        import org.apache.lucene.queryParser.ParseException;
        import org.apache.lucene.xmlparser.CorePlusExtensionsParser;
        import org.apache.lucene.xmlparser.ParserException;
        
        import java.io.ByteArrayInputStream;
        import java.io.UnsupportedEncodingException;
        
        public class XmlQParserPlugin extends QParserPlugin {
        
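          // Charset used to turn the query string into bytes for the XML parser.
          // Hard-coded for now; it should eventually come from configuration.
          private String contentEncoding = "UTF-8";
        
          public void init(NamedList args) {
            // TODO: pick up contentEncoding from the plugin properties.
          }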
        
          public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
            return new XmlQParser(qstr, localParams, params, req);
          }
        
        
          class XmlQParser extends QParser {
            public XmlQParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
              super(qstr, localParams, params, req);
            }
        
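            public Query parse() throws ParseException {
              // Build the XML query parser from the schema's query analyzer and Solr query parser.
              CorePlusExtensionsParser parser = new CorePlusExtensionsParser(getReq().getSchema().getQueryAnalyzer(), getReq().getSchema().getSolrQueryParser(null));
              try {
                // Encode the query string with the explicit charset before handing it to the parser.
                return parser.parse(new ByteArrayInputStream(getString().getBytes(contentEncoding)));
              } catch (UnsupportedEncodingException e) {
                throw new ParseException(e.getMessage());
              } catch (ParserException e) {
                // Rethrow XML parser failures as the ParseException declared by parse().
                throw new ParseException(e.getMessage());
              }
            }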
          }
        
        }
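
To illustrate the encoding point, here is a minimal JDK-only sketch; it is not part of the patch. It assumes the query XML carries no encoding declaration, in which case an XML parser reads the bytes as UTF-8, so the string should be encoded as UTF-8 before it is parsed. The class and method names are hypothetical, and the JDK's DocumentBuilder stands in for CorePlusExtensionsParser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, for illustration only.
class XmlEncodingSketch {

  // Encodes the XML string as UTF-8 bytes and parses it, returning the
  // text content of the root element.
  static String parseRootText(String xml) throws Exception {
    byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new ByteArrayInputStream(bytes));
    return doc.getDocumentElement().getTextContent();
  }

  public static void main(String[] args) throws Exception {
    // A term with non-ASCII characters, written with Unicode escapes.
    String xml = "<TermQuery fieldName=\"text\">sm\u00f6rg\u00e5sbord</TermQuery>";
    System.out.println(parseRootText(xml));
  }
}
```

Calling getBytes() with no argument would use the platform default charset instead, which may not be UTF-8 on every server.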
        
        Erik Hatcher added a comment -

        Thanks Mark. I've now read your LIA2 contribution and will implement the optional XSL'ing and caching as you suggest. Good stuff.

        Mark Harwood added a comment -

        A couple of comments, Erik.

        You can probably cache the CorePlusExtensionsParser instance - it's designed to be thread-safe.

        An alternative interface you could also provide is to let clients send name/value pairs as criteria (e.g. typical HTML form input) and use QueryTemplateManager with XSL files on the Solr server to map the input parameters into executable XML. This keeps the client interface freer of Lucene internals (e.g. filters vs. queries) and lets the Solr administrator maintain and tweak the appropriate query templates.

        QueryTemplateManager is described in contrib unit tests and also written up in my LIA2 contribution (I posted this to Mike McCandless)
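
The template idea can be sketched with the JDK's own XSLT support. This is a stand-in for QueryTemplateManager, not its API: the class name, the `Document` root element, the `keywords` parameter and the stylesheet are all assumptions made for illustration. The stylesheet is compiled once into a `Templates` object, which is safe to share between threads, and each request only creates a cheap `Transformer` from it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: maps name/value pairs to query XML through an XSL template.
class QueryTemplateSketch {

  // Hypothetical template: turns the "keywords" parameter into a TermQuery on field "text".
  static final String TEMPLATE =
      "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
    + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
    + "<xsl:template match=\"/Document\">"
    + "<TermQuery fieldName=\"text\"><xsl:value-of select=\"keywords\"/></TermQuery>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  // Compiles the stylesheet once; the result can be cached and reused across requests.
  static Templates compile(String xsl) throws Exception {
    return TransformerFactory.newInstance()
        .newTemplates(new StreamSource(new StringReader(xsl)));
  }

  // Wraps the parameters in a small DOM (one child element per parameter)
  // and runs the template over it to produce the query XML.
  static String toQueryXml(Templates templates, Map<String, String> params) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    Element root = doc.createElementNS(null, "Document");
    doc.appendChild(root);
    for (Map.Entry<String, String> e : params.entrySet()) {
      Element field = doc.createElementNS(null, e.getKey());
      field.setTextContent(e.getValue());
      root.appendChild(field);
    }
    StringWriter out = new StringWriter();
    templates.newTransformer().transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("keywords", "ipod");
    System.out.println(toQueryXml(compile(TEMPLATE), params));
  }
}
```

Because the parameter values are set as DOM text content, characters such as `<` or `&` in user input are escaped by the serializer rather than interpreted as markup.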

        Cheers
        Mark

        Erik Hatcher added a comment -

        Example usage: http://localhost:8983/solr/select?q=%3CTermQuery%20fieldName=%22text%22%3Eipod%3C/TermQuery%3E&defType=xml&debugQuery=true

        =>

        <str name="querystring"><TermQuery fieldName="text">ipod</TermQuery></str>
        <str name="parsedquery">text:ipod</str>
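
For clients that build this URL in code, the encoding step might look like the sketch below. The class and method names are hypothetical; only the q, defType and debugQuery parameters come from the example above. Note that java.net.URLEncoder applies form encoding, so spaces come out as + rather than %20; both decode to the same query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class XmlQueryUrlSketch {

  // Appends the XML query, URL-encoded as UTF-8, to the select handler URL
  // and asks Solr to parse it with the "xml" query parser.
  static String buildUrl(String selectUrl, String queryXml) throws UnsupportedEncodingException {
    return selectUrl + "?q=" + URLEncoder.encode(queryXml, "UTF-8")
        + "&defType=xml&debugQuery=true";
  }

  public static void main(String[] args) throws Exception {
    String xml = "<TermQuery fieldName=\"text\">ipod</TermQuery>";
    System.out.println(buildUrl("http://localhost:8983/solr/select", xml));
  }
}
```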

        Erik Hatcher added a comment -

        Basic support for XML query parser. There are likely to be configuration options desired to wire in extensions.

        Erik Hatcher added a comment -

        XML Query Parser API. I just noticed this is a 2.4-dev version (though there are no changes in 2.9-dev for this component). We'll upgrade it to the same build as Lucene itself before committing.


          People

          • Assignee:
            Erik Hatcher
            Reporter:
            Erik Hatcher
          • Votes:
            22
            Watchers:
            35

              Development