Darren Erik Vengroff added a comment - 02/Jun/06 01:33 PM
Previous version didn't properly escape the query in the deleteByQuery() case.
This looks quite good and well documented! Thanks for this contribution. The only issue my current project would have with this is the Map which prevents multiple fields of the same name from being added. I use a lot of multi-valued fields.
One idea for dealing with multivalued fields is to check the type of the object in the Map and, if it is an array or Collection, iterate over it rather than just doing .toString() on it. Would that logic work for your use as well?
Iterating over an array or Collection of values is a great suggestion. I'll change it and resubmit when I have some time later today.
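The idea discussed above could be sketched roughly as follows. This is only an illustration, not the code from the attachment: the class name FieldXml and its methods are hypothetical, and it assumes the <field name="..."> element form used in Solr's XML update messages.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a Map of field values into a <doc> element,
// emitting one <field> per value when the value is an array or Collection.
class FieldXml {
    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static void appendField(StringBuilder out, String name, Object value) {
        out.append("<field name=\"").append(escape(name)).append("\">")
           .append(escape(String.valueOf(value))).append("</field>");
    }

    static String toDocXml(Map<String, ?> fields) {
        StringBuilder out = new StringBuilder("<doc>");
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            Object v = e.getValue();
            if (v == null) {
                continue; // assumption: fields without a value are skipped
            }
            if (v instanceof Collection) {
                for (Object item : (Collection<?>) v) {
                    appendField(out, e.getKey(), item);
                }
            } else if (v.getClass().isArray()) {
                // Array.get also handles primitive arrays by boxing each element.
                int n = Array.getLength(v);
                for (int i = 0; i < n; i++) {
                    appendField(out, e.getKey(), Array.get(v, i));
                }
            } else {
                appendField(out, e.getKey(), v);
            }
        }
        return out.append("</doc>").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<String, Object>();
        doc.put("id", "1234");
        doc.put("cat", Arrays.asList("book", "fiction"));
        System.out.println(toDocXml(doc));
    }
}
```

With this shape the Map can stay keyed by field name, and a multi-valued field is simply a key whose value is a Collection or array.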
Here is the latest, incorporating Erik's suggestion about supporting multi-valued fields.
BTW, is there any way to delete the older attached versions of this file from JIRA? There's no real need for them to be there any more.
New exception type for reporting server-side exceptions.
New client code that uses the new exception.
Here's the latest. There is an abstract base class that handles client connection and request/response and two subclasses. One is as before, with java APIs, and the other is for cases where you have an XML document you want to transform and send to the server.
Great! Now we need to figure out where it lives, and how to work out the dependencies (a solr-util.jar that a client could use, or perhaps just pull the needed class or two directly into the solr-client.jar)
delete() in the DocumentManagerClient ought to be doing this:
<delete><id>1234</id></delete>
It's currently doing this:
<delete><query>1234</query></delete>
Good catch Philip. For the benefit of future downloaders, here's a complete zip file with everything including this fix.
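For readers following along, the difference between the two delete messages, together with the escaping issue mentioned in the first comment, might look like the sketch below. The class and method names are hypothetical and are not taken from the attached zip; only the two XML shapes come from the comment above.

```java
// Hypothetical helper showing the two delete message forms.
class DeleteMessages {
    // Escape the characters that would otherwise break XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Delete a single document by its unique id.
    static String byId(String id) {
        return "<delete><id>" + escape(id) + "</id></delete>";
    }

    // Delete every document matching a query; the query text must be escaped too.
    static String byQuery(String query) {
        return "<delete><query>" + escape(query) + "</query></delete>";
    }

    public static void main(String[] args) {
        System.out.println(byId("1234"));
        System.out.println(byQuery("cat:book"));
    }
}
```

Keeping the two cases as separate methods makes it harder to send an id inside a <query> element by accident.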
Here is the latest version of the client code, in the form of solr-client-source.jar. The big difference here is that there are now two clients, DocumentManagerClient for adding, inserting, and updating, and SearchClient for searching. They share the same underlying communication mechanism, which consists of a low-level mechanism for doing queries and parsing responses by reading from an InputStream (see ResponseParser) and a slightly higher level mechanism that handles some of the XML for you (see XmlResponseParser).
I've been building this as a separate project with a Maven2 dependency on Solr, but if the source is dropped into the Solr source tree at the appropriate place I suspect it will just compile and work. There are no other outside dependencies. I have some unit tests as well, but for the moment they are too tied in to my environment to be useful to the broader community. I will correct this and submit them.
Please ignore that last attachment. It contains an earlier version of the code than I intended, and has a couple of serious bugs. I'm sure that what I'm running now is a lot better, but I'm going to iterate a little more and build some more complete tests before I submit it again. Unless anyone out there is dying to be on the bleeding edge of this.
Ping... any chance you could make your latest version available? I'm dying...
Which files should I download? It is a little confusing – any chance one of the attachments could be designated as the right one? Thanks!
I don't know that there is a specific "right" one at the moment ... Darren's last comment suggests that he has a better version but it's not quite ready for submission.
FYI: if you click the "All" link at the top left of the comment listing, you can see where in the flow of time each of the attachments was added – from there it's pretty easy to tell that while solr-client-sources.jar is the most recent attachment, it's the one Darren said should be ignored... at this moment solr-client-java-2.zip.zip seems to be the most recent "good" version.
So, should be merged. HttpClient seems to be the right choice (easily configurable; 'follow redirects', 'buffer size', etc.). Possible improvements:
Probably, we need a separate src/client folder for source files (sources do not depend on SOLR)
My previous comment is visible to jira-users only, sorry.
Code submitted by Ryan looks great!
I just posted a new version of a java client. This moves things to proper org.apache... packages and adds the waitFlush, waitSearcher suggested by Fuad.
If people are interested, i think this should sit next to /client/ruby in:
/client/java/solrj/
there is a build.xml file that will generate a solr-client.jar file.
As a taste, this is how you perform a search:
SolrQuery query = new SolrQuery();
QueryResults results = client.query( query );
for( ResultDoc doc : results.getDocs() ) {
  System.out.println( "["+doc.getId()+"] "+ doc.getField( "name" ) );
}
Also, if there is interest, i can post an example webapp using this client library to search and explore a solr repository.
(NOTE: revised summary since this issue has moved beyond just updating)
I finally had a chance to look this over, here's a few comments in
1) i like the name solrj, i think this code should definitely live in client/java/solrj so that there is the potential for other java client code that is independent (if nothing else, i suspect something like
2) i wouldn't worry about having a special package for the exceptions ... they've got exception in their name, no one's going to be confused.
3) I'm really not fond of "ParamNames.java" being a copy of the constants in "SolrParams.java", or XML.java being copied, or the xpp jar being duplicated ... it seems like we should just pull in those (compiled) classes at build time ... but that would require that the whole Solr tree be checked out, and there seems to be interest in making it possible to "svn checkout client/lang/impl" and build that in isolation ... perhaps we could use svn:externals to pull in specific utility classes and jars from other places in the tree? (although based on what I've read today, branching for releases would be hard since all of the svn:external props would have to be updated). what do people think in general about how the client code can/should/shouldn't depend on the core server code?
4) one thing we should really try to support in a client is executing query requests against non-standard request handlers ... handlers that might take in request params that we can't even imagine. The SolrQuery class has explicit setters for many of the params that the built in request handlers support, but there is no easy way for people to build other queries. I think it might make sense if SolrQuery was an interface that just defined the methods needed by the SolrClient – probably just getQueryString().
Then there can be a SimpleSolrQuery that has all of the setters in the current SolrQuery class, possibly using a general baseclass with an impl of getQueryString that uses some SolrParams...
public class AbstractSolrQuery implements SolrQuery {
5) what is the purpose of SolrClientStub ?
6) what is the purpose of SolrDocumentable being an empty interface? ... it seems like you could replace SolrDocumentable, SolrDocument, and SolrDocumented with something like this...
public interface SolrDocument {
Then you wouldn't need that instanceof code in SolrClientImpl
Note that we should probably support field and document boosts as well, but field boosts don't really need to be specified in the Map since they apply to the whole field and not the individual values, so we could just add...
public int getDocumentBoost();
...to SolrDocument.
7) The ResultsParser and QueryResults classes seem to suffer the same limitation that i was mentioning about the SolrQuery class – they assume a very specific response structure (only one doc list, an optional facet block, an optional highlighting block, an optional debug block) ... I think since the ResultsParser already understands all of the various tags that are used, it should be easy to do this as long as the QueryResult object becomes a more general container that any named data can be shoved into (just like SolrQueryResponse is on the server side) ... then a "SimpleQueryResults" class could be written that had the convenience methods that make sense when using StandardRequestHandler or DisMaxRequestHandler.
8) There was a comment in
public QueryResults process( Reader reader ) throws SolrClientException, SolrServerException, XmlPullParserException, IOException
...i think if we removed XmlPullParserException from that list of exceptions (it could always be wrapped in a SolrClientException, or a new SolrClientParseException) we have a really simple API where other ResultParser classes could be written to handle JSON or what not down the road just by adding a simple setResultParser to SolrClient.
Regarding Hoss' point #3, perhaps it's time to reorganize into something like
/solr/server/...
"To build client XXX check out /solr/client or just /solr/client/java/XXX and /solr/shared"
Shared would include external constants and exceptions.
I have dramatically reworked the client code to fit with the pluggable ContentStream model in
http://svn.lapnap.net/solr/solrj/
Major changes:
The key interfaces are:
public interface SolrClient
public interface SolrRequest
SolrClient client = new CommonsHttpSolrClient(
// Set up a simple query
QueryResponse rsp = query.execute( client );
SimpleSolrDoc doc = new SimpleSolrDoc();
This also includes a utility to make solr documents from annotations. Given the class:
@SolrSearchable( boost=2.0 )
@SolrSearchable( name="cat", boost=3 )
}
The DocumentBuilder can automatically make:
<doc boost="2.0">
There are a few parts of the API i think are awkward, I'd love any feedback / review you may have. thanks
> * it is based on commons-httpclient-3.0.1.jar
Cool, +1
> * I'm using wt=JSON rather than XML. (It maps to a hash easier)
Anyway, if you want the best JSON parser on the planet, check out
> * handles multiple ContentStreams using multi-part form upload
> * You can define and automatically build a solr document with annotations
> * Includes a first draft for a HibernateEventListener.
Hi:
I was really hoping that this patch would make it to trunk soon. I've been using it without any problem. I was wondering if there are any specifics that are left for
Thankful for your kind attention to
I want this on trunk also. I'll be reviewing it and testing it out this afternoon and committing if all is fine.
My bad... it was
I don't think this one is quite ready to go, but anyone interested can see an updated version at:
http://solrstuff.org/svn/solrj/
I have extracted the hibernate specific stuff into its own project so solrj is a bit more manageable.
If you are looking for a stable, simple client "solr-client.zip" is still your best bet.
to do
Hello, we have been testing the solr-client and think we have found a small bug:
the XML parser on the query side is not set up to use "UTF-8" encoding. We fixed it by setting the input stream for the xmlparser to "UTF8", which gave us this code in ResultsParser.java:
try {
Notice we changed the argument for this method to InputStream instead of the reader so we could add "UTF-8" to the stream. In our opinion this was a major bug (since all solr-xml is encoded in utf-8) and we guess somebody just forgot to put it in... yay, now we can all start using freaky characters without the client actually freaking out
I found that there was no way of adding highlight parameters to a SolrQuery, so I made some modifications to SolrQuery.java and ParamNames.java to allow highlighting
Changes to SolrQuery
First add a new variable
private HighlightParams _highlight = new HighlightParams();
Then I defined a new inner class (a bit like FacetParams)
public static class HighlightParams {
  public String simple_pre = "<b>";
  public boolean isEnabled() { return field.size() > 0; }
}
Then add the following methods
public void addHighlightField( String f ) { _highlight.field.add( f ); }
public void setHighlightSnippets( int snippets ) { _highlight.snippets = snippets; }
public void setHighlightFragSize( int fragsize ) { _highlight.fragsize = fragsize; }
// ATTENTION : only simple tags. No quotes like in <span class="tag">
Add the following code to the getQueryString method (for example right before return builder.toString())
if( _highlight.isEnabled() ) {
  builder.append( '&' ).append( ParamNames.HIGHLIGHT_SIMPLE_PRE ).append( "=" ).append( _highlight.simple_pre );
Changes to ParamNames.java
Add/modify the following variables
/** whether to highlight */
This can be used as follows
I think that is all. If I forgot something, post it here. One remark. The setHighlightSurroundingTags method can only take simple tags,
Greetz.
org.apache.solr.client.impl.ResultsParser.java
org.apache.solr.client.QueryFacet.java (a copy of FieldFacet)
org.apache.solr.client.QueryResults.java
org.apache.solr.client.SolrQuery.java
I changed these four files and added some functions for facet queries. Some code here..
SolrClient client = ..
for (QueryFacet qf : results.getQueryFacets()) {
I am using the client library and have no issues with adding docs to solr. But when i want to make a simple test query i keep getting this error:
org.apache.solr.client.exception.SolrClientException: unknown type: status
Solr returns an xml with a status tag in the header which the client does not know how to handle? am i right? is this the case?
is there an updated package or anyone working on such a thing at the moment? the solr-client.zip at the top of the thread works like a charm but seems to be very outdated and the bits on the svn://solrstuff.org site have some rather serious bugs.
i'm happy to do all the leg work of packaging things, fixing bugs, submitting a patch, etc but i wanted to make sure i'm not about to walk right behind someone else. also, if anyone has any ideas for the best starting point i'm happy to take suggestions.
For now, I'd still recommend using solr-client.zip
The code on solrstuff.org is in the middle of a big overhaul (hopefully stable by the end of this week) – but it will rely on some changes to solr that are not likely to make it into solr-1.2. I'll post another message here when that settles down. thanks
For anyone interested, I've finished a major overhaul of the client at:
http://solrstuff.org/svn/solrj/
It is a dramatically different architecture than before. Essentially it reads each response into a NamedList and each response type knows what the contents mean. After solr1.2, I'll work on getting something like this into the official apache distribution. This client source duplicates many classes that will eventually be extracted into an independent solr-utils.jar (
I am using this in production code – but i don't suggest that it is production ready just yet.
the trunk version at http://solrstuff.org/svn/solrj/
compile: .... [javac] C:\data\workspace\solrj\src\org\apache\solr\client\solrj\query\SolrQuery.java:10: cannot find symbol
aaah, It was compiling with java 6.
I just added stax-api-1.0.jar and cleaned up some imports, it should run on java 5 now.
the new api's work great, thanks! what's the plan for this going forward? i'd like to start doing some work on this as it's rather critical to my current project and an area i've dealt with a lot in the past. assuming it's not getting dumped into org.apache.* land any time soon are you accepting patches to this code? if so i have some modifications to the api's that i think will make them easier to use (such as a method to set FacetParams on SolrQuery) and i'll even flesh out the SolrServerTest for fun.
also, i noticed that all the methods on SolrServer throw undeclared SolrExceptions which extend RuntimeException when things go south. should those throw some other sort of non-ignorable exception like a new SolrServerException? while it made coding/compiling easier to leave out all the usually required try's and catches it made running/debugging much less enjoyable.
great! Any feedback/help would be wonderful.
I hope it is not too long before this can enter solr trunk, but it will first need two solr1.3 additions
Re RuntimeException vs SolrServerException, I'm not sure of the best choice. Earlier versions had a client exception and server exception, but in practice those got lumped together (in my case) anyway. I ended up just using SolrException because it is there.
The SolrClientImpl does not implement the following optional attributes for "add" as documented in http://wiki.apache.org/solr/UpdateXmlMessages
allowDups = "true" | "false" (default is "false")
Attached is a patch for SolrClientImpl.java which implements allowDups.
Hi Ben-
Thanks for the patch. Will Johnson added the options to the java client on:
This implementation is now quite stable (thanks to lots of help from Will!) and I hope will be integrated into solr trunk shortly after 1.2 (this week?!)
About the allowDups/overwritePending/overwriteCommited options... what part of them do you use? In SOLR-60, there is talk of replacing the three options with a (simpler) overwrite=true/false – maybe we should change the client api to:
UpdateResponse add( SolrDocument doc, boolean overwrite ) throws SolrServerException;
In anticipation of this change?
[I'm new to solrj, so everything I'm writing can be useless]
While trying to execute a range query, using this query:
I kept getting IllegalArgumentException: the solution was to patch StrUtils.java, in order to encode square brackets.
==== 204,207d203
Hi Walter-
I just updated http://solrstuff.org/svn/solrj/ to use URLEncoder.encode( val, "UTF-8" ) rather than StrUtils.partialURLEncodeVal( val )
Give it a try and let me know if you have problems... (i have done date range queries successfully with resin/jetty, but netbeans must be different!)
FYI partialURLEncodeVal was meant for readable, yet unambiguous logging... hence a minimum of escaping is done (but enough to easily paste into a browser and let it do the rest of the escaping when you submit).
Latest rev works perfectly thanks.
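As a small illustration of why the switch to URLEncoder.encode helps with range queries, the sketch below fully encodes a parameter value, including the square brackets. The class name QueryParamEncoding and the sample query are made up for this example; only URLEncoder.encode( val, "UTF-8" ) comes from the comment above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds one name=value pair of a query string with
// both sides fully URL-encoded.
class QueryParamEncoding {
    static String param(String name, String value) {
        try {
            return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints q=price%3A%5B1+TO+5%5D
        System.out.println(param("q", "price:[1 TO 5]"));
    }
}
```

The brackets come out as %5B and %5D and the spaces as +, which is the form an HTTP client can pass along without complaint.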
I've been running some timing tests with this client (searching only), and the overall results show high times. This may be due to my minimal knowledge of solr, but solr seems fast; it is the data receiving/parsing on the client that seems slow. Even when solr reports 0ms (due to caching, I presume) it still takes 200ms to get results (QueryResponse.getElapsedTime()). I'm using this code:
SolrQuery query = new SolrQuery(queryString);
The slowdown seems to be in client.executeMethod(method) (CommonsHttpSolrServer).
Any way to speed this up (assuming I'm not totally wrong on how to use this client...)? Reusing the same http connection for multiple queries? Playing with MultiThreadedHttpConnectionManager helped a bit, but doesn't seem to be the solution.
I don't know if you are on solr-dev, Yonik noted that the QTime does not include the time to write the response, only the query time. To get an accurate number for how long the whole query takes, check your app server logs
http://www.nabble.com/Re%3A-A-simple-Java-client-for-updating-and-searching-tf3890950.html
To get a quick response from solr, try rows=0 or a 404 path. (Of course, the speed will depend on your network connection speed between client-server)
I'm integrating
The basic stuff is no problem. I'm struggling with the best way to:
I think the best approach is to commit my best effort and then have you
ryan
We are planning to replace our custom Lucene implementation with Solr in the next release of Liferay. This Java client would be extremely useful to us and we would like to see it in the next stable release. When do you anticipate this, or at least an alpha version?
solr 1.2 was released ~1 week ago so the next official stable release is at least a few months out.
The solrj client is quite stable (I won't say that too strongly until more people are using it) and will be included in solr nightly builds. While I don't recommend using the solr server nightly builds, the client should be ok.
Added to trunk... any new problems should get their own issue.
This bug was modified as part of a bulk update using the criteria...
The Fix Version for all 29 issues found was set to 1.3, email notification was suppressed to prevent excessive email. For a list of all the issues modified, search jira comments for this (hopefully) unique string: batch20070315hossman1