Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      The FieldAnalysisRequestHandler provides the analysis functionality of the web admin page as a service. This handler accepts a filetype/fieldname parameter and a value and as a response returns a breakdown of the analysis process. It is also possible to send a query value which will use the configured query analyzer as well as a showmatch parameter which will then mark every matched token as a match.

      If this handler is added to the code base, I also recommend to rename the current AnalysisRequestHandler to DocumentAnalysisRequestHandler and have them both inherit from one AnalysisRequestHandlerBase class which provides the common functionality of the analysis breakdown and its translation to named lists. This will also enhance the current AnalysisRequestHandler which right now is fairly simplistic.

      1. AnalisysRequestHandler_refactored.patch
        76 kB
        Uri Boness
      2. analysis_request_handlers_incl_solrj.patch
        138 kB
        Uri Boness
      3. AnalysisRequestHandler_refactored1.patch
        77 kB
        Uri Boness
      4. FieldAnalysisRequestHandler_incl_test.patch
        39 kB
        Uri Boness
      5. SOLR-1099.patch
        6 kB
        Shalin Shekhar Mangar
      6. SOLR-1099.patch
        135 kB
        Shalin Shekhar Mangar
      7. SOLR-1099.patch
        137 kB
        Shalin Shekhar Mangar
      8. SOLR-1099-ordered-TokenizerChain.patch
        1 kB
        Koji Sekiguchi

        Issue Links

          Activity

          Hide
          Shalin Shekhar Mangar added a comment -

          This looks great Uri.

          I'm yet to look completely into the patch. But is there anything in the AnalysisRequestHandler which is not there in this patch? If not, does it make sense to just deprecate AnalysisRequestHandler and use this instead?

          Show
          Shalin Shekhar Mangar added a comment - This looks great Uri. I'm yet to look completely into the patch. But is there anything in the AnalysisRequestHandler which is not there in this patch? If not, does it make sense to just deprecate AnalysisRequestHandler and use this instead?
          Hide
          Uri Boness added a comment -

          Well, the AnalysisRequestHandler goal is to handle documents, so basically, you send a XML document (same document as you would send for indexing) and the handler analyses the fields of the document. So the main difference between the two handlers is that the AnalsisRequestHandler enables you to provides a set of field names/types and their values to be analysed, while in the FieldAnalysisRequestHandler you're mainly targeting just a couple of fields and you can only specify one value to be analysed. The other main difference is that the AnalysisRequestHandler handles a POST request with an XML request body while the FieldAnalysisRequestHandler handles a GET request where all the parameters are specified as URL params.

          As I mentioned, the analysis breakdown of the FieldAnalysisRequestHandler is more detailed than the AnalysisRequestHandler and this is why I think that some refactoring can take place by extracting all the common functionality to a parent class for these two classes.

          Show
          Uri Boness added a comment - Well, the AnalysisRequestHandler goal is to handle documents, so basically, you send a XML document (same document as you would send for indexing) and the handler analyses the fields of the document. So the main difference between the two handlers is that the AnalsisRequestHandler enables you to provides a set of field names/types and their values to be analysed, while in the FieldAnalysisRequestHandler you're mainly targeting just a couple of fields and you can only specify one value to be analysed. The other main difference is that the AnalysisRequestHandler handles a POST request with an XML request body while the FieldAnalysisRequestHandler handles a GET request where all the parameters are specified as URL params. As I mentioned, the analysis breakdown of the FieldAnalysisRequestHandler is more detailed than the AnalysisRequestHandler and this is why I think that some refactoring can take place by extracting all the common functionality to a parent class for these two classes.
          Hide
          Shalin Shekhar Mangar added a comment -

          Well, the AnalysisRequestHandler goal is to handle documents, so basically, you send a XML document (same document as you would send for indexing) and the handler analyses the fields of the document. So the main difference between the two handlers is that the AnalsisRequestHandler enables you to provides a set of field names/types and their values to be analysed, while in the FieldAnalysisRequestHandler you're mainly targeting just a couple of fields and you can only specify one value to be analysed. The other main difference is that the AnalysisRequestHandler handles a POST request with an XML request body while the FieldAnalysisRequestHandler handles a GET request where all the parameters are specified as URL params.

          Thanks for clarifying Uri.

          As I mentioned, the analysis breakdown of the FieldAnalysisRequestHandler is more detailed than the AnalysisRequestHandler and this is why I think that some refactoring can take place by extracting all the common functionality to a parent class for these two classes.

          I agree. With this coming in, we will have three places which help with analysis (analysis.jsp, AnalysisRequestHandler and FieldAnalysisRequestHandler). Would you like to take a stab at this before we commit?

          I doubt our refactoring will change the public API (the request/response format) for any of the three. Therefore, I'm fine with refactoring later and committing this as-is.

          Show
          Shalin Shekhar Mangar added a comment - Well, the AnalysisRequestHandler goal is to handle documents, so basically, you send a XML document (same document as you would send for indexing) and the handler analyses the fields of the document. So the main difference between the two handlers is that the AnalsisRequestHandler enables you to provides a set of field names/types and their values to be analysed, while in the FieldAnalysisRequestHandler you're mainly targeting just a couple of fields and you can only specify one value to be analysed. The other main difference is that the AnalysisRequestHandler handles a POST request with an XML request body while the FieldAnalysisRequestHandler handles a GET request where all the parameters are specified as URL params. Thanks for clarifying Uri. As I mentioned, the analysis breakdown of the FieldAnalysisRequestHandler is more detailed than the AnalysisRequestHandler and this is why I think that some refactoring can take place by extracting all the common functionality to a parent class for these two classes. I agree. With this coming in, we will have three places which help with analysis (analysis.jsp, AnalysisRequestHandler and FieldAnalysisRequestHandler). Would you like to take a stab at this before we commit? I doubt our refactoring will change the public API (the request/response format) for any of the three. Therefore, I'm fine with refactoring later and committing this as-is.
          Hide
          Uri Boness added a comment -

          The public API for the AnalysisRequestHandler will change in the context of the response. Since the analysis breakdown is more detailed, the response format will have to change a bit. Furthermore, it's probably wise to rename the AnalysisRequestHandler to DocumentAnalysisRequestHandler (more expressive name and also consistent with the FieldAnalysisRequestHandler). Another option is to do this refactoring anyway, and leave the AnalysisRequestHandler as is and only deprecate it. So basically we'll have 4 classes:

          AnalysisRequestHanlderBase
          FieldAnalysisRequestHanlder
          DocumentAnalysisRequestHandler
          AnalysisRequestHandler (deprecated)

          what do you think?

          BTW, once commited, it would be also be wise to reimplement the anaysis.jsp to use this new handler and clean it up from all the analysis (now duplicate) logic code.

          Show
          Uri Boness added a comment - The public API for the AnalysisRequestHandler will change in the context of the response. Since the analysis breakdown is more detailed, the response format will have to change a bit. Furthermore, it's probably wise to rename the AnalysisRequestHandler to DocumentAnalysisRequestHandler (more expressive name and also consistent with the FieldAnalysisRequestHandler). Another option is to do this refactoring anyway, and leave the AnalysisRequestHandler as is and only deprecate it. So basically we'll have 4 classes: AnalysisRequestHanlderBase FieldAnalysisRequestHanlder DocumentAnalysisRequestHandler AnalysisRequestHandler (deprecated) what do you think? BTW, once commited, it would be also be wise to reimplement the anaysis.jsp to use this new handler and clean it up from all the analysis (now duplicate) logic code.
          Hide
          Shalin Shekhar Mangar added a comment -

          The public API for the AnalysisRequestHandler will change in the context of the response.

          I was assuming that the output format of AnalysisRequestHandler and FieldAnalysisRequestHandler remains exactly as they are today and the refactoring is just to abstract common code into a base class.

          Furthermore, it's probably wise to rename the AnalysisRequestHandler to DocumentAnalysisRequestHandler (more expressive name and also consistent with the FieldAnalysisRequestHandler). Another option is to do this refactoring anyway, and leave the AnalysisRequestHandler as is and only deprecate it. So basically we'll have 4 classes:

          AnalysisRequestHanlderBase
          FieldAnalysisRequestHanlder
          DocumentAnalysisRequestHandler
          AnalysisRequestHandler (deprecated)

          Agreed. But the output of DocumentAnalysisRequestHandler will look exactly like what AnalysisRequestHandler returns today, right?

          it would be also be wise to reimplement the anaysis.jsp to use this new handler and clean it up from all the analysis (now duplicate) logic code.

          Agreed.

          Show
          Shalin Shekhar Mangar added a comment - The public API for the AnalysisRequestHandler will change in the context of the response. I was assuming that the output format of AnalysisRequestHandler and FieldAnalysisRequestHandler remains exactly as they are today and the refactoring is just to abstract common code into a base class. Furthermore, it's probably wise to rename the AnalysisRequestHandler to DocumentAnalysisRequestHandler (more expressive name and also consistent with the FieldAnalysisRequestHandler). Another option is to do this refactoring anyway, and leave the AnalysisRequestHandler as is and only deprecate it. So basically we'll have 4 classes: AnalysisRequestHanlderBase FieldAnalysisRequestHanlder DocumentAnalysisRequestHandler AnalysisRequestHandler (deprecated) Agreed. But the output of DocumentAnalysisRequestHandler will look exactly like what AnalysisRequestHandler returns today, right? it would be also be wise to reimplement the anaysis.jsp to use this new handler and clean it up from all the analysis (now duplicate) logic code. Agreed.
          Hide
          Uri Boness added a comment -

          I was assuming that the output format of AnalysisRequestHandler and FieldAnalysisRequestHandler remains exactly as they are today and the refactoring is just to abstract common code into a base class.

          Agreed. But the output of DocumentAnalysisRequestHandler will look exactly like what AnalysisRequestHandler returns today, right?

          You know what... your're right... I think it is possible to keep the same output by default. The only change to the original structure will happen when more parameters will be sent, for example, when a query analysis takes place and a "showmatch=true" is sent then each matched token will be marked as a "match". I'll have to have a closer look at the current response of the AnalysisRequestHandler and see if I can support the exact same structure.... my gut feeling is that it's possible.

          I'll start working on it and see where I get

          Show
          Uri Boness added a comment - I was assuming that the output format of AnalysisRequestHandler and FieldAnalysisRequestHandler remains exactly as they are today and the refactoring is just to abstract common code into a base class. Agreed. But the output of DocumentAnalysisRequestHandler will look exactly like what AnalysisRequestHandler returns today, right? You know what... your're right... I think it is possible to keep the same output by default. The only change to the original structure will happen when more parameters will be sent, for example, when a query analysis takes place and a "showmatch=true" is sent then each matched token will be marked as a "match". I'll have to have a closer look at the current response of the AnalysisRequestHandler and see if I can support the exact same structure.... my gut feeling is that it's possible. I'll start working on it and see where I get
          Hide
          Shalin Shekhar Mangar added a comment -

          The only change to the original structure will happen when more parameters will be sent, for example, when a query analysis takes place and a "showmatch=true" is sent then each matched token will be marked as a "match". I'll have to have a closer look at the current response of the AnalysisRequestHandler and see if I can support the exact same structure.... my gut feeling is that it's possible.

          This is the part that I do not understand.

          Let me outline what I understood:

          1. We copy AnalysisRequestHandler (ARH) to DocumentAnalysisRequestHandler and deprecate ARH.
          2. We extract common code (if any) of ARH and FieldARH in to a base class AnalysisRequestHandlerBase, as you suggested
          3. We modify analysis.jsp to use FieldARH (maybe as a separate issue)

          You do not need to support AnalysisRequestHandler's format because it will also exist by the name of DocumentAnalysisRequestHandler. Since FieldARH is a new handler, it does not need to be back-compatible with ARH. Supporting the old format is a nice-to-have feature but not necessary.

          Show
          Shalin Shekhar Mangar added a comment - The only change to the original structure will happen when more parameters will be sent, for example, when a query analysis takes place and a "showmatch=true" is sent then each matched token will be marked as a "match". I'll have to have a closer look at the current response of the AnalysisRequestHandler and see if I can support the exact same structure.... my gut feeling is that it's possible. This is the part that I do not understand. Let me outline what I understood: We copy AnalysisRequestHandler (ARH) to DocumentAnalysisRequestHandler and deprecate ARH. We extract common code (if any) of ARH and FieldARH in to a base class AnalysisRequestHandlerBase, as you suggested We modify analysis.jsp to use FieldARH (maybe as a separate issue) You do not need to support AnalysisRequestHandler's format because it will also exist by the name of DocumentAnalysisRequestHandler. Since FieldARH is a new handler, it does not need to be back-compatible with ARH. Supporting the old format is a nice-to-have feature but not necessary.
          Hide
          Uri Boness added a comment -

          We copy AnalysisRequestHandler (ARH) to DocumentAnalysisRequestHandler and deprecate ARH.

          true, but it will be enhanced with functionality and support more extensive analysis breakdown (e.g. adding a query analysis and showmatch support)

          We extract common code (if any) of ARH and FieldARH in to a base class AnalysisRequestHandlerBase, as you suggested

          true

          We modify analysis.jsp to use FieldARH (maybe as a separate issue)

          probably a separate issue is more appropriate.

          You do not need to support AnalysisRequestHandler's format because it will also exist by the name of DocumentAnalysisRequestHandler. Since FieldARH is a new handler, it does not need to be back-compatible with ARH. Supporting the old format is a nice-to-have feature but not necessary.

          True. The old AnalysisRequestHandler will be deprecated and it's (enhanced) functionality will be available via the DocumentAnalysisRequestHandler. That said, it would be nice to be backward compatible as much as possible for those who are using the old ARH already (I suspect not many are using it anyway as it's mostly used for tooling and debugging). I do believe that both the new DocumentARH and the FieldARH are useful for different purposes due the nature of their differences as I mentioned above.

          I hope this makes things a bit clearer

          Show
          Uri Boness added a comment - We copy AnalysisRequestHandler (ARH) to DocumentAnalysisRequestHandler and deprecate ARH. true, but it will be enhanced with functionality and support more extensive analysis breakdown (e.g. adding a query analysis and showmatch support) We extract common code (if any) of ARH and FieldARH in to a base class AnalysisRequestHandlerBase, as you suggested true We modify analysis.jsp to use FieldARH (maybe as a separate issue) probably a separate issue is more appropriate. You do not need to support AnalysisRequestHandler's format because it will also exist by the name of DocumentAnalysisRequestHandler. Since FieldARH is a new handler, it does not need to be back-compatible with ARH. Supporting the old format is a nice-to-have feature but not necessary. True. The old AnalysisRequestHandler will be deprecated and it's (enhanced) functionality will be available via the DocumentAnalysisRequestHandler. That said, it would be nice to be backward compatible as much as possible for those who are using the old ARH already (I suspect not many are using it anyway as it's mostly used for tooling and debugging). I do believe that both the new DocumentARH and the FieldARH are useful for different purposes due the nature of their differences as I mentioned above. I hope this makes things a bit clearer
          Hide
          Shalin Shekhar Mangar added a comment -

          I hope this makes things a bit clearer

          Crystal clear! Thanks Uri!

          Show
          Shalin Shekhar Mangar added a comment - I hope this makes things a bit clearer Crystal clear! Thanks Uri!
          Hide
          Uri Boness added a comment -

          The latest patch (AnalisysRequestHandler_refactored.patch) does the following:

          • deprecates the AnalysisRequesthandler
          • adds the AnalysisRequestHandlerBase
          • adds the DocumentAnalysisRequestHandler
          • modifies the FieldAnalysisRequestHandler
          • adds/updates the appropriate test classes

          NOTE: the response format of the DocumentAnalysisRequestHandler differs from the AnalysisRequestHandler after all. This is mainly for two reasons: 1) to be consistent with the response format of the FieldAnalysisRequestHandler, 2) New features were added to this request handler which didn't exist in the old AnalysisRequestHandler

          Show
          Uri Boness added a comment - The latest patch (AnalisysRequestHandler_refactored.patch) does the following: deprecates the AnalysisRequesthandler adds the AnalysisRequestHandlerBase adds the DocumentAnalysisRequestHandler modifies the FieldAnalysisRequestHandler adds/updates the appropriate test classes NOTE: the response format of the DocumentAnalysisRequestHandler differs from the AnalysisRequestHandler after all. This is mainly for two reasons: 1) to be consistent with the response format of the FieldAnalysisRequestHandler, 2) New features were added to this request handler which didn't exist in the old AnalysisRequestHandler
          Hide
          Shalin Shekhar Mangar added a comment -

          Uri – the patch is missing a class – org.apache.solr.client.solrj.request.DocumentAnalysisRequest. Also, can you please remove the author javadoc tags? We don't use them in Lucene/Solr.

          Thanks!

          Show
          Shalin Shekhar Mangar added a comment - Uri – the patch is missing a class – org.apache.solr.client.solrj.request.DocumentAnalysisRequest. Also, can you please remove the author javadoc tags? We don't use them in Lucene/Solr. Thanks!
          Hide
          Uri Boness added a comment -

          see the new AnalysisRequestHandler_refactored1.patch file.

          I started working on solrj support for these request handlers, I just need to write some tests and will commit it soon as a separate patch.

          Show
          Uri Boness added a comment - see the new AnalysisRequestHandler_refactored1.patch file. I started working on solrj support for these request handlers, I just need to write some tests and will commit it soon as a separate patch.
          Hide
          Grant Ingersoll added a comment - - edited

          Sorry for being a bit late...
          Am I understanding that the main thing this does is allow you to specify one input and get back analysis for each field you specify? Well, that and the GET, right?

          Show
          Grant Ingersoll added a comment - - edited Sorry for being a bit late... Am I understanding that the main thing this does is allow you to specify one input and get back analysis for each field you specify? Well, that and the GET, right?
          Hide
          Uri Boness added a comment -

          Basically that was the original idea, but since then a few things were added.

          I thought it would be nice to have the same features provided by the FieldAnalysisRequestHandler in the AnalysisRequestHnalder, so the latest patch does the following:

          • The FieldAnalysisRequestHandler accepts a query, a value, and field names/types to work on. It will then return the analysis breakdown for each field name/type. It is also possible to send a showmatch param to "mark" the field tokens that match the query tokens (just like you have in the analysis.jsp)
          • The current/old AnalysisRequestHandler is deprecated in favour of a new class -> DocumentAnalysisRequestHandler
          • The new DocumentARH acts on sent documents (just like the old ARH) only that it also support query and showmatche parameters
          • The reason I didn't apply this logic directly to the AnalysisRequestHandler is to be consistent with the naming here -> DocumentARH vs. FieldARH (and it's a more descriptive name now that we have yet another analysis request handler)
          • Both the FieldARH and the DocumentARH inherit from the same base class - so they can share common functionality (e.g. rendering of the analysis breakdown)
          Show
          Uri Boness added a comment - Basically that was the original idea, but since then a few things were added. I thought it would be nice to have the same features provided by the FieldAnalysisRequestHandler in the AnalysisRequestHnalder , so the latest patch does the following: The FieldAnalysisRequestHandler accepts a query, a value, and field names/types to work on. It will then return the analysis breakdown for each field name/type. It is also possible to send a showmatch param to "mark" the field tokens that match the query tokens (just like you have in the analysis.jsp) The current/old AnalysisRequestHandler is deprecated in favour of a new class -> DocumentAnalysisRequestHandler The new DocumentARH acts on sent documents (just like the old ARH) only that it also support query and showmatche parameters The reason I didn't apply this logic directly to the AnalysisRequestHandler is to be consistent with the naming here -> DocumentARH vs. FieldARH (and it's a more descriptive name now that we have yet another analysis request handler) Both the FieldARH and the DocumentARH inherit from the same base class - so they can share common functionality (e.g. rendering of the analysis breakdown)
          Hide
          Uri Boness added a comment -

          latest patch. This one includes SolrJ support.

          Show
          Uri Boness added a comment - latest patch. This one includes SolrJ support.
          Hide
          Grant Ingersoll added a comment -

          So, why not just fold all of this into the ARH? Seems like all of these features work just as well as input parameters and there is no need for deprecation, etc.

          Show
          Grant Ingersoll added a comment - So, why not just fold all of this into the ARH? Seems like all of these features work just as well as input parameters and there is no need for deprecation, etc.
          Hide
          Uri Boness added a comment -

          Not all input can be sent as input parameters (the documents will still be sent as a request body via a POST) but of course it's still possible to fold everything in one handler. It just feels like putting too much logic & responsibility on a single handler which increases code complexity and makes it harder to maintain (at least in my opinion). The deprecation also provides users who already use the current ARH a chance to move to the DocumentARH (which has a different response format)

          Show
          Uri Boness added a comment - Not all input can be sent as input parameters (the documents will still be sent as a request body via a POST) but of course it's still possible to fold everything in one handler. It just feels like putting too much logic & responsibility on a single handler which increases code complexity and makes it harder to maintain (at least in my opinion). The deprecation also provides users who already use the current ARH a chance to move to the DocumentARH (which has a different response format)
          Hide
          Shalin Shekhar Mangar added a comment -

          Patch with the following changes:

          1. Removed rsp.getResponseHeader().add("params", req.getParams().toNamedList()); from AnalysisRequestHandlerBase as that is already done in SolrCore#execute if echoParams=explicit is passed as a request param (or configured in solrconfig.xml)
          2. Re-formatted code to Lucene/Solr code style
          3. Added /analysis/document and /analysis/field to example solrconfig.xml with documentation and added a note on the deprecation of AnalysisRequestHandler

          All tests pass.

          Uri, AnalysisRequest cannot be an inner class of FieldAnalysisRequestHandler because when people use Solrj, FieldAnalysisRequestHandler won't be in the class-path. Same for DocumentAnalysisRequestHandler. We'd need to change this.

          Show
          Shalin Shekhar Mangar added a comment - Patch with the following changes: Removed rsp.getResponseHeader().add("params", req.getParams().toNamedList()); from AnalysisRequestHandlerBase as that is already done in SolrCore#execute if echoParams=explicit is passed as a request param (or configured in solrconfig.xml) Re-formatted code to Lucene/Solr code style Added /analysis/document and /analysis/field to example solrconfig.xml with documentation and added a note on the deprecation of AnalysisRequestHandler All tests pass. Uri, AnalysisRequest cannot be an inner class of FieldAnalysisRequestHandler because when people use Solrj, FieldAnalysisRequestHandler won't be in the class-path. Same for DocumentAnalysisRequestHandler. We'd need to change this.
          Hide
          Uri Boness added a comment -

          Actually, there is not dependency between the handlers and SolrJ. SolrJ comes with its own FieldAnalysisRequest and DocumentAnalysisRequest classes (which extend the SolrRequest class). The inner classes in the handlers are used to represent analysis requests on the server side.

          Another thing. I believe that the default names for the handlers as you defined in the default solrconfig.xml (i.e. "analysis/field" and "analysis/document") are better than the ones I came up with . The only thing left to do is to update these defaults in the SolrJ request classes: FieldAnalysisRequest and DocumentAnalysisRequests.

          Show
          Uri Boness added a comment - Actually, there is not dependency between the handlers and SolrJ. SolrJ comes with its own FieldAnalysisRequest and DocumentAnalysisRequest classes (which extend the SolrRequest class). The inner classes in the handlers are used to represent analysis requests on the server side. Another thing. I believe that the default names for the handlers as you defined in the default solrconfig.xml (i.e. "analysis/field" and "analysis/document") are better than the ones I came up with . The only thing left to do is to update these defaults in the SolrJ request classes: FieldAnalysisRequest and DocumentAnalysisRequests .
          Hide
          Shalin Shekhar Mangar added a comment -

          Actually, there is not dependency between the handlers and SolrJ. SolrJ comes with its own FieldAnalysisRequest and DocumentAnalysisRequest classes (which extend the SolrRequest class). The inner classes in the handlers are used to represent analysis requests on the server side.

          Ok, I see. But why are they needed? Can't we use the Solrj ones on the server side? SolrJ is in the core's class path always. Let me see if that is possible.

          The only thing left to do is to update these defaults in the SolrJ request classes: FieldAnalysisRequest and DocumentAnalysisRequests.

          Ok, I'll do that.

          Show
          Shalin Shekhar Mangar added a comment - Actually, there is not dependency between the handlers and SolrJ. SolrJ comes with its own FieldAnalysisRequest and DocumentAnalysisRequest classes (which extend the SolrRequest class). The inner classes in the handlers are used to represent analysis requests on the server side. Ok, I see. But why are they needed? Can't we use the Solrj ones on the server side? SolrJ is in the core's class path always. Let me see if that is possible. The only thing left to do is to update these defaults in the SolrJ request classes: FieldAnalysisRequest and DocumentAnalysisRequests. Ok, I'll do that.
          Hide
          Shalin Shekhar Mangar added a comment -
          1. Removed DocumentAnalysisRequestHandler.DocumentAnalysisRequest inner class and replaced all usages with the SolrJ DocumentAnalysisRequest
          2. Removed FieldAnalysisRequestHandler.AnalysisRequest inner class and replaced all usages with the SolrJ FieldAnalysisRequest
          3. Renamed mis-spelled method name List<String> getFieldNamed() in FieldAnalysisRequest
          4. Added methods setFieldNames(List<String>) and setFieldTypes(List<String>) to FieldAnalysisRequest

          All tests pass. I think this is ready for commit.

          Grant, did you have any suggestions? Should we go ahead with this?

          Show
          Shalin Shekhar Mangar added a comment - Removed DocumentAnalysisRequestHandler.DocumentAnalysisRequest inner class and replaced all usages with the SolrJ DocumentAnalysisRequest Removed FieldAnalysisRequestHandler.AnalysisRequest inner class and replaced all usages with the SolrJ FieldAnalysisRequest Renamed mis-spelled method name List<String> getFieldNamed() in FieldAnalysisRequest Added methods setFieldNames(List<String>) and setFieldTypes(List<String>) to FieldAnalysisRequest All tests pass. I think this is ready for commit. Grant, did you have any suggestions? Should we go ahead with this?
          Hide
          Shalin Shekhar Mangar added a comment -

          Committed revision 767412.

          Thanks Uri!

          Show
          Shalin Shekhar Mangar added a comment - Committed revision 767412. Thanks Uri!
          Hide
          Yonik Seeley added a comment -

          Trying to review some of the new external APIs before 1.4.... but this one gives me a null-pointer exception.

          http://localhost:8983/solr/analysis/field?analysis.fieldvalue=hello

          java.lang.NullPointerException
          at org.apache.solr.handler.FieldAnalysisRequestHandler.handleAnalysisRequest(FieldAnalysisRequestHandler.java:157)
          at org.apache.solr.handler.FieldAnalysisRequestHandler.doAnalysis(FieldAnalysisRequestHandler.java:97)

          Show
          Yonik Seeley added a comment - Trying to review some of the new external APIs before 1.4.... but this one gives me a null-pointer exception. http://localhost:8983/solr/analysis/field?analysis.fieldvalue=hello java.lang.NullPointerException at org.apache.solr.handler.FieldAnalysisRequestHandler.handleAnalysisRequest(FieldAnalysisRequestHandler.java:157) at org.apache.solr.handler.FieldAnalysisRequestHandler.doAnalysis(FieldAnalysisRequestHandler.java:97)
          Hide
          Shalin Shekhar Mangar added a comment - - edited

          I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make it work. They are optional but one of them must be present.

          On second thought, we could just use the default search field if both fieldname and fieldtype are not specified.

          Show
          Shalin Shekhar Mangar added a comment - - edited I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make it work. They are optional but one of them must be present. On second thought, we could just use the default search field if both fieldname and fieldtype are not specified.
          Hide
          Uri Boness added a comment -

          I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make it work. They are optional but one of them must be present.

          That's true, one of them must be set.

          On second thought, we could just use the default search field if both fieldname and fieldtype are not specified.

          Sounds like a reasonable fallback.

          Show
          Uri Boness added a comment - I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make it work. They are optional but one of them must be present. That's true, one of them must be set. On second thought, we could just use the default search field if both fieldname and fieldtype are not specified. Sounds like a reasonable fallback.
          Hide
          Yonik Seeley added a comment -

          I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype

          I had tried that too though - I just posted the simplest example that conformed to the javadoc.
          This also gives me an exception:

          http://localhost:8983/solr/analysis/field?analysis.fieldvalue=hello&analysis.fieldname=text

          Show
          Yonik Seeley added a comment - I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype I had tried that too though - I just posted the simplest example that conformed to the javadoc. This also gives me an exception: http://localhost:8983/solr/analysis/field?analysis.fieldvalue=hello&analysis.fieldname=text
          Hide
          Shalin Shekhar Mangar added a comment -

          A null-check was missing. Now we use the default search field if both field type and name are not specified. All the tests were actually testing with a request containing both field name and field type.

          I'll commit this shortly.

          Show
          Shalin Shekhar Mangar added a comment - A null-check was missing. Now we use the default search field if both field type and name are not specified. All the tests were actually testing with a request containing both field name and field type. I'll commit this shortly.
          Hide
          Shalin Shekhar Mangar added a comment -

          Committed revision 771099.

          Show
          Shalin Shekhar Mangar added a comment - Committed revision 771099.
          Hide
          Yonik Seeley added a comment -

          Finally got around to reviewing the interface for some of this stuff...
          there are a number of oddities (things like using the complete text of a field as the key or name in a map value, listing the value twice, requiring a uniqueKey)... but then I started thinking about who will use this, and maybe it's not worth trying to fix it up right now.

          And that got me thinking why there are SolrJ classes dedicated to it... and I'm not sure that we should take up space for that.

          IMO, common things in SolrJ should have easier, more type safe interfaces and uncommon, advanced features should be accessed via the generic APIs in order to keep the interfaces smaller and more understandable for the general user.

          Show
          Yonik Seeley added a comment - Finally got around to reviewing the interface for some of this stuff... there are a number of oddities (things like using the complete text of a field as the key or name in a map value, listing the value twice, requiring a uniqueKey)... but then I started thinking about who will use this, and maybe it's not worth trying to fix it up right now. And that got me thinking why there are SolrJ classes dedicated to it... and I'm not sure that we should take up space for that. IMO, common things in SolrJ should have easier, more type safe interfaces and uncommon, advanced features should be accessed via the generic APIs in order to keep the interfaces smaller and more understandable for the general user.
          Hide
          Uri Boness added a comment -

          there are a number of oddities (things like using the complete text of a field as the key or name in a map value, listing the value twice, requiring a uniqueKey)

          Yes, I know... I didn't feel best with it as well, but that how the original analysis handler worked so I just followed that

          And that got me thinking why there are SolrJ classes dedicated to it... and I'm not sure that we should take up space for that.

          IMO, common things in SolrJ should have easier, more type safe interfaces and uncommon, advanced features should be accessed via the generic APIs in order to keep the interfaces smaller and more understandable for the general user.

          Why wouldn't you want SolrJ support for it? IMO, it would be great to have SolrJ support for every request handler that ships out of the box with Solr. It makes the user's life simpler and easier to use Solr this way. And as far as space is concerned... how much does it really add to the overall size of solrj jar? In any case, we're not talking of megabytes here... and for most people it doesn't really matter - I think it's more important to provide a simple and user friendly API to work with, and if the cost is to add a few extra classes I think it's a pretty cheap price to pay. It is true (I also mentioned it before) that it's not a major functionality that will be used often... but it is useful to have for tooling support - We're using it in one of the tools that we've created and the admin website can use it as well.

          Show
          Uri Boness added a comment - there are a number of oddities (things like using the complete text of a field as the key or name in a map value, listing the value twice, requiring a uniqueKey) Yes, I know... I didn't feel best with it as well, but that how the original analysis handler worked so I just followed that And that got me thinking why there are SolrJ classes dedicated to it... and I'm not sure that we should take up space for that. IMO, common things in SolrJ should have easier, more type safe interfaces and uncommon, advanced features should be accessed via the generic APIs in order to keep the interfaces smaller and more understandable for the general user. Why wouldn't you want SolrJ support for it? IMO, it would be great to have SolrJ support for every request handler that ships out of the box with Solr. It makes the user's life simpler and easier to use Solr this way. And as far as space is concerned... how much does it really add to the overall size of solrj jar? In any case, we're not talking of megabytes here... and for most people it doesn't really matter - I think it's more important to provide a simple and user friendly API to work with, and if the cost is to add a few extra classes I think it's a pretty cheap price to pay. It is true (I also mentioned it before) that it's not a major functionality that will be used often... but it is useful to have for tooling support - We're using it in one of the tools that we've created and the admin website can use it as well.
          Hide
          Koji Sekiguchi added a comment -

          Hmm, I think the order of Tokenizer/TokenFilters in response is unconsidered. For example, I cannot take out Tokenizer/TokenFilters from ruby response in order...

          Show
          Koji Sekiguchi added a comment - Hmm, I think the order of Tokenizer/TokenFilters in response is unconsidered. For example, I cannot take out Tokenizer/TokenFilters from ruby response in order...
          Hide
          Koji Sekiguchi added a comment -

          I'd like to use NamedList rather than SimpleOrderedMap. If there is no objections, I'll commit soon. All tests pass.

          Show
          Koji Sekiguchi added a comment - I'd like to use NamedList rather than SimpleOrderedMap. If there is no objections, I'll commit soon. All tests pass.
          Hide
          Hoss Man added a comment -

          go for it koji

          Show
          Hoss Man added a comment - go for it koji
          Hide
          Koji Sekiguchi added a comment -

          Committed revision 827032. Thanks.

          Show
          Koji Sekiguchi added a comment - Committed revision 827032. Thanks.
          Hide
          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4

          Show
          Grant Ingersoll added a comment - Bulk close for Solr 1.4

            People

            • Assignee:
              Koji Sekiguchi
              Reporter:
              Uri Boness
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development