Solr
  1. Solr
  2. SOLR-7182

Make the Schema-API a first class citizen of SolrJ

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 5.3, 6.0
    • Component/s: SolrJ
    • Labels:
    • Environment:

      any

      Description

      There are several Solr APIs that are handled as first class citizens in SolrJ, esp. the Node API and the Collections API, i.e. they have their own xxxAdminRequest Object extended from the SolrRequest Class. As someone who programmatically changes Schemas a lot, I had hoped to see the Schema API handled first class in release 5.0, too. As far as I dug into the code and docs of SolrJ 5.0, that did not happen. If there is a reasonable point why this won't happen at all, I would really like to hear it. If the only reason is, that nobody had time for this, I would happily help out here.

      1. SOLR-7182.patch
        116 kB
        Shalin Shekhar Mangar
      2. SOLR-7182.patch
        115 kB
        Marius Grama
      3. SOLR-7182.patch
        114 kB
        Marius Grama
      4. SOLR-7182.patch
        106 kB
        Marius Grama
      5. SOLR-7182.patch
        102 kB
        Marius Grama
      6. SOLR-7182.patch
        68 kB
        Marius Grama
      7. SOLR-7182.patch
        71 kB
        Marius Grama
      8. SOLR-7182-branch_5x.patch
        51 kB
        Shalin Shekhar Mangar

        Issue Links

          Activity

          Hide
          Shalin Shekhar Mangar added a comment -

          Lack of time is really the only reason. A patch would be appreciated!

          Show
          Shalin Shekhar Mangar added a comment - Lack of time is really the only reason. A patch would be appreciated!
          Hide
          Sven Windisch added a comment -

          Thanks for your answer. Is there an approximate date for the 5.1 release, so I can have a rough idea when to be finished with this?

          Show
          Sven Windisch added a comment - Thanks for your answer. Is there an approximate date for the 5.1 release, so I can have a rough idea when to be finished with this?
          Hide
          Erick Erickson added a comment -

          Sven:

          There's really never a firm date in mind, it's more like whenever enough has accumulated that someone makes it a priority. It might be in the next several weeks but nothing's been planned yet.

          Usually someone announces on dev list that they're thinking about cutting a new release giving people a few day's warning, no such announcement has been made.

          Show
          Erick Erickson added a comment - Sven: There's really never a firm date in mind, it's more like whenever enough has accumulated that someone makes it a priority. It might be in the next several weeks but nothing's been planned yet. Usually someone announces on dev list that they're thinking about cutting a new release giving people a few day's warning, no such announcement has been made.
          Hide
          Noble Paul added a comment -

          I would say , don't worry about any particular release. Just build it . all the point releases will happen in quick succession . If it does not go in 5.1 , there will be a 5.2 pretty soon

          Show
          Noble Paul added a comment - I would say , don't worry about any particular release. Just build it . all the point releases will happen in quick succession . If it does not go in 5.1 , there will be a 5.2 pretty soon
          Hide
          Marius Grama added a comment -

          I see that this feature request hasn't been updated in a while. I've started working on it and implemented the GET methods for the Schema API and will continue with the POST methods.
          After having a look at the document https://cwiki.apache.org/confluence/display/solr/Schema+API I see that there can be exposed also Managed Resources REST API for the Schema API. https://cwiki.apache.org/confluence/display/solr/Managed+Resources
          Does the Schema API need to support also these features?

          Show
          Marius Grama added a comment - I see that this feature request hasn't been updated in a while. I've started working on it and implemented the GET methods for the Schema API and will continue with the POST methods. After having a look at the document https://cwiki.apache.org/confluence/display/solr/Schema+API I see that there can be exposed also Managed Resources REST API for the Schema API. https://cwiki.apache.org/confluence/display/solr/Managed+Resources Does the Schema API need to support also these features?
          Hide
          Marius Grama added a comment -

          I've attached a patch that offers SchemaRequest and SchemaResponse classes.
          Due to the fact that there can be executed multiple updates in a single POST, I've opted for aggregate multiple UpdateSchemaRequest classes in a MultipleUpdateSchemaRequest class. When obtaining the JSON representation to be sent to the /schema/update REST resource, all the update requests will be gathered in a NamedList and serialized in JSON format, by using a slightly modified version of noggit's JSONWriter that can deal with objects of type NamedList.

          The unit tests contained in the class SchemaTest are assuring the accuracy and failure of the methods.

          Support for managed schema API is not explicitly offered, but it can be achieved by simply extending the class within the client applications that need it.

          Show
          Marius Grama added a comment - I've attached a patch that offers SchemaRequest and SchemaResponse classes. Due to the fact that there can be executed multiple updates in a single POST, I've opted for aggregate multiple UpdateSchemaRequest classes in a MultipleUpdateSchemaRequest class. When obtaining the JSON representation to be sent to the /schema/update REST resource, all the update requests will be gathered in a NamedList and serialized in JSON format, by using a slightly modified version of noggit's JSONWriter that can deal with objects of type NamedList. The unit tests contained in the class SchemaTest are assuring the accuracy and failure of the methods. Support for managed schema API is not explicitly offered, but it can be achieved by simply extending the class within the client applications that need it.
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Marius Grama! This is a great start. I'd very much like to have shorter class names e.g. SchemaRequest.DeleteFieldType instead of SchemaRequest.DeleteFieldTypeUpdateSchemaRequest. This is in similar to the CollectionAdmin API which has CollectionAdminRequest.Create and CollectionAdminRequest.AddReplica and so on.

          Show
          Shalin Shekhar Mangar added a comment - Thanks Marius Grama ! This is a great start. I'd very much like to have shorter class names e.g. SchemaRequest.DeleteFieldType instead of SchemaRequest.DeleteFieldTypeUpdateSchemaRequest. This is in similar to the CollectionAdmin API which has CollectionAdminRequest.Create and CollectionAdminRequest.AddReplica and so on.
          Hide
          Marius Grama added a comment -

          Thank you Shalin Shekhar Mangar for having a look over this patch. I've udated the inner class names to be consistent with the model of CollectionAdminRequest class and attached a new version of the patch. I'd appreciate any feedback in improving this client API.

          Show
          Marius Grama added a comment - Thank you Shalin Shekhar Mangar for having a look over this patch. I've udated the inner class names to be consistent with the model of CollectionAdminRequest class and attached a new version of the patch. I'd appreciate any feedback in improving this client API.
          Hide
          Shalin Shekhar Mangar added a comment -

          Can we remove the use of NamedList in the SchemaRequest public API? The NamedList is an internal Solr collection type and we should not force users to learn this API when they can just as easily use a java.util.Map.

          Show
          Shalin Shekhar Mangar added a comment - Can we remove the use of NamedList in the SchemaRequest public API? The NamedList is an internal Solr collection type and we should not force users to learn this API when they can just as easily use a java.util.Map.
          Hide
          Marius Grama added a comment -

          My first intention when writing the Schema API client was indeed to work with Map<String, Object> type for the parameters. When doing the first tests of the client I noticed that when dealing with the response I needed to work with NamedList
          e.g. :

          NamedList<Object> fieldProperties = (NamedList<Object>) fieldResponse.getResponse().get("field");
          

          which made me use NamedList<Object> for convenience in the request methods.
          I will submit shortly a new patch containing Map<String, Object> parameters instead of NamedList<Object>

          Show
          Marius Grama added a comment - My first intention when writing the Schema API client was indeed to work with Map<String, Object> type for the parameters. When doing the first tests of the client I noticed that when dealing with the response I needed to work with NamedList e.g. : NamedList< Object > fieldProperties = (NamedList< Object >) fieldResponse.getResponse().get( "field" ); which made me use NamedList<Object> for convenience in the request methods. I will submit shortly a new patch containing Map<String, Object> parameters instead of NamedList<Object>
          Hide
          Shalin Shekhar Mangar added a comment -

          Yeah, I was wondering if we should expand the SchemaResponse as well. The user should be able to read the schema using proper objects instead of using NamedList directly.

          Show
          Shalin Shekhar Mangar added a comment - Yeah, I was wondering if we should expand the SchemaResponse as well. The user should be able to read the schema using proper objects instead of using NamedList directly.
          Hide
          Marius Grama added a comment -

          I've taken out the usage of NamedList from the API exposed by the SchemaRequest.
          I've also created specific wrappers for the response objects corresponding to the Schema API so that the clients do not need to iterate through the response NamedList.
          For wrapping the FieldType add/replace request object I've introduced the class FieldTypeDefinition. I have to do some extra testing to make sure that I didn't miss any of the properties that can be set (I've used FieldTypeXmlAdapter as template for the properties).
          For wrapping the response messages for schema and fieldtype calls I've introduced SchemaRepresentation, respectively FieldTypeRepresentation.

          Show
          Marius Grama added a comment - I've taken out the usage of NamedList from the API exposed by the SchemaRequest. I've also created specific wrappers for the response objects corresponding to the Schema API so that the clients do not need to iterate through the response NamedList. For wrapping the FieldType add/replace request object I've introduced the class FieldTypeDefinition. I have to do some extra testing to make sure that I didn't miss any of the properties that can be set (I've used FieldTypeXmlAdapter as template for the properties). For wrapping the response messages for schema and fieldtype calls I've introduced SchemaRepresentation, respectively FieldTypeRepresentation.
          Hide
          Shalin Shekhar Mangar added a comment -

          This is excellent!

          1. Can we rename SchemaRepresentation's members fieldsProperties, dynamicFieldsProperties, fieldTypesRepresentations, copyFieldsProperties to fields, dynamicFields, fieldTypes, copyFields respectively?
          2. Similarly, rename FieldTypeDefinition's fieldTypeProperties, analyzerDefinition, indexAnalyzerDefinition, queryAnalyzerDefinition, multiTermAnalyzerDefinition, similarityProperties to fieldType, indexAnalyzer, queryAnalyzer, multiTermAnalyzer and similarity respectively.

          As you can see, I very much prefer shorter names

          Show
          Shalin Shekhar Mangar added a comment - This is excellent! Can we rename SchemaRepresentation's members fieldsProperties, dynamicFieldsProperties, fieldTypesRepresentations, copyFieldsProperties to fields, dynamicFields, fieldTypes, copyFields respectively? Similarly, rename FieldTypeDefinition's fieldTypeProperties, analyzerDefinition, indexAnalyzerDefinition, queryAnalyzerDefinition, multiTermAnalyzerDefinition, similarityProperties to fieldType, indexAnalyzer, queryAnalyzer, multiTermAnalyzer and similarity respectively. As you can see, I very much prefer shorter names
          Hide
          Marius Grama added a comment -

          Shalin Shekhar Mangar I've followed the recommendations that you've provided me. Thanks for the feedback. I appreciate it.

          In FieldTypeDefinition instead of fieldTypeProperties i didn't change the field name to fieldType, but to attributes. I thought it more semantically appropriate to its purpose.

          For the class names I'd appreciate FieldTypeDefinition, AnalyzerDefintion (request specific), FieldTypeRepresentation, SchemaRepresentation (response specific) if you would come with some more appropriate class names.

          I've discovered SOLR-7679 bug along the way. I'd appreciate it if this would be commited soon so that the current unit tests for SOLR-7182 can pass.

          Show
          Marius Grama added a comment - Shalin Shekhar Mangar I've followed the recommendations that you've provided me. Thanks for the feedback. I appreciate it. In FieldTypeDefinition instead of fieldTypeProperties i didn't change the field name to fieldType , but to attributes . I thought it more semantically appropriate to its purpose. For the class names I'd appreciate FieldTypeDefinition, AnalyzerDefintion (request specific), FieldTypeRepresentation, SchemaRepresentation (response specific) if you would come with some more appropriate class names. I've discovered SOLR-7679 bug along the way. I'd appreciate it if this would be commited soon so that the current unit tests for SOLR-7182 can pass.
          Hide
          Marius Grama added a comment -

          Created a new version of the patch containing modifications added due to the newest developments done as part of the resolved Schema API issues (see related issue links).

          Show
          Marius Grama added a comment - Created a new version of the patch containing modifications added due to the newest developments done as part of the resolved Schema API issues (see related issue links).
          Hide
          Noble Paul added a comment -

          Can we rename the Response objects to have a "Response" suffix. That is the convention that we use in the other APIs

          e.g:

           public static class FieldType extends SolrResponseBase {}
          

          to

           public static class FieldTypeResponse extends SolrResponseBase {}
          

          It also creates confusion because class with exact same name is used in the Request as well

          Show
          Noble Paul added a comment - Can we rename the Response objects to have a "Response" suffix. That is the convention that we use in the other APIs e.g: public static class FieldType extends SolrResponseBase {} to public static class FieldTypeResponse extends SolrResponseBase {} It also creates confusion because class with exact same name is used in the Request as well
          Hide
          Marius Grama added a comment -

          Noble Paul I've attached a patch containing changed names for the response classes. The reasoning that I've used by having same names for request and response classes is that they are static inner classes (e.g. : SchemaRequest.FieldType, SchemaResponse.FieldType) and when using them their purpose is obvious. Nevertheless having the same class names is error prone, so I consider this a good suggestion.

          Show
          Marius Grama added a comment - Noble Paul I've attached a patch containing changed names for the response classes. The reasoning that I've used by having same names for request and response classes is that they are static inner classes (e.g. : SchemaRequest.FieldType, SchemaResponse.FieldType) and when using them their purpose is obvious. Nevertheless having the same class names is error prone, so I consider this a good suggestion.
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Marius Grama. I fixed a few failures reported by ant precommit:

          1. Use the String.format method with the ROOT locale
          2. Javadoc warning in SchemaRequestJSONWriter.write
          3. Missing package-info.java for the two new packages introduced in this patch
          4. Type erasure warning in Arrays.asList used in testMultipleUpdateRequestAccuracy

          ant precommit is happy and all tests pass. I'll commit this shortly.

          Show
          Shalin Shekhar Mangar added a comment - Thanks Marius Grama . I fixed a few failures reported by ant precommit: Use the String.format method with the ROOT locale Javadoc warning in SchemaRequestJSONWriter.write Missing package-info.java for the two new packages introduced in this patch Type erasure warning in Arrays.asList used in testMultipleUpdateRequestAccuracy ant precommit is happy and all tests pass. I'll commit this shortly.
          Hide
          ASF subversion and git services added a comment -

          Commit 1686650 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1686650 ]

          SOLR-7182: Make the Schema-API a first class citizen of SolrJ

          Show
          ASF subversion and git services added a comment - Commit 1686650 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1686650 ] SOLR-7182 : Make the Schema-API a first class citizen of SolrJ
          Hide
          Shalin Shekhar Mangar added a comment -

          This patch applies on branch_5x. I had to make a few adjustments to compile and get precommit passing on branch_5x with Java7.

          Show
          Shalin Shekhar Mangar added a comment - This patch applies on branch_5x. I had to make a few adjustments to compile and get precommit passing on branch_5x with Java7.
          Hide
          ASF subversion and git services added a comment -

          Commit 1686654 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1686654 ]

          SOLR-7182: Make the Schema-API a first class citizen of SolrJ

          Show
          ASF subversion and git services added a comment - Commit 1686654 from shalin@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1686654 ] SOLR-7182 : Make the Schema-API a first class citizen of SolrJ
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Sven for opening the issue and Marius for all the work!

          Show
          Shalin Shekhar Mangar added a comment - Thanks Sven for opening the issue and Marius for all the work!
          Hide
          Shalin Shekhar Mangar added a comment -

          Bulk close for 5.3.0 release

          Show
          Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Sven Windisch
            • Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development