Solr
  1. Solr
  2. SOLR-8131

Make ManagedIndexSchemaFactory as the default in Solr

    Details

      Description

      The techproducts and other examples shipped with Solr all use ClassicIndexSchemaFactory, which disables every Schema API call that modifies the schema. It would be nice to support both read and write Schema APIs without needing to enable data-driven or schemaless mode.

      I propose to change all 5.x examples to explicitly use ManagedIndexSchemaFactory and to enable ManagedIndexSchemaFactory by default in trunk (6.x).

      1. create-core.png
        207 kB
        Varun Thacker
      2. SOLR-8131_5x.patch
        17 kB
        Varun Thacker
      3. SOLR-8131.patch
        97 kB
        Varun Thacker
      4. SOLR-8131.patch
        95 kB
        Varun Thacker
      5. SOLR-8131.patch
        89 kB
        Varun Thacker
      6. SOLR-8131.patch
        90 kB
        Varun Thacker
      7. SOLR-8131.patch
        14 kB
        Varun Thacker
      8. SOLR-8131-schemaless-fix.patch
        2 kB
        Shalin Shekhar Mangar
      9. SOLR-8131-schemaless-fix.patch
        0.9 kB
        Shalin Shekhar Mangar

          Activity

          Noble Paul added a comment -

          👍

          Varun Thacker added a comment -

          +1

          Just to clarify what we'll have is a `managed-schema` file and no `schema.xml` file in the default configs right?

          Shalin Shekhar Mangar added a comment -

          Just to clarify what we'll have is a `managed-schema` file and no `schema.xml` file in the default configs right?

          Yeah, I think the default is to rename any existing schema.xml file to schema.xml.bak and afterwards use 'managed-schema' as the generated schema file name.

          Varun Thacker added a comment -

          Yeah, I think the default is to rename any existing schema.xml file to schema.xml.bak and afterwards use 'managed-schema' as the generated schema file name.

          The current data_driven config doesn't have a schema.bak file.

          Also, if we enable it by default in 6.0, is the "mutable" flag still useful?

          Alexandre Rafalovitch added a comment -

          What about all the embedded documentation in the examples that disappears on the first run with managed schema? Including all the commented-out sections and "this is default" sections.

          Shalin Shekhar Mangar added a comment -

          The current data_driven config doesn't have a schema.bak file

          That's because the data driven config does not have a schema.xml and starts off directly with a managed-schema file. The techproducts and basic configs examples do have a schema.xml (which I wasn't planning on removing), which will be renamed to schema.xml.bak.

          What about all the embedded documentation in the examples that disappears on the first run with managed schema? Including all the commented-out sections and "this is default" sections.

          Good point, Alexandre. What do you think we should do? Maybe we can create a page in the ref guide which has all that information instead? Another option (don't know how feasible it'd be) is to have a describe mode in the /schema API which prints helpful documentation about every enabled option/plugin in the schema?

          Alexandre Rafalovitch added a comment -

          I would be all over anything that's self-documenting. API endpoints, analyzers, etc. For API, something like http://swagger.io/ could help. That would enable other newbie-oriented use cases too. E.g. auto-generated UI for https://www.getpostman.com/ .

          This deserves its own discussion, really.

          A page in the ref guide could be a simpler option too, especially if the comments are hyperlinked to the specific guide sections. That would give people jumping-off points from the context of the config file into more detailed descriptions.

          Erick Erickson added a comment -

          Apologies in advance if I'm missing something here and don't get me wrong, I love where this could go; "use solr/bin to set up your test cluster then go to the admin UI to modify your configs" would simplify greatly the new user experience.

          The current "use zkcli to update your config set after you make changes" is really clumsy and I've seen countless users confused by this, especially misunderstanding of the bootstrapping options. Not to mention all the confusion when a manual edit introduces a syntax error and we get "help me, all of the sudden Solr doesn't start"....

          However (you knew this was coming):
          If this is the default, then the only way to modify the schema will be with the Schema API, right? So rather than allow someone to get into the schema file and do a bunch of manual edits we'll force them to issue some long command like below. I'm NOT knocking it as an API call, it's a perfectly fine API, but I'm sure not going to be happy typing it out 100 times for adding 100 fields to my schema. Or writing a script.....

          curl -X POST -H 'Content-type:application/json' --data-binary '{
            "add-field": { "name":"sell-by", "type":"tdate", "stored":true }
          }' http://localhost:8983/solr/gettingstarted/schema
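One way to take the sting out of the "100 fields" case is to build a single request body rather than issuing 100 calls. The sketch below is only an illustration: it assumes the Schema API accepts a list of definitions under one `add-field` key, and the field names and the `bulk_add_fields` helper are hypothetical. It only builds the JSON and does not contact a server.

```python
import json


def bulk_add_fields(names, field_type="tdate", stored=True):
    """Build one Schema API request body that adds many fields at once."""
    # Assumption: a list under a single "add-field" key adds every entry.
    return {
        "add-field": [
            {"name": name, "type": field_type, "stored": stored}
            for name in names
        ]
    }


# Hypothetical field names, for illustration only.
names = [f"field_{i}" for i in range(100)]
payload = bulk_add_fields(names)

# Serialized body, ready to be sent as application/json.
body = json.dumps(payload)
```

The resulting `body` could be posted to the same `/schema` endpoint and with the same `Content-type` header as the curl command above.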

          Sure, I can do all the mods to schema.xml first then switch to managed, but that's obscure and certainly not something that someone would even think about when just starting out. Or we can tell a novice to ignore the message about "generated by... do not modify".

          So my proposal would be to do one of two things:
          1> provide a "schema builder" as part of the new Angular Admin. It wouldn't have to be anything all that complex to start with and could grow over time. I can imagine several approaches that would be more or less work; these can be discussed in any tickets that come up, I suppose.
          or
          2> enhance the bin/solr script to add/delete/replace schema elements. I'd really like this option to allow changing one and only one parameter (syntax TBD), to make it simple to, say, just change stored from "true" to "false" for some field/fieldType.
          3> ???

          I prefer <1> by far, but "progress not perfection" is the goal here. And this could be quite a simple UI. We'd of course have to be careful that it didn't upload arbitrary XML, but that should be relatively simple.

          I'd go so far as to advocate that whatever we decide about <1> or <2> be put in a JIRA that blocks this one. I can be talked out of that, especially if making a UI that used the managed schema stuff under the covers would take a long time.

          Upayavira added a comment -

          What I'd like to see is the techproducts sample stay the same (it has always been a static schema, let it stay that way), but update the bin/solr script (or such) to upload a standard managed schema config on first start up - or allow it via an argument:

          bin/solr -conf managed start

          That way, once the server is started, everything needed can be done via the UI.

          I'd love to see this soon, as I want to use it in my talk at LuceneRevolution next week!! I'll also add basic "add field", "add dynamic field" and "copy field" support to the schema browser in time for that talk also

          What I'm saying is - there's no such thing as default configset anymore. What I'd suggest is that Solr starts with a single configset uploaded, which is in managed schema mode, or there is a very easy way to do it, on startup.

          Erick Erickson added a comment -

          Guess this kind of makes the discussion about whether 8139 should block this irrelevant, doesn't it?

          I'm sure 8139 will drive people toward the new admin UI as well...

          Alexandre Rafalovitch added a comment -

          You mean "bin/solr create_core -d", right? As a standard configset! Stored in the same place as the others, etc. Just checking? Otherwise, yet another way to create the collection would cause confusion.

          Upayavira added a comment -

          No, I don't want to create a collection. I want an empty managed-schema config set to be uploaded to a SolrCloud setup, either by default, or with a single switch on the bin/solr start command.

          Or... an extension to the configset API that says "load one of these standard configsets from disk", e.g:

          /admin/configs?action=LOAD&set=managed_schema

          There are no real security risks here, as the files are coming off disk. It allows the UI to pull in one of a set of sample configsets from the server/solr/configsets directory.

          Upayavira added a comment -

          I'm sure 8139 will drive people toward the new admin UI as well...

          That's my idea. Make the new UI visible via the "new UI" link, then start jamming it so full of new features that people will demand that it be made the default!

          Noble Paul added a comment - - edited

          Upayavira, why do you think people will resist moving to the new admin UI? I don't remember seeing any discussion where people were opposed to it.

          I'm NOT knocking it as an API call, it's a perfectly fine API, but I'm sure not going to be happy typing it out 100 times for adding 100 fields to my schema. Or writing a script.....

          I see the pain, Erick Erickson.

          Apart from what is proposed in SOLR-8139, we should add a modify-field command that lets you update just one attribute of a field, e.g.:

          curl -X POST -H 'Content-type:application/json' --data-binary '{
          "modify-field":
          { "name":"sell-by", "stored":true }
          }' http://localhost:8983/solr/gettingstarted/schema
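Note that `modify-field` is a proposal in this comment, not a command this thread shows to exist. As a rough client-side approximation, a one-attribute change can be merged into the full field definition and sent with the Schema API's `replace-field` command, which expects the complete definition. In the sketch below the helper name is hypothetical and the current definition is assumed to have been fetched from the schema beforehand.

```python
def replace_field_command(current, changes):
    """Merge a partial change into a full field definition.

    Returns a "replace-field" command body; the input dict is not modified.
    """
    merged = dict(current)   # copy so the caller's definition stays intact
    merged.update(changes)   # apply only the attributes that should change
    return {"replace-field": merged}


# Assumed current definition of the field (illustrative values).
current = {"name": "sell-by", "type": "tdate", "stored": False}

# Flip a single attribute, as the proposed modify-field would.
command = replace_field_command(current, {"stored": True})
```

The design choice here is to keep the merge on the client, so the server still receives a complete definition and nothing new is required of the API.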
          

          We should provide a simple admin page where I can type arbitrary commands to an endpoint. So all I need to do is type

          {
          "modify-field":
          { "name":"sell-by", "stored":true }
          }
          

          We plan to support HOCON as well, so all you would need to type is

          modify-field {name:sell-by, stored:true}
          
          Varun Thacker added a comment -

          What I'd like to see is the techproducts sample stay the same (it has always been a static schema, let it stay that way), but update the bin/solr script (or such) to upload a standard managed schema config on first start up

          I don't think that's a good idea. We might end up having the same issue we had with collection1.

          If ManagedIndexSchemaFactory is the default, we can have the schema editor that's in your plan without any problems, right?

          Upayavira added a comment -

          What I want is a way to say "when you start, start with some configs". Equally there can be a "start without configs" option. Or, an API that says "load a sample config set from disk" (i.e. from server/solr/configsets), so the user can bootstrap via the UI from one of the known provided configsets.

          Is the intention that a non-managed schema be deprecated? If not, I'd like there to be an example that works that way, and techproducts seems like a reasonable candidate.

          Upayavira added a comment -

          bq: Upayavira why do you think the people will resist moving to the new admin UI. I don't remember seeing any discussion where people are opposed to it.

          Because people don't like change. Because once people start using the new UI, I put money on the fact that they will start finding all sorts of details that I haven't considered. Giving some new features at least counterbalances that risk somewhat.

          Varun Thacker added a comment -

          Patch which changes all example schema files to explicitly use ManagedIndexSchemaFactory. If a schema factory is not specified starting from 6.0 Solr will use ManagedIndexSchemaFactory by default.
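The defaulting rule described here can be pictured with a small sketch. It is not Solr's implementation: it only models "use the `schemaFactory` element from solrconfig.xml if present, otherwise fall back to the proposed 6.0 default". The function name and the trimmed-down config strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Proposed default for 6.0 according to this issue.
DEFAULT_FACTORY = "ManagedIndexSchemaFactory"


def effective_schema_factory(solrconfig_xml, default=DEFAULT_FACTORY):
    """Return the schema factory class named in a solrconfig.xml string."""
    root = ET.fromstring(solrconfig_xml)
    element = root.find("schemaFactory")   # direct child of <config>
    if element is None:
        return default                     # nothing specified: use the default
    return element.get("class", default)


# Minimal, illustrative configs (real files contain much more).
explicit = '<config><schemaFactory class="ClassicIndexSchemaFactory"/></config>'
implicit = "<config></config>"

explicit_factory = effective_schema_factory(explicit)
implicit_factory = effective_schema_factory(implicit)
```

Under this model, a config that wants a static schema has to say so explicitly, which is what the 5.x part of the patch does in the other direction for the examples.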

          I have no idea how to make the tests pass with this change. We rewrite all the schema files to managed-schema. Firstly, we need to give the tests write permission to managed-schema (solr-tests.policy). More importantly, won't tests that don't run in their own VM overwrite the files under test-files/solr/collection1/solr?

          Uwe Schindler added a comment -

          I have no idea how to make the tests pass with this change. We rewrite all the schema files to managed-schema. Firstly, we need to give the tests write permission to managed-schema (solr-tests.policy). More importantly, won't tests that don't run in their own VM overwrite the files under test-files/solr/collection1/solr?

          Please don't do this permanently! Maybe only rewrite the old schemas once (automatically), disabling the security manager (e.g. run tests with -Dtests.useSecurityManager=false). After that all schemas should be converted. Then just clean up the directory, remove the old files and commit the changes. After that nothing should change the schemas anymore?

          The general rule is to make a clone of the core directory for tests that actually modify the core directory, e.g. update the schema. Most tests already do this.

          Varun Thacker added a comment -

          ManagedIndexSchemaFactory initializes itself by looking for the managed schema file (default name "managed-schema").
          If it's not present then it takes the specified schema-name.xml (default is schema.xml) and renames it to a file called managed-schema or to a name explicitly mentioned.

          We have lots of solrconfig and schema files in a common directory, and each test can use any combination of them. Since we need to specify the new managed-schema file name in the solrconfig file, and a test can use any combination, we can't manually change anything either.

          I don't see a way to have multiple schema/solrconfig files in the same directory and still support the renaming logic in ManagedIndexSchemaFactory.
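The collision can be made concrete with a toy model of the lookup-then-rename behavior as it is described in this thread. This is a sketch, not Solr's code: `load_managed` is a hypothetical stand-in, the `.bak` rename follows the earlier comment about schema.xml.bak, and the schema file contents are placeholders.

```python
import shutil
import tempfile
from pathlib import Path


def load_managed(conf_dir, classic_name="schema.xml", managed_name="managed-schema"):
    """Toy model: prefer the managed file, else convert the classic one."""
    conf = Path(conf_dir)
    managed = conf / managed_name
    if managed.exists():
        return managed.read_text()          # an existing managed file always wins
    classic = conf / classic_name
    shutil.copyfile(classic, managed)       # becomes the managed schema
    classic.rename(classic.with_name(classic.name + ".bak"))
    return managed.read_text()


with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp)
    (conf / "schema-a.xml").write_text("<schema name='a'/>")
    (conf / "schema-b.xml").write_text("<schema name='b'/>")

    # Two "tests" share one conf directory but ask for different schemas.
    first = load_managed(conf, classic_name="schema-a.xml")
    second = load_managed(conf, classic_name="schema-b.xml")
    leftover = sorted(p.name for p in conf.iterdir())
```

In this model the second caller silently receives schema A, because the shared `managed-schema` file already exists, and the shared directory has been modified along the way.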

          Uwe Schindler added a comment -

          renames it to a file called managed-schema or to a name explicitly mentioned.

          Why not specify "schema.xml" as the specified schema name? I have no idea why we need a different name here. In any case, we cannot allow write access to src/test-files! So there are 2 possibilities:

          • Fix all schemas/configs throughout Solr
          • Tests that directly access the test-files folder have to copy it to a temporary dir (like most tests already do). There is some utility method in SolrTestCaseJ4 / TestHarness to do this (I think).
          Uwe Schindler added a comment -

          I still don't understand why you cannot use the following approach, as described before! Run all tests one time with -Dtests.useSecurityManager=false. After this test run, all schema files should be renamed accordingly. All later runs then would not need write access anymore, so the security manager can be enabled again. To persist the modified/renamed schemas, just commit the changes after the run without the security manager.

          Varun Thacker added a comment -

          Hi Uwe,

          The problem is that we have lots of schema-*.xml / solrconfig-*.xml files under test-files/solr/collection1/conf. ManagedIndexSchemaFactory will rename a schema file to "managed-schema". So the test runs will overwrite the same file when run without the security manager.

          We cannot even specify a specific name instead of "managed-schema" in the solrconfig.xml files, since a test can use any combination of solrconfig/schema files.

          Uwe Schindler added a comment - - edited

          OK. So why do you want to change those configs to use managed schema at all? Those are static, test-only schemas that never change. The classic schema factory is not deprecated and never should be. So why not leave the "test schemas" as they are?

          I just want to say that these are different issues: making the new schema the default for new Solr installations, and fixing test schemas.

          So just change the defaults in this issue and then open new issues to change tests one by one to use managed schema (although I am against this, because we should also test the classic schema factory). All tests that use the managed schema factory have to be rewritten to use SolrTestCaseJ4's copy functionality to create a writeable clone. This also makes the tests behave correctly, because each would get a clean setup and would not overwrite files already there.

          Ishan Chattopadhyaya added a comment - - edited

          All tests that use the managed schema factory have to be rewritten to use SolrTestCaseJ4's copy functionality to create a writeable clone.

          Not sure if this helps here, but I once wrote a test based on managed schema, SpatialRPTFieldTypeTest (extends AbstractBadConfigTestBase, which extends SolrTestCaseJ4).

          AFAICT, that test copied all configs and schemas specified into temp directory. Then, it tried to find "managed-schema", couldn't find it and loaded schema-minimal.xml after copying it over to "managed-schema".

          o.a.s.s.ManagedIndexSchemaFactory The schema is configured as managed, but managed schema resource managed-schema not found - loading non-managed schema schema-minimal.xml instead
          

          Do you think the needed schema file for each test can be copied over to the temp directory as "schema-minimal.xml", from where it gets loaded up as "managed-schema"? I'm assuming that all tests would have a different temp directory, so there won't be a conflict there.
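The copy-to-temp-directory idea discussed above can be sketched with plain java.nio. This is only an illustration under stated assumptions: WritableConfClone, newSampleConf and cloneConf are hypothetical names, not SolrTestCaseJ4's actual copy helpers, and the sample directory stands in for a read-only test-files/conf folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr's test framework.
class WritableConfClone {

  // Creates a stand-in for a checked-in conf directory holding one schema file.
  static Path newSampleConf() {
    try {
      Path source = Files.createTempDirectory("test-files-conf");
      Files.write(source.resolve("schema-minimal.xml"),
          "<schema name=\"minimal\"/>".getBytes(StandardCharsets.UTF_8));
      return source;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Copies everything under sourceConf into a fresh temp directory and returns that directory.
  static Path cloneConf(Path sourceConf) {
    try (Stream<Path> paths = Files.walk(sourceConf)) {
      Path target = Files.createTempDirectory("conf-clone");
      // Files.walk visits a directory before its contents, so parents exist before files are copied.
      for (Path src : (Iterable<Path>) paths::iterator) {
        Path dest = target.resolve(sourceConf.relativize(src).toString());
        if (Files.isDirectory(src)) {
          Files.createDirectories(dest);
        } else {
          Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
        }
      }
      return target;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path source = newSampleConf();
    Path clone = cloneConf(source);
    // Anything written next to the copied schema lands in the clone only.
    Files.write(clone.resolve("managed-schema"),
        "<schema name=\"minimal\"/>".getBytes(StandardCharsets.UTF_8));
    System.out.println("clone has schema: " + Files.exists(clone.resolve("schema-minimal.xml")));
    System.out.println("source was modified: " + Files.exists(source.resolve("managed-schema")));
  }
}
```

Because every call to cloneConf creates its own temp directory, two tests that start from the same checked-in config never write to the same files.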

          Uwe Schindler added a comment -

          I'm assuming that all tests would have a different temp directory, so there won't be a conflict there.

          Exactly. This is why I said that all tests that actually modify the schema have to be copied over. Those with static schemas never changed should (In my opinion) still use the classic schema factory.

          Varun Thacker added a comment -

          Those with static schemas never changed should (In my opinion) still use the classic schema factory.

          Makes sense. I'll work on a new patch where these static schemas use the classic schema factory.

          Varun Thacker added a comment -

          New patch.

          • All example solrconfigs explicitly use ManagedIndexSchemaFactory by default
          • All test solrconfigs explicitly use ClassicIndexSchemaFactory unless they were using ManagedIndexSchemaFactory / testing managed schema
          • If a solrconfig doesn't mention a schemaFactory then ManagedIndexSchemaFactory will be used from 6.0

          We will keep the current behaviour in 5.x, i.e. if a solrconfig doesn't mention a schemaFactory then ClassicIndexSchemaFactory is used.

          Although the patch has become quite big, it shouldn't take long to review.

          Erik Hatcher added a comment -

          Varun Thacker - here's some feedback on that patch:

          • ShowFileRequestHandlerTest.java: maybe use the DEFAULT_MANAGED_SCHEMA_RESOURCE_NAME constant instead on "QueryRequest request = new QueryRequest(params("file","managed-schema"));"
          • In the spirit of less (explicit) config, we could maybe get away from specifying this incantation
            +  <schemaFactory class="ManagedIndexSchemaFactory">
            +    <bool name="mutable">true</bool>
            +    <str name="managedSchemaResourceName">managed-schema</str>
            +  </schemaFactory>
            

            explicitly in all the shipped configs and just leave it out entirely. (for one, no one really needs to change the resource name)

          • I like the test of managed schema being set in the shipped configs!
          Varun Thacker added a comment -

          Updated patch against trunk which takes Erik's feedback into account.

          With the current patch this will be the change in behaviour on trunk.

          • If no schema factory is mentioned in the solrconfig.xml file then ManagedIndexSchemaFactory will be used by default if luceneMatchVersion >= 6.0, else ClassicIndexSchemaFactory will be used.
          • None of the shipped configs explicitly mention a schema factory. Hence they will default to ManagedIndexSchemaFactory automatically in trunk.
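The defaulting rule in the list above can be written down as a small decision function. This is a hedged sketch, not Solr's code: SchemaFactoryDefaults and choose are hypothetical names, the version is reduced to its major number, and the threshold is assumed to be inclusive (6 and later).

```java
// Hypothetical sketch; class and method names are illustrative, not Solr's.
class SchemaFactoryDefaults {

  // explicitFactory: the class named in <schemaFactory>, or null when solrconfig.xml has none.
  // luceneMatchMajor: the major number taken from luceneMatchVersion (assumption: 6 is included).
  static String choose(String explicitFactory, int luceneMatchMajor) {
    if (explicitFactory != null) {
      return explicitFactory; // an explicit setting always wins
    }
    return luceneMatchMajor >= 6 ? "ManagedIndexSchemaFactory" : "ClassicIndexSchemaFactory";
  }

  public static void main(String[] args) {
    System.out.println(choose(null, 6));
    System.out.println(choose(null, 5));
    System.out.println(choose("ClassicIndexSchemaFactory", 6));
  }
}
```

Keying the default on luceneMatchVersion means a config carried over from 5.x keeps its old behaviour until its luceneMatchVersion is raised.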
          Varun Thacker added a comment -

          Hi Erik,

          ShowFileRequestHandlerTest.java: maybe use the DEFAULT_MANAGED_SCHEMA_RESOURCE_NAME constant instead on "QueryRequest request = new QueryRequest(params("file","managed-schema"));"

          I didn't make this change. The argument being that we should leave hardcoded constants in our tests. That way, if someone in the future accidentally changes DEFAULT_MANAGED_SCHEMA_RESOURCE_NAME, the tests would at least catch it as a break.

          explicitly in all the shipped configs and just leave it out entirely. (for one, no one really needs to change the resource name)

          Incorporated in the patch
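The argument for keeping the literal "managed-schema" in the test can be illustrated with a toy example. All names here are hypothetical and the constant is deliberately set to a wrong value to simulate an accidental change; this is not Solr's test code.

```java
// Hypothetical names; this only illustrates the argument, it is not Solr's test code.
class HardcodedConstantDemo {

  // Simulates an accidental edit: the intended value was "managed-schema".
  static final String DEFAULT_RESOURCE_NAME = "managed-schema.xml";

  // The file name the code under test actually uses.
  static String resourceName() {
    return DEFAULT_RESOURCE_NAME;
  }

  // A check written against the constant compares the value with itself.
  static boolean checkUsingConstant() {
    return resourceName().equals(DEFAULT_RESOURCE_NAME);
  }

  // A check written against a literal pins the expected value independently.
  static boolean checkUsingLiteral() {
    return resourceName().equals("managed-schema");
  }

  public static void main(String[] args) {
    System.out.println("constant-based check passes: " + checkUsingConstant());
    System.out.println("literal-based check passes: " + checkUsingLiteral());
  }
}
```

The constant-based check keeps passing after the accidental edit, while the literal-based check fails and flags the change.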

          Shalin Shekhar Mangar added a comment -

          Thanks Varun, there's one reproducible test failure with your patch applied:

            [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestSolrCLIRunExample -Dtests.method=testInteractiveSolrCloudExample -Dtests.seed=B06DF3AE906F4D27 -Dtests.slow=true -Dtests.locale=es_PY -Dtests.timezone=Pacific/Easter -Dtests.asserts=true -Dtests.file.encoding=UTF-8
             [junit4] ERROR   5.32s J1 | TestSolrCLIRunExample.testInteractiveSolrCloudExample <<<
             [junit4]    > Throwable #1: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://localhost:54786/solr/testCloudExamplePrompt_shard1_replica2: This IndexSchema is not mutable.
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([B06DF3AE906F4D27:6B1C1364A71A8841]:0)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:633)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:982)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:807)
             [junit4]    > 	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:150)
             [junit4]    > 	at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:174)
             [junit4]    > 	at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:139)
             [junit4]    > 	at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:153)
             [junit4]    > 	at org.apache.solr.util.TestSolrCLIRunExample.testInteractiveSolrCloudExample(TestSolrCLIRunExample.java:445)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
             [junit4]    > Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:54786/solr/testCloudExamplePrompt_shard1_replica2: This IndexSchema is not mutable.
             [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:372)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:325)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:608)
             [junit4]    > 	at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:605)
             [junit4]    > 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
             [junit4]    > 	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
             [junit4]    > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
             [junit4]    > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
             [junit4]    > 	... 1 more
          
          
          Varun Thacker added a comment -

          Updated patch. The one additional change in behaviour with this patch is that mutable=true for ManagedSchema by default. This was there in my first patch but slipped out in the subsequent ones.

          This also fixes the test failure which Shalin pointed out. Additionally there were some SolrJ tests failing because I had missed out some solrconfig files to add the schema factory to. Those have been fixed as well.

          If everything is looking fine then I'll run the tests one final time with nightly tests to make sure everything is okay and then commit it tomorrow (Friday).
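To make the mutable=true default mentioned above concrete, here is a minimal sketch of a schema object that rejects changes unless it is mutable. MutableSchemaSketch is a hypothetical class; only the error text is taken from the test failure reported earlier, and the use of IllegalStateException is an assumption made for the sketch.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a managed schema; not Solr's ManagedIndexSchema.
class MutableSchemaSketch {
  private final boolean mutable;
  private final Set<String> fields = new LinkedHashSet<>();

  // configuredMutable is null when the config does not set the flag; the sketch then defaults to true.
  MutableSchemaSketch(Boolean configuredMutable) {
    this.mutable = configuredMutable == null || configuredMutable;
  }

  // Adds a field, or fails with the message seen in the reported test failure.
  void addField(String name) {
    if (!mutable) {
      throw new IllegalStateException("This IndexSchema is not mutable.");
    }
    fields.add(name);
  }

  int fieldCount() {
    return fields.size();
  }

  public static void main(String[] args) {
    MutableSchemaSketch byDefault = new MutableSchemaSketch(null);
    byDefault.addField("title");
    System.out.println("fields after add: " + byDefault.fieldCount());
  }
}
```

With the flag left unset the schema accepts the new field; passing false reproduces the "not mutable" failure mode.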

          Shalin Shekhar Mangar added a comment -

          There are still some test failures, Varun

             [junit4] Tests with failures [seed: EF2ABB6034EFC3BC]:
             [junit4]   - org.apache.solr.analytics.facet.FieldFacetTest (suite)
             [junit4]   - org.apache.solr.analytics.expression.ExpressionTest (suite)
             [junit4]   - org.apache.solr.analytics.NoFacetTest (suite)
             [junit4]   - org.apache.solr.analytics.util.valuesource.FunctionTest (suite)
             [junit4]   - org.apache.solr.analytics.facet.RangeFacetTest (suite)
             [junit4]   - org.apache.solr.analytics.facet.FieldFacetExtrasTest (suite)
             [junit4]   - org.apache.solr.analytics.facet.QueryFacetTest (suite)
          

          Most are of the form:

             [junit4] ERROR   0.00s J1 | QueryFacetTest (suite) <<<
             [junit4]    > Throwable #1: java.security.AccessControlException: access denied ("java.io.FilePermission" "/home/shalin/work/oss/trunk/solr/contrib/analytics/src/test-files/solr/collection1/conf/managed-schema" "write")
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([EF2ABB6034EFC3BC]:0)
             [junit4]    > 	at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
             [junit4]    > 	at java.security.AccessController.checkPermission(AccessController.java:884)
             [junit4]    > 	at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
             [junit4]    > 	at java.lang.SecurityManager.checkWrite(SecurityManager.java:979)
             [junit4]    > 	at java.io.FileOutputStream.<init>(FileOutputStream.java:200)
             [junit4]    > 	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
             [junit4]    > 	at org.apache.solr.schema.ManagedIndexSchema.persistManagedSchema(ManagedIndexSchema.java:130)
             [junit4]    > 	at org.apache.solr.schema.ManagedIndexSchemaFactory.upgradeToManagedSchema(ManagedIndexSchemaFactory.java:271)
             [junit4]    > 	at org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:186)
             [junit4]    > 	at org.apache.solr.schema.ManagedIndexSchemaFactory.create(ManagedIndexSchemaFactory.java:46)
             [junit4]    > 	at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:75)
             [junit4]    > 	at org.apache.solr.util.TestHarness.<init>(TestHarness.java:97)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.createCore(SolrTestCaseJ4.java:572)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:562)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:404)
             [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:393)
             [junit4]    > 	at org.apache.solr.analytics.facet.QueryFacetTest.beforeClass(QueryFacetTest.java:39)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
          
          Varun Thacker added a comment -

          Updated patch which fixes the failing tests Shalin pointed out. All tests and precommit pass with this patch. I'll commit this soon.

          ASF subversion and git services added a comment -

          Commit 1718258 from Varun Thacker in branch 'dev/trunk'
          [ https://svn.apache.org/r1718258 ]

          SOLR-8131: Make ManagedIndexSchemaFactory the default schemaFactory when luceneMatchVersion >= 6

          Varun Thacker added a comment -

          Patch against the 5x branch. It changes no default behaviour and only modifies the example config files to use ManagedIndexSchemaFactory.

          ASF subversion and git services added a comment -

          Commit 1718264 from Varun Thacker in branch 'dev/trunk'
          [ https://svn.apache.org/r1718264 ]

          SOLR-8131: Add CHANGES entry under solr 5.4 as well mentioning change to ManagedIndexSchemaFactory in all example config files

          ASF subversion and git services added a comment -

          Commit 1718265 from Varun Thacker in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1718265 ]

          SOLR-8131: Make ManagedIndexSchemaFactory the default schemaFactory in all example config files

          Varun Thacker added a comment -

          Marking this Jira as Resolved. Thanks everyone!

          ASF subversion and git services added a comment -

          Commit 1718307 from Varun Thacker in branch 'dev/trunk'
          [ https://svn.apache.org/r1718307 ]

          SOLR-8131: fix test solrconfig.xml files for the contrib modules

          Varun Thacker added a comment -

          The example/files/solrconfig.xml in trunk still explicitly mentions the schemaFactory. Reopening this to remove it from trunk.

          ASF subversion and git services added a comment -

          Commit 1718768 from Varun Thacker in branch 'dev/trunk'
          [ https://svn.apache.org/r1718768 ]

          SOLR-8131: example/files config doesn't explicitly mention a schema factory + imporove upgrading instructions

          Shalin Shekhar Mangar added a comment -

          This has broken the schemaless feature in trunk. An easy way to reproduce is to create at least 2 shards and index some arbitrary json. The following errors repeat ad nauseam:

          ERROR - 2015-12-12 14:53:45.526; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.schema.ManagedIndexSchema; Bad version when trying to persist schema using 0 due to: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /configs/gettingstarted/managed-schema
          INFO  - 2015-12-12 14:53:45.526; [c:gettingstarted s:shard1 r:core_node1 x:gettingstarted_shard1_replica1] org.apache.solr.schema.ManagedIndexSchema; Failed to persist managed schema at /configs/gettingstarted/managed-schema - version mismatch
          

          This indicates that the schema version is not being updated from ZooKeeper. I found that the watch was not being set because ManagedIndexSchemaFactory.inform() was never called. This was because, when no schema factory is set in solrconfig.xml, IndexSchemaFactory.buildIndexSchema() created a ManagedIndexSchemaFactory object directly. This is a SolrCoreAware class, so it must be created using the resource loader so that the inform method is called automatically.
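The difference between creating a core-aware object directly and creating it through a loader can be sketched as follows. Everything here is a simplified, hypothetical stand-in: CoreAwareSketch, WatchingFactorySketch and LoaderSketch are not the real SolrCoreAware, ManagedIndexSchemaFactory or SolrResourceLoader, and the boolean flag merely marks where the ZooKeeper watch would be set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a "tell me when the core is ready" callback interface.
interface CoreAwareSketch {
  void inform(String coreName);
}

// Hypothetical factory that only starts watching for schema changes once informed.
class WatchingFactorySketch implements CoreAwareSketch {
  boolean watchRegistered = false;

  @Override
  public void inform(String coreName) {
    // In the scenario described above, this is where the watch would be registered.
    watchRegistered = true;
  }
}

// Hypothetical loader that remembers what it created so it can deliver inform() later.
class LoaderSketch {
  private final List<CoreAwareSketch> waiting = new ArrayList<>();

  WatchingFactorySketch newFactory() {
    WatchingFactorySketch factory = new WatchingFactorySketch();
    waiting.add(factory);
    return factory;
  }

  // Called once the core is ready; informs every object created through this loader.
  void informAll(String coreName) {
    for (CoreAwareSketch aware : waiting) {
      aware.inform(coreName);
    }
    waiting.clear();
  }
}

class InformDemo {
  public static void main(String[] args) {
    LoaderSketch loader = new LoaderSketch();
    WatchingFactorySketch direct = new WatchingFactorySketch(); // created with new: the loader never sees it
    WatchingFactorySketch tracked = loader.newFactory();        // created through the loader
    loader.informAll("gettingstarted_shard1_replica1");
    System.out.println("direct watch registered: " + direct.watchRegistered);
    System.out.println("tracked watch registered: " + tracked.watchRegistered);
  }
}
```

Only the object created through the loader receives the callback, which mirrors why a directly constructed factory never set its watch.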

          Shalin Shekhar Mangar added a comment -

          Trivial fix.

          But we need to add an integration test which can catch such problems.

          Shalin Shekhar Mangar added a comment -

          The same fix but I removed the explicit mention of the IndexSchemaFactory from solrconfig-schemaless.xml which makes TestCloudSchemaless fail without this fix.

          Varun Thacker added a comment -

          Hi Shalin,

          Does it make sense to have TestCloudSchemaless randomize between two solrconfig.xml files instead? One which explicitly specifies managed-schema and one which doesn't?

          Shalin Shekhar Mangar added a comment -

          I was assuming that we already have a test which will fail if a config with no schema factory definition accidentally defaults to ClassicIndexSchemaFactory?

          Varun Thacker added a comment -

          Yes, we have a test for that: TestManagedSchema#testDefaultSchemaFactory. Okay, so then we don't need to randomize here.

          Shalin Shekhar Mangar added a comment -

          Cool, I'll commit this then. All tests pass.

          ASF subversion and git services added a comment -

          Commit 1720083 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1720083 ]

          SOLR-8131: Use SolrResourceLoader to instantiate ManagedIndexSchemaFactory when no schema factory is specified in solrconfig.xml

          Mark Miller added a comment -

          I think this made it so that you cannot use the Admin UI to add a SolrCore out of the box? I think that's a tough user experience.

          Varun Thacker added a comment -

          Hi Mark,

          Sorry I didn't quite follow your comment.

          Are you saying that the Admin UI to add a core is broken? I think that didn't work out of the box in earlier versions as well.

          Mark Miller added a comment -

          What part was broken?

          Now you can't create a core because it makes you specify a schema.xml, which will fail.

          Varun Thacker added a comment -

          Hi Mark,

          I just tried this out on trunk:
          1. Started Solr using ./bin/solr start -e techproducts
          2. From the example/techproducts/solr folder, ran cp -r techproducts test; rm -r test/data/ test/core.properties to create an instance dir that has a conf/ directory
          3. Ran the core admin create command from the UI, and it created the core successfully for me. Attaching a screenshot of what I put in.

          Could you please tell me how you tried it out? I'll file a bug and fix whatever is broken.
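
          For anyone who wants to repeat the steps above without the Admin UI, here is a rough shell sketch. It swaps the UI step for a call to the CoreAdmin CREATE endpoint, which is not what was done in the original test. The helper names core_create_url and reproduce are hypothetical, and the host localhost:8983 assumes Solr is running on its default port. Nothing runs until reproduce is called from the Solr root directory.

```shell
#!/bin/sh
# Hypothetical helper: build the CoreAdmin CREATE URL for a core whose
# instance dir has the same name as the core (relative to the Solr home).
core_create_url() {
  host="$1"   # e.g. localhost:8983 (assumed default port)
  name="$2"   # core name, also used as instanceDir
  printf 'http://%s/solr/admin/cores?action=CREATE&name=%s&instanceDir=%s\n' \
    "$host" "$name" "$name"
}

# Hypothetical wrapper around the three steps; run from the Solr root.
reproduce() {
  # Step 1: start Solr with the techproducts example.
  ./bin/solr start -e techproducts

  # Step 2: copy the example core to get an instance dir with conf/,
  # then drop the copied index data and core.properties.
  (
    cd example/techproducts/solr || exit 1
    cp -r techproducts test
    rm -r test/data/ test/core.properties
  )

  # Step 3: create the core via the CoreAdmin API instead of the UI.
  curl "$(core_create_url localhost:8983 test)"
}
```

          Calling reproduce should leave a new core named test next to techproducts, provided the copied conf/ directory loads cleanly.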


            People

            • Assignee:
              Varun Thacker
              Reporter:
              Shalin Shekhar Mangar
            • Votes:
              2
              Watchers:
              10
