Details

      Description

      The current schema API does one operation at a time and the normal usecase is that users add multiple fields/fieldtypes/copyFields etc in one shot.

      example

      curl http://localhost:8983/solr/collection1/schema -H 'Content-type:application/json'  -d '{
          "add-field": {
              "name":"sell-by",
              "type":"tdate",
              "stored":true
          },
          "add-field":{
              "name":"catchall",
              "type":"text_general",
              "stored":false
          }
      }
      

      or

      curl http://localhost:8983/solr/collection1/schema -H 'Content-type:application/json'  -d '{
          "add-field":[ {
              "name":"sell-by",
              "type":"tdate",
              "stored":true
          },
          {
              "name":"catchall",
              "type":"text_general",
              "stored":false
          }]
      }
      
      1. SOLR-6476.patch
        59 kB
        Noble Paul
      2. SOLR-6476.patch
        55 kB
        Noble Paul
      3. SOLR-6476.patch
        45 kB
        Noble Paul
      4. SOLR-6476.patch
        44 kB
        Noble Paul
      5. SOLR-6476.patch
        36 kB
        Noble Paul
      6. SOLR-6476.patch
        36 kB
        Noble Paul
      7. SOLR-6476.patch
        33 kB
        Noble Paul
      8. SOLR-6476.patch
        20 kB
        Noble Paul

        Issue Links

          Activity

          Hide
          noble.paul Noble Paul added a comment -

          A first cut of working patch. No testcases ,

          Show
          noble.paul Noble Paul added a comment - A first cut of working patch. No testcases ,
          Hide
          noble.paul Noble Paul added a comment -

          This time with testcases and support for add-copy-field and add-type

          Show
          noble.paul Noble Paul added a comment - This time with testcases and support for add-copy-field and add-type
          Hide
          noble.paul Noble Paul added a comment -

          I plan to commit this soon

          Show
          noble.paul Noble Paul added a comment - I plan to commit this soon
          Hide
          steve_rowe Steve Rowe added a comment -

          Noble, I looked at your latest patch and I found a few issues:

          1. The bulk schema api should be added to TestCloudManagedSchemaConcurrent
          2. TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings - those are way easier to look at if you convert the backslash-double-quotes to single-quotes and pass the string to the json() method.
          3. One of the tests should have a non-trivial fieldtype definition, like at least one analyzer.
          4. In SchemaManager.doOperation(), you limit retries to MAX_TRIES when there is a SchemaChangedInZkException, but none of the other schema APIs do that - why not let it continue until success?
          5. SchemaManager.ADD_FIELD_TYPE should be "add-field-type" instead of "add-type". (Solr could introduce a non-field type at some point in the future; even with "-field-" added, it's still not the longest command.)
          6. SchemaManager has lots of non-generic collections - I looked at a couple, and they could be generified - maybe they all could?
          7. IndexSchema.addDynamicFields() is missing the persist param you added to all the other add*() methods; also, it always persists, even when called from the bulk mode schema api.
          8. Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false
          Show
          steve_rowe Steve Rowe added a comment - Noble, I looked at your latest patch and I found a few issues: The bulk schema api should be added to TestCloudManagedSchemaConcurrent TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings - those are way easier to look at if you convert the backslash-double-quotes to single-quotes and pass the string to the json() method. One of the tests should have a non-trivial fieldtype definition, like at least one analyzer. In SchemaManager.doOperation() , you limit retries to MAX_TRIES when there is a SchemaChangedInZkException , but none of the other schema APIs do that - why not let it continue until success? SchemaManager.ADD_FIELD_TYPE should be "add-field-type" instead of "add-type". (Solr could introduce a non-field type at some point in the future; even with "-field-" added, it's still not the longest command.) SchemaManager has lots of non-generic collections - I looked at a couple, and they could be generified - maybe they all could? IndexSchema.addDynamicFields() is missing the persist param you added to all the other add*() methods; also, it always persists, even when called from the bulk mode schema api. Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false
          Hide
          noble.paul Noble Paul added a comment - - edited

          The bulk schema api should be added to TestCloudManagedSchemaConcurrent

          Why can't it be in a new class?

          TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings

          That was deliberate. Most JSON writers use double quotes.

          In SchemaManager.doOperation(), you limit retries to MAX_TRIES

          I can make the variable bigger , but I somehow hate infinite loops

          SchemaManager.ADD_FIELD_TYPE should be "add-field-type"

          sure

          SchemaManager has lots of non-generic collections

          will fix them

          IndexSchema.addDynamicFields() is missing the persist param you added to all the other add*() methods;

          will fix

          Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false

          We need to remove those methods without persist option

          Show
          noble.paul Noble Paul added a comment - - edited The bulk schema api should be added to TestCloudManagedSchemaConcurrent Why can't it be in a new class? TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings That was deliberate. Most JSON writers use double quotes. In SchemaManager.doOperation(), you limit retries to MAX_TRIES I can make the variable bigger , but I somehow hate infinite loops SchemaManager.ADD_FIELD_TYPE should be "add-field-type" sure SchemaManager has lots of non-generic collections will fix them IndexSchema.addDynamicFields() is missing the persist param you added to all the other add*() methods; will fix Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false We need to remove those methods without persist option
          Hide
          noble.paul Noble Paul added a comment -

          I also want to ask the preference of "add-field" Vs "addField" in the command names

          Show
          noble.paul Noble Paul added a comment - I also want to ask the preference of "add-field" Vs "addField" in the command names
          Hide
          noble.paul Noble Paul added a comment -

          New patch posted with changes incorporated

          Show
          noble.paul Noble Paul added a comment - New patch posted with changes incorporated
          Hide
          steve_rowe Steve Rowe added a comment -

          The bulk schema api should be added to TestCloudManagedSchemaConcurrent

          Why can't it be in a new class?

          It can, but you don't have any distributed tests at all. All existing schema API modification functionality gets distributed testing via TestCloudManagedSchemaConcurrent at this point.

          TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings

          That was deliberate. Most JSON writers use double quotes.

          You misunderstand me: SolrTestCaseJ4.json(String) converts single-quotes to double-quotes. Several Solr tests use it to get most-JSON-writer-compatible output without causing backslash-eye-bleed. I'm suggesting you do the same.

          In SchemaManager.doOperation(), you limit retries to MAX_TRIES

          I can make the variable bigger , but I somehow hate infinite loops

          It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening.

          SchemaManager.ADD_FIELD_TYPE should be "add-field-type"

          sure

          I also want to ask the preference of "add-field" Vs "addField" in the command names

          Thanks - I like "add-field" better, but either would be fine.

          SchemaManager has lots of non-generic collections

          will fix them

          I only see one modification here (you added <String> on the left hand side but forgot to put in the diamond operator on the right side), still lots of non-generic collection usages - is there a problem with fixing the others?

          Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false

          We need to remove those methods without persist option

          You removed some sugar methods, specifically for adding a single dynamic field, but left other sugar methods intact, e.g. addField() - why the inconsistency? I personally think sugar methods are fine - that's why I put them there - but if you're going to remove them, please be consistent.

          And the javadocs are still wrong:

          1. the persist param isn't listed on addFieldTypes() and addDynamicFields()
          2. the javadocs for all the methods say the methods will always persist. This is wrong.
          Show
          steve_rowe Steve Rowe added a comment - The bulk schema api should be added to TestCloudManagedSchemaConcurrent Why can't it be in a new class? It can, but you don't have any distributed tests at all. All existing schema API modification functionality gets distributed testing via TestCloudManagedSchemaConcurrent at this point. TestBulkSchemaAPI has lots of backslash-double-quotes in JSON strings That was deliberate. Most JSON writers use double quotes. You misunderstand me: SolrTestCaseJ4.json(String) converts single-quotes to double-quotes. Several Solr tests use it to get most-JSON-writer-compatible output without causing backslash-eye-bleed. I'm suggesting you do the same. In SchemaManager.doOperation(), you limit retries to MAX_TRIES I can make the variable bigger , but I somehow hate infinite loops It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening. SchemaManager.ADD_FIELD_TYPE should be "add-field-type" sure I also want to ask the preference of "add-field" Vs "addField" in the command names Thanks - I like "add-field" better, but either would be fine. SchemaManager has lots of non-generic collections will fix them I only see one modification here (you added <String> on the left hand side but forgot to put in the diamond operator on the right side), still lots of non-generic collection usages - is there a problem with fixing the others? Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false We need to remove those methods without persist option You removed some sugar methods, specifically for adding a single dynamic field, but left other sugar methods intact, e.g. addField() - why the inconsistency? I personally think sugar methods are fine - that's why I put them there - but if you're going to remove them, please be consistent. And the javadocs are still wrong: the persist param isn't listed on addFieldTypes() and addDynamicFields() the javadocs for all the methods say the methods will always persist. This is wrong.
          Hide
          steve_rowe Steve Rowe added a comment -

          FYI Noble Paul, I have create a review request on ReviewBoard to compare the previous patch I reviewed with your latest: https://reviews.apache.org/r/26037/

          Show
          steve_rowe Steve Rowe added a comment - FYI Noble Paul , I have create a review request on ReviewBoard to compare the previous patch I reviewed with your latest: https://reviews.apache.org/r/26037/
          Hide
          noble.paul Noble Paul added a comment - - edited

          It can, but you don't have any distributed tests at all.

          I should fix that. May be separately

          You removed some sugar methods,

          Yeah, the class was getting very hard to read with all the sugar methods and I prefer them removed

          You misunderstand me: SolrTestCaseJ4.json(String)

          ok , I got it . I'll fix it

          I'll fix the javadocs , in the next patch.

          It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening.

          I disagree with this idea. It is not necessary that every command has to succeed. Whatever sucseeds , will have to do so with the given time/resources. Otherwise , it should just fail

          Show
          noble.paul Noble Paul added a comment - - edited It can, but you don't have any distributed tests at all. I should fix that. May be separately You removed some sugar methods, Yeah, the class was getting very hard to read with all the sugar methods and I prefer them removed You misunderstand me: SolrTestCaseJ4.json(String) ok , I got it . I'll fix it I'll fix the javadocs , in the next patch. It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening. I disagree with this idea. It is not necessary that every command has to succeed. Whatever sucseeds , will have to do so with the given time/resources. Otherwise , it should just fail
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening.

          +1

          The closest analogy is something like AtomicLong.incrementAndGet()
          Those implementations try forever. It would be exceedingly rare for starvation for any amount of time... but if we say it can fail then clients need to implement their own try-until-success loop (which won't succeed any faster!)

          Show
          yseeley@gmail.com Yonik Seeley added a comment - It keeps retrying when another schema modification request gets in first. Please fix this - you didn't change anything at all here. There is no way we should be giving up here just because other modifications are happening. +1 The closest analogy is something like AtomicLong.incrementAndGet() Those implementations try forever. It would be exceedingly rare for starvation for any amount of time... but if we say it can fail then clients need to implement their own try-until-success loop (which won't succeed any faster!)
          Hide
          noble.paul Noble Paul added a comment -

          Yonik Seeley

          Let us look at it this way. This command is issued most likely by a user over http. Normally people expect a command to succeed or fail within a few seconds. They would be more than glad to be presented with an error message than waiting forever. HTTP may timeout after some time too

          The closest analogy is something like AtomicLong.incrementAndGet()

          yes, they are similar . But, that is just doing a CAS inside a CPU which can do a zillion ops per second. Here we are talking about a distributed operation where each try can take a a few 100 ms.

          Show
          noble.paul Noble Paul added a comment - Yonik Seeley Let us look at it this way. This command is issued most likely by a user over http. Normally people expect a command to succeed or fail within a few seconds. They would be more than glad to be presented with an error message than waiting forever. HTTP may timeout after some time too The closest analogy is something like AtomicLong.incrementAndGet() yes, they are similar . But, that is just doing a CAS inside a CPU which can do a zillion ops per second. Here we are talking about a distributed operation where each try can take a a few 100 ms.
          Hide
          erickerickson Erick Erickson added a comment -

          This is complete tangential, but I wanted to get other's thinking about it....

          A while ago, Stefan and I tried to allow schema.xml to be edited from the admin UI.
          See: https://issues.apache.org/jira/browse/SOLR-5287

          Uwe pointed out that writing arbitrary XML to a server is a security problem so we pulled things out. It's actually in limbo in trunk currently marked as a blocker.

          Is there any way the managed schema functionality could be warped in the Admin UI to allow editing for the schema file? I'm forever wishing that I could do that.... I suppose it would require that the managed schema is used though...

          Anyway, feel free to ignore this entirely or open a new JIRA if it sparks some ideas.

          Show
          erickerickson Erick Erickson added a comment - This is complete tangential, but I wanted to get other's thinking about it.... A while ago, Stefan and I tried to allow schema.xml to be edited from the admin UI. See: https://issues.apache.org/jira/browse/SOLR-5287 Uwe pointed out that writing arbitrary XML to a server is a security problem so we pulled things out. It's actually in limbo in trunk currently marked as a blocker. Is there any way the managed schema functionality could be warped in the Admin UI to allow editing for the schema file? I'm forever wishing that I could do that.... I suppose it would require that the managed schema is used though... Anyway, feel free to ignore this entirely or open a new JIRA if it sparks some ideas.
          Hide
          noble.paul Noble Paul added a comment -

          A while ago, Stefan and I tried to allow schema.xml to be edited from the admin UI.

          I want to move to a system where the users are not aware of schema.xml . It should be an internal detail. Exactly the same way I deal with my RDBMS/Cassandra (or whatever) . The way a user thinks of my RDBMS is as follows

          • Startup the server first
          • use A DDL to create a schema
          • During the lifecycle of the system I use more DDL to add/remove fields
          • I use a command like "describe table' to know the current schema. (We have a REST API )
          • I don't really care about how the server stores the schema/config or whatever

          To achieve tis goal , we must stop thinking about the system in terms of xmls and we should start thinking about the APIs as the DDL for Solr.

          The term 'managed schema' will have no relevance. schema will always be 'managed' and the the adjective 'managed' must go away

          Show
          noble.paul Noble Paul added a comment - A while ago, Stefan and I tried to allow schema.xml to be edited from the admin UI. I want to move to a system where the users are not aware of schema.xml . It should be an internal detail. Exactly the same way I deal with my RDBMS/Cassandra (or whatever) . The way a user thinks of my RDBMS is as follows Startup the server first use A DDL to create a schema During the lifecycle of the system I use more DDL to add/remove fields I use a command like "describe table' to know the current schema. (We have a REST API ) I don't really care about how the server stores the schema/config or whatever To achieve tis goal , we must stop thinking about the system in terms of xmls and we should start thinking about the APIs as the DDL for Solr. The term 'managed schema' will have no relevance. schema will always be 'managed' and the the adjective 'managed' must go away
          Hide
          erickerickson Erick Erickson added a comment -

          Works for me. The only thing I'll strongly weigh in on here is that the admin UI must be able to access at least the "describe table" functionality, and ideally the ability to change the schema definition from the admin UI. Ditto with accessing the rest of the configuration information. There are just waaay too many situations "in the field" where actually seeing what the server is working with (as opposed to what the ops person thinks they've configured) on a running instance is critical to troubleshooting.

          Currently this is all done via the admin UI and the "files" link, although you can't edit there of course.

          Not quite sure how all this applies to, say, solrconfig.xml though. I guess we'll see as things develop.

          Show
          erickerickson Erick Erickson added a comment - Works for me. The only thing I'll strongly weigh in on here is that the admin UI must be able to access at least the "describe table" functionality, and ideally the ability to change the schema definition from the admin UI. Ditto with accessing the rest of the configuration information. There are just waaay too many situations "in the field" where actually seeing what the server is working with (as opposed to what the ops person thinks they've configured) on a running instance is critical to troubleshooting. Currently this is all done via the admin UI and the "files" link, although you can't edit there of course. Not quite sure how all this applies to, say, solrconfig.xml though. I guess we'll see as things develop.
          Hide
          noble.paul Noble Paul added a comment -

          the "files link" is a problem. Instead we should just have a prominent way to access "schema" which should display what is the current list of fields , types etc.
          we need to eventually de-prioritize the raw file view.

          Show
          noble.paul Noble Paul added a comment - the "files link" is a problem. Instead we should just have a prominent way to access "schema" which should display what is the current list of fields , types etc. we need to eventually de-prioritize the raw file view.
          Hide
          noble.paul Noble Paul added a comment -

          The only outstanding issue AFAIK is the no:of reties. whether it should be finite or infinite. I believe all user interacting APIs should be coded to fail gracefully. If we can resolve that I can commit this

          Show
          noble.paul Noble Paul added a comment - The only outstanding issue AFAIK is the no:of reties. whether it should be finite or infinite. I believe all user interacting APIs should be coded to fail gracefully. If we can resolve that I can commit this
          Hide
          steve_rowe Steve Rowe added a comment -

          The only outstanding issue AFAIK is the no:of reties. whether it should be finite or infinite.

          (There are other outstanding issues - I'll list them in another comment after this one.)

          I still think continuous retrying when there are competing updates is the right thing to do.

          How about this: in SOLR-6249, Timothy Potter added request param updateTimeoutSecs to fail Schema API requests unless they succeed within the given time. We could add checking of this timeout to the update-retry loop if it's provided, but when it's not, we allow the update-retry loop to continue ad infinitum.

          In any case, the patch on this issue needs to be changed to make bulk Schema API requests aware of the new updateTimeoutSecs param and perform the same all-replicas-in-sync check that the other Schema API methods now have.

          Show
          steve_rowe Steve Rowe added a comment - The only outstanding issue AFAIK is the no:of reties. whether it should be finite or infinite. (There are other outstanding issues - I'll list them in another comment after this one.) I still think continuous retrying when there are competing updates is the right thing to do. How about this: in SOLR-6249 , Timothy Potter added request param updateTimeoutSecs to fail Schema API requests unless they succeed within the given time. We could add checking of this timeout to the update-retry loop if it's provided, but when it's not, we allow the update-retry loop to continue ad infinitum. In any case, the patch on this issue needs to be changed to make bulk Schema API requests aware of the new updateTimeoutSecs param and perform the same all-replicas-in-sync check that the other Schema API methods now have.
          Hide
          steve_rowe Steve Rowe added a comment -

          Previously mentioned outstanding issues:

          1. One of the tests should have a non-trivial fieldtype definition, like at least one analyzer.
          2. SchemaManager has lots of non-generic collections - I looked at a couple, and they could be generified - maybe they all could?
          3. Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false

          New things I noticed today:

          1. SchemaManager has zero javadocs. More would be good.
          2. AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations, e.g. this will fail because "newtype" won't exist when SchemaManager tries to add "newfield" (caveat: untested):
            {
              "add-field" : { "name":"newfield", "type":"newtype" }, 
              "add-field-type" : { "name":"newtype", "class":"solr.StrField" }
            }
            
          Show
          steve_rowe Steve Rowe added a comment - Previously mentioned outstanding issues: One of the tests should have a non-trivial fieldtype definition, like at least one analyzer. SchemaManager has lots of non-generic collections - I looked at a couple, and they could be generified - maybe they all could? Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false New things I noticed today: SchemaManager has zero javadocs. More would be good. AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations, e.g. this will fail because "newtype" won't exist when SchemaManager tries to add "newfield" (caveat: untested): { "add-field" : { "name":"newfield", "type":"newtype" }, "add-field-type" : { "name":"newtype", "class":"solr.StrField" } }
          Hide
          noble.paul Noble Paul added a comment - - edited

          AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations

          Yes it fails. Works as designed. This is exactly the same behavior you will see in an RDBMS as well. Order is important

          Show
          noble.paul Noble Paul added a comment - - edited AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations Yes it fails. Works as designed. This is exactly the same behavior you will see in an RDBMS as well. Order is important
          Hide
          noble.paul Noble Paul added a comment -

          One of the tests should have a non-trivial fieldtype definition, like at least one analyzer.

          It is not supported yet. Pls correct me if I am wrong. does the current REST API support it?

          Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false

          It says "* @param persist to persist or not" . Isn't it right?

          SchemaManager has zero javadocs. More would be good.

          It is not a class for others to use . But , it will be added

          Show
          noble.paul Noble Paul added a comment - One of the tests should have a non-trivial fieldtype definition, like at least one analyzer. It is not supported yet. Pls correct me if I am wrong. does the current REST API support it? Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false It says "* @param persist to persist or not" . Isn't it right? SchemaManager has zero javadocs. More would be good. It is not a class for others to use . But , it will be added
          Hide
          steve_rowe Steve Rowe added a comment -

          One of the tests should have a non-trivial fieldtype definition, like at least one analyzer.

          It is not supported yet. Pls correct me if I am wrong. does the current REST API support it?

          Yes. See TestManagedSchemaFieldTypeResource.

          Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false

          It says "* @param persist to persist or not" . Isn't it right?

          This is what I'm talking about: "Copies this schema, adds the given fields to the copy, then persists the new schema."

          SchemaManager has zero javadocs. More would be good.

          It is not a class for others to use . But , it will be added

          Thanks. Javadocs (or rather comments of any kind) are for maintainers too, not just users. Here's an example where javadocs/comments would help a maintainer: SchemaManager.Operation.getMetaWithout(). What does that thing do? (Hint: the name of the method doesn't tell you.)

          Show
          steve_rowe Steve Rowe added a comment - One of the tests should have a non-trivial fieldtype definition, like at least one analyzer. It is not supported yet. Pls correct me if I am wrong. does the current REST API support it? Yes. See TestManagedSchemaFieldTypeResource . Most of the add*() javadocs in IndexSchema say that persistence always happens, but it doesn't if persist=false It says "* @param persist to persist or not" . Isn't it right? This is what I'm talking about: "Copies this schema, adds the given fields to the copy, then persists the new schema." SchemaManager has zero javadocs. More would be good. It is not a class for others to use . But , it will be added Thanks. Javadocs (or rather comments of any kind) are for maintainers too, not just users. Here's an example where javadocs/comments would help a maintainer: SchemaManager.Operation.getMetaWithout() . What does that thing do? (Hint: the name of the method doesn't tell you.)
          Hide
          steve_rowe Steve Rowe added a comment -

          AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations

          Yes it fails. Works as designed. This is exactly the same behavior you will see in an RDBMS as well. Order is important

          Order is not important in schema.xml, and in plenty of other contexts. This order dependence will need to be explicitly documented.

          Show
          steve_rowe Steve Rowe added a comment - AFAICT, a bulk Schema API request with a new field using a new field type in the same request can fail depending on the order of the specified operations Yes it fails. Works as designed. This is exactly the same behavior you will see in an RDBMS as well. Order is important Order is not important in schema.xml , and in plenty of other contexts. This order dependence will need to be explicitly documented.
          Hide
          noble.paul Noble Paul added a comment -

          SOLR-6249 features applied.

          Added javadocs and added complex field types in tests

          Order is not important in schema.xml, and in plenty of other contexts. This order dependence will need to be explicitly documented.

          This will have to be documented in the API documentation in reference guide

          Show
          noble.paul Noble Paul added a comment - SOLR-6249 features applied. Added javadocs and added complex field types in tests Order is not important in schema.xml, and in plenty of other contexts. This order dependence will need to be explicitly documented. This will have to be documented in the API documentation in reference guide
          Hide
          steve_rowe Steve Rowe added a comment - - edited

          Thanks for the changes, Noble, those look fine - two new issues:

          (edit: System.getCurrentTimeMillis() -> currentTimeMillis())

          1. Didn't Mark Miller convert all timing stuff in Solr to use nanoTime() instead of currentTimeMillis()? If so, shouldn't we use nanoTime() here too? (This applies to Timothy Potter's SOLR-6249 work as well.)
          2. In SchemaManager.waitForOtherReplicasToUpdate(), called from doOperations(), you send -1 in as maxWaitSecs to ManagedIndexSchema.waitForSchemaZkVersionAgreement() when the timeout has been exceeded, but AFAICT negative values aren't handled appropriately there, e.g. it gets sent in unexamined to ExecutorService.invokeAll():
            private List<String> doOperations(List<Operation> operations){
              int timeout = req.getParams().getInt(BaseSolrResource.UPDATE_TIMEOUT_SECS, -1);
              long startTime = System.currentTimeMillis();
              [...]
                  managedIndexSchema.persistManagedSchema(false);
                  core.setLatestSchema(managedIndexSchema);
                  waitForOtherReplicasToUpdate(timeout, startTime);
                  [...]
            }
          
            private void waitForOtherReplicasToUpdate(int timeout, long startTime) {
              if(timeout > 0 && [...]){
                  [...]
                  ManagedIndexSchema.waitForSchemaZkVersionAgreement([...],
                      getTimeLeftInSecs(timeout, startTime));
                }
              }
            }
            private int getTimeLeftInSecs(int timeout, long startTime) {
              long timeLeftSecs = timeout -  ((System.currentTimeMillis() - startTime) /1000);
              return (int) (timeLeftSecs > 0 ?timeLeftSecs: -1);
            }
          
          Show
          steve_rowe Steve Rowe added a comment - - edited Thanks for the changes, Noble, those look fine - two new issues: ( edit : System.getCurrentTimeMillis() -> currentTimeMillis() ) Didn't Mark Miller convert all timing stuff in Solr to use nanoTime() instead of currentTimeMillis() ? If so, shouldn't we use nanoTime() here too? (This applies to Timothy Potter 's SOLR-6249 work as well.) In SchemaManager.waitForOtherReplicasToUpdate() , called from doOperations() , you send -1 in as maxWaitSecs to ManagedIndexSchema.waitForSchemaZkVersionAgreement() when the timeout has been exceeded, but AFAICT negative values aren't handled appropriately there, e.g. it gets sent in unexamined to ExecutorService.invokeAll() : private List< String > doOperations(List<Operation> operations){ int timeout = req.getParams().getInt(BaseSolrResource.UPDATE_TIMEOUT_SECS, -1); long startTime = System .currentTimeMillis(); [...] managedIndexSchema.persistManagedSchema( false ); core.setLatestSchema(managedIndexSchema); waitForOtherReplicasToUpdate(timeout, startTime); [...] } private void waitForOtherReplicasToUpdate( int timeout, long startTime) { if (timeout > 0 && [...]){ [...] ManagedIndexSchema.waitForSchemaZkVersionAgreement([...], getTimeLeftInSecs(timeout, startTime)); } } } private int getTimeLeftInSecs( int timeout, long startTime) { long timeLeftSecs = timeout - (( System .currentTimeMillis() - startTime) /1000); return ( int ) (timeLeftSecs > 0 ?timeLeftSecs: -1); }
          Hide
          noble.paul Noble Paul added a comment -

          actually System.currentTimeMillis() is fine if used in the same thread . And if it is used in different threads it can give wrong values. However , for uniformity I shall change it to System.nanoTime()

          Show
          noble.paul Noble Paul added a comment - actually System.currentTimeMillis() is fine if used in the same thread . And if it is used in different threads it can give wrong values. However , for uniformity I shall change it to System.nanoTime()
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1628734 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1628734 ]

          SOLR-6476

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1628734 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1628734 ] SOLR-6476
          Hide
          steve_rowe Steve Rowe added a comment -

          Noble, you apparently ignored this comment I made earlier when you committed - it's still a problem:

          In SchemaManager.waitForOtherReplicasToUpdate(), called from doOperations(), you send -1 in as maxWaitSecs to ManagedIndexSchema.waitForSchemaZkVersionAgreement() when the timeout has been exceeded, but AFAICT negative values aren't handled appropriately there, e.g. it gets sent in unexamined to ExecutorService.invokeAll():

          Two more issues:

          1. It's better form to use the conversion function for nanoseconds->milliseconds rather than having constant conversion factors - you have (starting at line #159 in SchemaManager):

          long timeLeftSecs1 = timeout -  ((System.nanoTime() - startTime) /1000000);
          [...]
          long timeLeftSecs = timeout -  ((System.nanoTime() - startTime) /1000000);
          

          But everywhere else in Solr, e.g. line #178 in HttpShardHandler:

          ssr.elapsedTime = TimeUnit.MILLISECONDS.convert(System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
          

          2. Aren't you going to backport this to branch_5x?

          Show
          steve_rowe Steve Rowe added a comment - Noble, you apparently ignored this comment I made earlier when you committed - it's still a problem: In SchemaManager.waitForOtherReplicasToUpdate() , called from doOperations() , you send -1 in as maxWaitSecs to ManagedIndexSchema.waitForSchemaZkVersionAgreement() when the timeout has been exceeded, but AFAICT negative values aren't handled appropriately there, e.g. it gets sent in unexamined to ExecutorService.invokeAll() : Two more issues: 1. It's better form to use the conversion function for nanoseconds->milliseconds rather than having constant conversion factors - you have (starting at line #159 in SchemaManager ): long timeLeftSecs1 = timeout - (( System .nanoTime() - startTime) /1000000); [...] long timeLeftSecs = timeout - (( System .nanoTime() - startTime) /1000000); But everywhere else in Solr, e.g. line #178 in HttpShardHandler : ssr.elapsedTime = TimeUnit.MILLISECONDS.convert( System .nanoTime() - startTime, TimeUnit.NANOSECONDS); 2. Aren't you going to backport this to branch_5x?
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1628747 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1628747 ]

          SOLR-6476

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1628747 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1628747 ] SOLR-6476
          Hide
          noble.paul Noble Paul added a comment - - edited

          Aren't you going to backport this to branch_5x?

          yes but I am waiting for Timothy Potter to port the changes from SOLR-6249 tobe ported first

          Show
          noble.paul Noble Paul added a comment - - edited Aren't you going to backport this to branch_5x? yes but I am waiting for Timothy Potter to port the changes from SOLR-6249 tobe ported first
          Hide
          steve_rowe Steve Rowe added a comment -

          Thanks Noble, your last commit fixed the two issues I mentioned. +1 to resolve once you've backported.

          Show
          steve_rowe Steve Rowe added a comment - Thanks Noble, your last commit fixed the two issues I mentioned. +1 to resolve once you've backported.
          Hide
          andyetitmoves Ramkumar Aiyengar added a comment -

          actually System.currentTimeMillis() is fine if used in the same thread . And if it is used in different threads it can give wrong values.

          It's nothing to do with the number of threads. currentTimeMillis uses the wall time and is not guaranteed to be monotonic. So if the sysadmin or ntp for example changes time (or horror, if your system time uses local time and you cross a DST transition), the difference in value between two measurements is not guaranteed to reflect the actual duration of time. So in this case for example you might end up violating the timeout altogether. nanoTime essentially exposes a counter which keeps increasing, so while it has no bearing on the system time (so a value by itself is meaningless), differences are guaranteed to be accurate. (Well, as long as the platform supports it, which is almost everywhere except for some random old versions of Windows).

          Show
          andyetitmoves Ramkumar Aiyengar added a comment - actually System.currentTimeMillis() is fine if used in the same thread . And if it is used in different threads it can give wrong values. It's nothing to do with the number of threads. currentTimeMillis uses the wall time and is not guaranteed to be monotonic. So if the sysadmin or ntp for example changes time (or horror, if your system time uses local time and you cross a DST transition), the difference in value between two measurements is not guaranteed to reflect the actual duration of time. So in this case for example you might end up violating the timeout altogether. nanoTime essentially exposes a counter which keeps increasing, so while it has no bearing on the system time (so a value by itself is meaningless), differences are guaranteed to be accurate. (Well, as long as the platform supports it, which is almost everywhere except for some random old versions of Windows).
          Hide
          markrmiller@gmail.com Mark Miller added a comment -

          +1. If measuring elapsed time, please use nanoTime and not currentTimeMillis.

          Show
          markrmiller@gmail.com Mark Miller added a comment - +1. If measuring elapsed time, please use nanoTime and not currentTimeMillis.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1629301 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1629301 ]

          SOLR-6476

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1629301 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1629301 ] SOLR-6476
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1631397 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1631397 ]

          SOLR-6476 refactored to get the Operation class outside

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1631397 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1631397 ] SOLR-6476 refactored to get the Operation class outside
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1631400 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1631400 ]

          SOLR-6476 refactored to get the Operation class outside

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1631400 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1631400 ] SOLR-6476 refactored to get the Operation class outside
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1631420 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1631420 ]

          SOLR-6476 change the error message

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1631420 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1631420 ] SOLR-6476 change the error message
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1631421 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1631421 ]

          SOLR-6476 change the error message

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1631421 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1631421 ] SOLR-6476 change the error message
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          People hate repeated keys in JSON... we should avoid adding more of that if possible.
          Instead of:

          {
            "add-field": {...},  // field1
            "add-field":{...}    // field2
          }
          

          How about:

          {
            "add-field": [ 
               {...},    // field1
               {...}     // field2
             ]
          }
          

          And you can continue to still offer a simple form for a single field:

          {
            "add-field": {...}    // field1
          }
          
          Show
          yseeley@gmail.com Yonik Seeley added a comment - People hate repeated keys in JSON... we should avoid adding more of that if possible. Instead of: { "add-field" : {...}, // field1 "add-field" :{...} // field2 } How about: { "add-field" : [ {...}, // field1 {...} // field2 ] } And you can continue to still offer a simple form for a single field: { "add-field" : {...} // field1 }
          Hide
          noble.paul Noble Paul added a comment - - edited

          actually both formats are supported.I will ensure that the documentation carries both examples

          But it is useful when order is important.

          For the /update path it was a not a great choice because they are usually written down by libraries which don't know how to write it . schema is mostly written by hand. and humans are fine writing the repeated stuff if it helps readability

          Show
          noble.paul Noble Paul added a comment - - edited actually both formats are supported.I will ensure that the documentation carries both examples But it is useful when order is important. For the /update path it was a not a great choice because they are usually written down by libraries which don't know how to write it . schema is mostly written by hand. and humans are fine writing the repeated stuff if it helps readability
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1632525 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1632525 ]

          SOLR-6476 error message fixed

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1632525 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1632525 ] SOLR-6476 error message fixed
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1632526 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1632526 ]

          SOLR-6476 error message fixed

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1632526 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1632526 ] SOLR-6476 error message fixed
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1642641 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1642641 ]

          SOLR-6476 refactored bulk schema APIs and other read REST APIs to use standard RequestHandler mechanism

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1642641 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1642641 ] SOLR-6476 refactored bulk schema APIs and other read REST APIs to use standard RequestHandler mechanism
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1642660 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1642660 ]

          SOLR-6476 removing othe vestiges of config REST APIs

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1642660 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1642660 ] SOLR-6476 removing othe vestiges of config REST APIs
          Hide
          markrmiller@gmail.com Mark Miller added a comment -

          This issue seems to have broken the trunk build.

          Show
          markrmiller@gmail.com Mark Miller added a comment - This issue seems to have broken the trunk build.
          Hide
          markrmiller@gmail.com Mark Miller added a comment -

          hmm...perhaps I needed a clean - JettyWebappTest was failing, but it passes for me on a new checkout.

          Show
          markrmiller@gmail.com Mark Miller added a comment - hmm...perhaps I needed a clean - JettyWebappTest was failing, but it passes for me on a new checkout.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1642705 from Noble Paul in branch 'dev/trunk'
          [ https://svn.apache.org/r1642705 ]

          SOLR-6476 reference to a deleted servlet cleaned up

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1642705 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1642705 ] SOLR-6476 reference to a deleted servlet cleaned up
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1642862 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1642862 ]

          SOLR-6476 refactored bulk schema APIs and other read REST APIs to use standard RequestHandler mechanism

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1642862 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1642862 ] SOLR-6476 refactored bulk schema APIs and other read REST APIs to use standard RequestHandler mechanism
          Hide
          anshumg Anshum Gupta added a comment -

          Bulk close after 5.0 release.

          Show
          anshumg Anshum Gupta added a comment - Bulk close after 5.0 release.

            People

            • Assignee:
              noble.paul Noble Paul
              Reporter:
              noble.paul Noble Paul
            • Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development