Solr / SOLR-4898 Flesh out the Schema REST API / SOLR-6141

Schema API: Remove fields, dynamic fields, field types and copy fields; and replace fields, dynamic fields and field types

    Details

      Description

      It should be possible, via the bulk schema API, to remove and replace the following:

      1. fields
      2. dynamic fields
      3. field types
      4. copy field directives (note: replacement is not applicable to copy fields)

      Removing schema elements that are referred to elsewhere in the schema must be guarded against:

      1. Removing a field type should be disallowed when there are fields or dynamic fields of that type.
      2. Removing a field should be disallowed when there are copy field directives that use the field as source or destination.
      3. Removing a dynamic field should be disallowed when it is the only possible match for a copy field source or destination.
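
      The first two guards above could be sketched with a small in-memory model. This is a hypothetical illustration, not Solr's implementation: the class SchemaGuardSketch and all of its method names are invented here, and dynamic fields (guard 3) are left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model (not Solr's IndexSchema): just enough state to show the guards.
class SchemaGuardSketch {
    private final Set<String> fieldTypes = new LinkedHashSet<>();
    // field name -> field type name
    private final Map<String, String> fields = new LinkedHashMap<>();
    // copy field directives stored as {source, dest} pairs
    private final List<String[]> copyFields = new ArrayList<>();

    void addFieldType(String name) {
        fieldTypes.add(name);
    }

    void addField(String name, String type) {
        if (!fieldTypes.contains(type)) {
            throw new IllegalArgumentException("Unknown field type: " + type);
        }
        fields.put(name, type);
    }

    void addCopyField(String source, String dest) {
        if (!fields.containsKey(source) || !fields.containsKey(dest)) {
            throw new IllegalArgumentException("Unknown source or destination field");
        }
        copyFields.add(new String[] {source, dest});
    }

    // Guard 1: a field type cannot be removed while any field still uses it.
    void deleteFieldType(String name) {
        if (fields.containsValue(name)) {
            throw new IllegalStateException("Field type still in use: " + name);
        }
        fieldTypes.remove(name);
    }

    // Guard 2: a field cannot be removed while a copy field directive refers to it.
    void deleteField(String name) {
        for (String[] cf : copyFields) {
            if (cf[0].equals(name) || cf[1].equals(name)) {
                throw new IllegalStateException("Field is used by a copy field directive: " + name);
            }
        }
        fields.remove(name);
    }

    void deleteCopyField(String source, String dest) {
        copyFields.removeIf(cf -> cf[0].equals(source) && cf[1].equals(dest));
    }

    boolean hasField(String name) {
        return fields.containsKey(name);
    }

    boolean hasFieldType(String name) {
        return fieldTypes.contains(name);
    }
}
```

      In this model, removal only succeeds in dependency order: copy field directives first, then fields, then field types.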

      Attachments

      1. SOLR-6141.patch (111 kB, Steve Rowe)
      2. SOLR-6141.patch (106 kB, Steve Rowe)
      3. SOLR-6141-fix-TestBulkSchemaConcurrent.patch (4 kB, Steve Rowe)

          Activity

          Noble Paul added a comment -

          Do we plan to provide the same functionality in the bulk mode as well as the REST API?

          Steve Rowe added a comment -

          Patch.

          The following commands are added to the bulk schema API:

          1. delete-field
          2. delete-dynamic-field
          3. delete-field-type
          4. delete-copy-field
          5. replace-field
          6. replace-dynamic-field
          7. replace-field-type

          I think this is ready.
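
          As a rough illustration of how several of these commands might be combined in one bulk request body, here is a minimal sketch. The command names come from the list above; the helper class, the field names, and the JSON property shapes ("name", "source", "dest", "type", "stored") are assumptions made for this example, modelled on the existing add commands, and the sketch only builds a string rather than talking to a server.

```java
// Hypothetical helper that assembles a bulk schema request body as a JSON string.
class BulkSchemaPayloadSketch {
    // Renders one command as a JSON member: "command-name":{...}
    static String command(String name, String jsonBody) {
        return "\"" + name + "\":" + jsonBody;
    }

    // Wraps the given members in a single JSON object, preserving their order.
    static String bulk(String... commands) {
        return "{" + String.join(",", commands) + "}";
    }

    public static void main(String[] args) {
        // Assumed field names; the copy field directive is deleted before the
        // field it refers to, so the deletion is not rejected.
        String payload = bulk(
            command("delete-copy-field", "{\"source\":\"title\",\"dest\":\"title_copy\"}"),
            command("delete-field", "{\"name\":\"title_copy\"}"),
            command("replace-field", "{\"name\":\"title\",\"type\":\"text_general\",\"stored\":true}"));
        System.out.println(payload);
    }
}
```

          The ordering in the example assumes the commands in a bulk request are applied in the order given.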

          Steve Rowe added a comment -

          "Do we plan to provide the same functionality in the bulk mode as well as the REST API?"

          Noble Paul, I only implemented these changes in bulk mode, not in the REST API.

          Steve Rowe added a comment -

          FYI, replacement (as opposed to just using separate delete and add commands) is necessary when other schema elements refer to an element to be replaced - e.g. if you try to delete a field type that is used by existing fields or dynamic fields, you'll get errors.

          At first I thought this could be special-cased - we could relax checking when removing then re-adding a schema element (though figuring out that this is happening would be an interesting exercise given the bulk command structure), but that wouldn't clean up the data structure bindings (fields and dynamic fields -> field types; and copy fields -> fields and dynamic fields).

          So in the implementation in the patch, when replacing a schema element, bindings are found and refreshed.
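
          The idea of refreshing bindings on replacement can be sketched as follows. This is a simplified, hypothetical model rather than the code in the patch: FieldType and Field here are invented stand-ins, and only the field -> field type binding is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: fields hold a direct reference to their field type object,
// so swapping the type in the registry alone would leave fields bound to the old one.
class ReplaceBindingSketch {
    static final class FieldType {
        final String name;
        final String className;

        FieldType(String name, String className) {
            this.name = name;
            this.className = className;
        }
    }

    static final class Field {
        final String name;
        final FieldType type;

        Field(String name, FieldType type) {
            this.name = name;
            this.type = type;
        }
    }

    final Map<String, FieldType> fieldTypes = new LinkedHashMap<>();
    final Map<String, Field> fields = new LinkedHashMap<>();

    void replaceFieldType(FieldType replacement) {
        if (!fieldTypes.containsKey(replacement.name)) {
            throw new IllegalArgumentException("No such field type: " + replacement.name);
        }
        fieldTypes.put(replacement.name, replacement);
        // Refresh bindings: rebuild every field that was bound to the replaced type.
        for (Map.Entry<String, Field> e : fields.entrySet()) {
            if (e.getValue().type.name.equals(replacement.name)) {
                e.setValue(new Field(e.getKey(), replacement));
            }
        }
    }
}
```

          A delete followed by an add would either be rejected by the in-use check or, if the check were relaxed, leave the fields pointing at the stale type object; a dedicated replace operation can do both steps and the re-binding together.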

          Steve Rowe added a comment -

          This version of the patch modifies ZkIndexSchemaReader.updateSchema() to fully parse the remote changed schema rather than merging the local copy with the remote copy - now that the schema is (almost) fully addressable with the schema API, we can't reliably do such merges.

          Steve Rowe added a comment -

          I'll commit this to trunk now and let it bake for a day or two before backporting to branch_5x.

          ASF subversion and git services added a comment -

          Commit 1667175 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1667175 ]

          SOLR-6141: Schema API: Remove fields, dynamic fields, field types and copy fields; and replace fields, dynamic fields and field types

          Steve Rowe added a comment -

          Shalin Shekhar Mangar alerted me to TestCloudSchemaless failures he was seeing 25-30% of the time on trunk. I was able to reproduce them, and there is a Policeman Jenkins failure here: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/11998/.

          There is also a new TestBulkSchemaConcurrent failure: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2061/.

          I backed out the above-described change to fully parse the remote changed schema in ZkIndexSchemaReader, and with it backed out I couldn't get TestCloudSchemaless to fail. I think the failures are due to a change in the schema update lock: in the old code, the schema update lock is shared by the old and new schema, but in the new code, I created a new lock with the new schema.

          I see the same pattern in the bulk schema api, at SchemaManager.getFreshManagedSchema(), so I suspect that this is the source of both test failures.

          I'll switch both to sharing the schema update lock with the old schema and beast them.
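
          Why the lock has to be carried over can be sketched like this. It is a hypothetical, stripped-down model and not the Solr classes: SchemaSnapshot stands in for an immutable schema instance, and getSchemaUpdateLock() mirrors the accessor name used in Solr only for readability.

```java
// Hypothetical model: each refresh produces a new immutable snapshot, but every
// snapshot shares the same lock object, so writers synchronizing on "the schema's
// lock" still exclude each other across versions.
class SchemaLockSketch {
    static final class SchemaSnapshot {
        final int version;
        private final Object schemaUpdateLock;

        SchemaSnapshot(int version, Object schemaUpdateLock) {
            this.version = version;
            this.schemaUpdateLock = schemaUpdateLock;
        }

        Object getSchemaUpdateLock() {
            return schemaUpdateLock;
        }
    }

    private volatile SchemaSnapshot current = new SchemaSnapshot(0, new Object());

    SchemaSnapshot current() {
        return current;
    }

    SchemaSnapshot refresh() {
        synchronized (current.getSchemaUpdateLock()) {
            // Re-read under the lock so the version is derived from the latest snapshot.
            SchemaSnapshot latest = current;
            // Passing the old lock along is the point of the sketch; creating a fresh
            // lock object here would let a thread holding the old snapshot's lock and
            // a thread holding the new one update concurrently.
            SchemaSnapshot next = new SchemaSnapshot(latest.version + 1, latest.getSchemaUpdateLock());
            current = next;
            return next;
        }
    }
}
```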

          Steve Rowe added a comment -

          Each of the two tests passed 25 iterations of beasting with this patch:

          Index: solr/core/src/java/org/apache/solr/schema/SchemaManager.java
          ===================================================================
          --- solr/core/src/java/org/apache/solr/schema/SchemaManager.java	(revision 1667433)
          +++ solr/core/src/java/org/apache/solr/schema/SchemaManager.java	(working copy)
          @@ -421,11 +421,9 @@
                 if (in instanceof ZkSolrResourceLoader.ZkByteArrayInputStream) {
                   int version = ((ZkSolrResourceLoader.ZkByteArrayInputStream) in).getStat().getVersion();
                   log.info("managed schema loaded . version : {} ", version);
          -        return new ManagedIndexSchema(req.getCore().getSolrConfig(),
          -            req.getSchema().getResourceName() ,new InputSource(in),
          -            true,
          -            req.getSchema().getResourceName(),
          -            version,new Object());
          +        return new ManagedIndexSchema
          +            (req.getCore().getSolrConfig(), req.getSchema().getResourceName(), new InputSource(in), 
          +                true, req.getSchema().getResourceName(), version, req.getSchema().getSchemaUpdateLock());
                 } else {
                   return (ManagedIndexSchema) req.getCore().getLatestSchema();
                 }
          Index: solr/core/src/java/org/apache/solr/schema/ZkIndexSchemaReader.java
          ===================================================================
          --- solr/core/src/java/org/apache/solr/schema/ZkIndexSchemaReader.java	(revision 1667433)
          +++ solr/core/src/java/org/apache/solr/schema/ZkIndexSchemaReader.java	(working copy)
          @@ -108,8 +108,8 @@
                     InputSource inputSource = new InputSource(new ByteArrayInputStream(data));
                     String resourceName = managedIndexSchemaFactory.getManagedSchemaResourceName();
                     ManagedIndexSchema newSchema = new ManagedIndexSchema
          -              (managedIndexSchemaFactory.getConfig(), resourceName, inputSource,
          -                  managedIndexSchemaFactory.isMutable(), resourceName, stat.getVersion(), new Object());
          +              (managedIndexSchemaFactory.getConfig(), resourceName, inputSource, managedIndexSchemaFactory.isMutable(), 
          +                  resourceName, stat.getVersion(), oldSchema.getSchemaUpdateLock());
                     managedIndexSchemaFactory.setSchema(newSchema);
                     long stop = System.nanoTime();
                     log.info("Finished refreshing schema in " + TimeUnit.MILLISECONDS.convert(stop - start, TimeUnit.NANOSECONDS) + " ms");
          

          Committing shortly.

          ASF subversion and git services added a comment -

          Commit 1667579 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1667579 ]

          SOLR-6141: fix schema update lock usage

          Steve Rowe added a comment -

          I haven't seen any more Jenkins failures. Backporting to branch_5x shortly.

          Steve Rowe added a comment -

          I haven't seen any more Jenkins failures. Backporting to branch_5x shortly.

          I spoke too soon - both tests are still failing:

          TestBulkSchemaConcurrent:
          http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2076/
          http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2084/

          TestCloudSchemaless:
          http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2039/

          I'm investigating.

          Steve Rowe added a comment -

          I spoke too soon - both tests are still failing:
          TestBulkSchemaConcurrent:
          http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2076/
          http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2084/

          The attached patch fixes this test. The failures were a combination of incorrect test sequencing (attempting to delete a field and a dynamic field before deleting a copy field directive that refers to them); a couple of unwarranted attempts to access non-existent entries in copyFieldTargetCounts; and field deletion not being rejected when a dynamic copy field directive has the field as its source.

          I've beasted this test with 50 iterations, no failures. Committing shortly.
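
          The copyFieldTargetCounts part of the fix amounts to only decrementing a count that actually exists. A minimal sketch of that pattern follows, assuming a map keyed by destination field name (a simplification made for this example) and removing the entry once the count reaches zero (a choice of this sketch, not something stated in this issue).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: destination field name -> number of copy field
// directives that target it.
class CopyFieldTargetCountsSketch {
    final Map<String, Integer> copyFieldTargetCounts = new HashMap<>();

    void increment(String dest) {
        copyFieldTargetCounts.merge(dest, 1, Integer::sum);
    }

    void decrement(String dest) {
        // computeIfPresent does nothing for absent keys, so there is no unboxing of
        // a null count; returning null from the function removes the entry.
        copyFieldTargetCounts.computeIfPresent(dest, (key, count) -> count > 1 ? count - 1 : null);
    }
}
```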

          TestCloudSchemaless:
          http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2039/

          I can't get this one to reproduce.

          ASF subversion and git services added a comment -

          Commit 1669055 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1669055 ]

          SOLR-6141: fix TestBulkSchemaConcurrent; fix field deletion to fail when a dynamic copy field directive has the field as its source; don't attempt to decrement a SchemaField's count in copyFieldTargetCounts if it's not present in the map.

          ASF subversion and git services added a comment -

          Commit 1669173 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1669173 ]

          SOLR-6141: fix TestBulkSchemaAPI (expected exception message changed)

          Steve Rowe added a comment -

          I plan on backporting to branch_5x today.

          ASF subversion and git services added a comment -

          Commit 1669413 from Steve Rowe in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1669413 ]

          SOLR-6141: Schema API: Remove fields, dynamic fields, field types and copy fields; and replace fields, dynamic fields and field types (merged trunk r1667175,r1667579,r1669055,r1669173)

          Steve Rowe added a comment -

          Committed to trunk and branch_5x.

          Timothy Potter added a comment -

          Bulk close after 5.1 release


            People

            • Assignee: Steve Rowe
            • Reporter: Christoph Strobl
            • Votes: 3
            • Watchers: 6