SOLR-4650

copyField doesn't work with source globs that don't match dynamic fields

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2
    • Fix Version/s: 4.3, 6.0
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      We have a schema that is currently on Solr 4.0 and supports language-specific stemming for content by use of dynamic fields and copyFields.

      Sample of schema:

         <field name="headline" type="text_general" indexed="true" stored="true" required="false" omitNorms="true"/>
         <field name="body" type="text_general" indexed="true" stored="false" required="false" omitNorms="true"/>
      
         <dynamicField name="*_en" type="text_en" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
         <dynamicField name="*_ja" type="text_ja" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
         <dynamicField name="*_fr" type="text_fr" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
         <dynamicField name="*_de" type="text_de" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
         <dynamicField name="*_es" type="text_es" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
         <dynamicField name="*_pt" type="text_pt" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
            ...
      
         <copyField source="headline_*" dest="headline"/>
         <copyField source="body_*" dest="body"/>
      

      The aim is to store language-specific (stemmed) text in the headline_en, body_en, ... fields and then generic versions (no stemming) in headline and body. This works fine in 4.0 and 4.1, but Solr now fails to start in 4.2:

      SEVERE: Unable to create core: collection1
      org.apache.solr.common.SolrException: copyField source :'headline_*' is not an explicit field and doesn't match a dynamicField.
              at org.apache.solr.schema.IndexSchema.registerCopyField(IndexSchema.java:688)
      

      Shouldn't this still work?

          Activity

          Steve Rowe added a comment -

          Hi Daniel,

          This issue was fixed in SOLR-4567, and will be included in the soon-to-be-released Solr v4.2.1.

          Steve Rowe added a comment -

          Hmm, closed as duplicate too fast - this is not the same issue.

          Reopening to discuss.

          Daniel Collins added a comment -

          Seems to be related to the comment

          [branch_4x commit] Steven Rowe
          http://svn.apache.org/viewvc?view=revision&revision=1453162

          Might be to do with the fix for SOLR-3798?

          Steve Rowe added a comment -

          SOLR-4567 is a similar issue: copyField glob sources matching explicit fields stopped working

          Steve Rowe added a comment - - edited

          Might be to do with the fix for SOLR-3798?

          Yes, definitely, I made changes to the way copyField worked there in order to support previously unsupported but valid uses. Unfortunately, I introduced bugs by doing so...

          Daniel Collins, the workaround here is the same as the workaround mentioned in SOLR-4567 description: simply enumerate all sources you want to copy, e.g.:

          <copyField source="headline_en" dest="headline"/>
          <copyField source="headline_ja" dest="headline"/>
          ...
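
          For schemas with many language suffixes, the enumerated workaround can be generated rather than typed by hand. The following is a minimal sketch, not part of Solr: the helper name `enumerate_copyfields` is hypothetical, and the suffix list covers only the dynamic fields shown in the sample schema (the elided entries are not guessed).

```python
from xml.sax.saxutils import quoteattr


def enumerate_copyfields(base, suffixes):
    """Return one <copyField> line per suffix, copying base_suffix into base.

    Hypothetical helper for illustration; it only builds text and does not
    talk to Solr.
    """
    lines = []
    for suffix in suffixes:
        source = f"{base}_{suffix}"
        # quoteattr adds the surrounding quotes and escapes XML-special characters.
        lines.append(f"<copyField source={quoteattr(source)} dest={quoteattr(base)}/>")
    return lines


# Suffixes taken from the dynamicField declarations visible in the sample schema.
SUFFIXES = ["en", "ja", "fr", "de", "es", "pt"]

if __name__ == "__main__":
    for field in ("headline", "body"):
        for line in enumerate_copyfields(field, SUFFIXES):
            print(line)
```

          Pasting the printed lines in place of the two glob copyField declarations gives the explicit form described in the workaround.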
          
          Daniel Collins added a comment -

          Yes, I hadn't seen SOLR-4567, but not sure if your fix for that will be good enough here.

          It depends how the pattern matching is done, but as it stands headline_* won't match any static field but it could generate a match with the dynamic field *_en (and in our case it does)? But it is non-trivial to work that out, since the wildcards don't make for easy comparison (one at the start, one at the end).

          I think this is more than the "subset pattern" as defined in SOLR-4567, but I can't see any other way to do what we want (and it used to work!)
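
          To make the comparison concrete, here is a minimal sketch of how two globs could be tested for a shared match. It assumes every glob has exactly one asterisk, at the very start or the very end, as in the sample schema; the function names are hypothetical and this is not how Solr itself performs the check.

```python
def parse_glob(glob):
    """Split a single-asterisk glob into (kind, literal).

    Assumption: exactly one asterisk, either leading or trailing.
    """
    if glob.count("*") != 1:
        raise ValueError(f"expected exactly one asterisk: {glob!r}")
    if glob.endswith("*"):
        return ("prefix", glob[:-1])
    if glob.startswith("*"):
        return ("suffix", glob[1:])
    raise ValueError(f"asterisk must be at the start or end: {glob!r}")


def globs_can_overlap(a, b):
    """Return True if some field name could match both globs."""
    kind_a, lit_a = parse_glob(a)
    kind_b, lit_b = parse_glob(b)
    if kind_a != kind_b:
        # A prefix glob and a suffix glob always share a match:
        # the prefix literal followed by the suffix literal.
        return True
    if kind_a == "prefix":
        return lit_a.startswith(lit_b) or lit_b.startswith(lit_a)
    return lit_a.endswith(lit_b) or lit_b.endswith(lit_a)
```

          Under these assumptions, `globs_can_overlap("headline_*", "*_en")` is true because `headline_en` satisfies both patterns, while `globs_can_overlap("headline_*", "body_*")` is false. Globs with asterisks in other positions would need a more general comparison.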

          Steve Rowe added a comment -

          Yes, I hadn't seen SOLR-4567, but not sure if your fix for that will be good enough here.

          It depends how the pattern matching is done, but as it stands headline_* won't match any static field but it could generate a match with the dynamic field *_en (and in our case it does)? But it is non-trivial to work that out, since the wildcards don't make for easy comparison (one at the start, one at the end).

          Yeah, this is definitely a different case from SOLR-4567. AFAICT, there is no way to align the copyField source in your scenario to declared dynamic or explicit fields, and changes I introduced in SOLR-3798 and SOLR-4567 assume that non-aligned copyField sources are errors.

          I think this is more than the "subset pattern" as defined in SOLR-4567, but I can't see any other way to do what we want (and it used to work!)

          Yup, it's definitely a regression. See above for a workaround.

          Daniel Collins added a comment -

          Cool, workaround is fine for now, at least we can see what's new in 4.2 now.

          Steve Rowe added a comment -

          The patch relaxes copyField source validation to allow any valid glob, including those that don't match any explicit or dynamic fields, and adds a test for this case.

          Retains the validation check that no-asterisk copyField sources must match either an explicit or a dynamic field, and adds a test for this case.

          Adds tests for copyField source and dest glob validity.

          Committing shortly.
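
          The validation rule described in this comment can be sketched roughly as follows. This is an illustration in Python, not the patch itself: the function names are hypothetical, and the definition of a "valid glob" (exactly one asterisk, leading or trailing) is an assumption, since the comment does not spell it out.

```python
def is_valid_glob(name):
    """Assumed rule: exactly one asterisk, at the very start or very end."""
    return name.count("*") == 1 and (name.startswith("*") or name.endswith("*"))


def matches_glob(name, glob):
    """Match a concrete field name against a valid single-asterisk glob."""
    if glob.endswith("*"):
        return name.startswith(glob[:-1])
    return name.endswith(glob[1:])


def validate_copyfield_source(source, explicit_fields, dynamic_globs):
    """Return True if the source is acceptable, otherwise raise ValueError."""
    if "*" in source:
        # Relaxed check: any valid glob is accepted, even if it matches
        # no explicit or dynamic field at schema load time.
        if not is_valid_glob(source):
            raise ValueError(f"copyField source {source!r} is not a valid glob")
        return True
    # Retained check: a source without an asterisk must match something.
    if source in explicit_fields:
        return True
    if any(matches_glob(source, glob) for glob in dynamic_globs):
        return True
    raise ValueError(
        f"copyField source {source!r} is not an explicit field "
        "and doesn't match a dynamicField"
    )
```

          With the explicit fields `headline` and `body` and the dynamic globs `*_en` and `*_ja`, this sketch accepts `headline_*`, `headline_en` and `headline`, and rejects an undeclared name such as `title`.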

          Steve Rowe added a comment -

          Committed to trunk and branch_4x.

          Thanks Daniel Collins for reporting!

          Uwe Schindler added a comment -

          Closed after release.


            People

            • Assignee:
              Steve Rowe
              Reporter:
              Daniel Collins