Solr
  1. Solr
  2. SOLR-4893

Add a FieldMutatingUpdateProcessor FieldNameSelector that checks whether a field matches any schema field

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4, 6.0
    • Component/s: update
    • Labels:
      None

      Description

      Add a new field update processor selector that will configure the processor to select fields that match any schema field, or that don't match any schema field, depending on its boolean parameter, e.g. to select fields that don't match any schema field:

      <bool name="fieldNameMatchesSchemaField">false</bool>
      
      1. SOLR-4893.patch
        19 kB
        Steve Rowe
      2. SOLR-4893.patch
        17 kB
        Hoss Man
      3. SOLR-4893.patch
        14 kB
        Steve Rowe

        Issue Links

          Activity

          Hide
          Steve Rowe added a comment -

          patch, I think it's ready to go.

          Show
          Steve Rowe added a comment - patch, I think it's ready to go.
          Hide
          Hoss Man added a comment -
          16:39 <@sarowe1:#solr> hoss: In the patch I just put up on SOLR-4893, I added a new member to 
                                 FieldUpdateProcessorFactory.SelectorParams, and then to reduce method parameter passing of each member of that 
                                 structure, I switched to passing instances of the structure.
          16:39 <@sarowe1:#solr> hoss:   I'd appreciate a sanity check on that - it looks like cleanup to me, but maybe there's a reason each 
                                 member of SelectorParams was being passed around instead?
          

          I can't think of any reason why i had it that way either. Your patch looks much cleaner (although my preference would be to leave a deprecated version of the old createFieldNameSelector that delegates to your new version)

          misc comments...

          • you should change getDefaultSelector in IgnoreFieldUpdateProcessorFactory to take advantage of your new logic (ie: call createFieldNameSelector(...))
          • i would make getBooleanArg give an explict error if the key exists multiple times
          • to be nice to users who might not no any better, i would suggest that getBooleanArg should also accept a String and pass it to Boolean.valueOf
          • would be good to have a test of fieldNameMatchesSchemaField in an excludes
          • i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required?
          • even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward?
          Show
          Hoss Man added a comment - 16:39 <@sarowe1:#solr> hoss: In the patch I just put up on SOLR-4893, I added a new member to FieldUpdateProcessorFactory.SelectorParams, and then to reduce method parameter passing of each member of that structure, I switched to passing instances of the structure. 16:39 <@sarowe1:#solr> hoss: I'd appreciate a sanity check on that - it looks like cleanup to me, but maybe there's a reason each member of SelectorParams was being passed around instead? I can't think of any reason why i had it that way either. Your patch looks much cleaner (although my preference would be to leave a deprecated version of the old createFieldNameSelector that delegates to your new version) misc comments... you should change getDefaultSelector in IgnoreFieldUpdateProcessorFactory to take advantage of your new logic (ie: call createFieldNameSelector(...)) i would make getBooleanArg give an explict error if the key exists multiple times to be nice to users who might not no any better, i would suggest that getBooleanArg should also accept a String and pass it to Boolean.valueOf would be good to have a test of fieldNameMatchesSchemaField in an excludes i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required? even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward?
          Hide
          Hoss Man added a comment -

          updated patch that addresses a few of my previous comments...

          • would be good to have a test of fieldNameMatchesSchemaField in an excludes
          • i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required?
          • even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward?

          ...lemme know what you think of the new logic in ConfigurableFieldNameSelector.shouldMutate .. i think it's easier to read, but i'm curious if that's just how my mind works

          I retract this suggestion...

          you should change getDefaultSelector in IgnoreFieldUpdateProcessorFactory to take advantage of your new logic (ie: call createFieldNameSelector(...))

          ..no reason to make that method reuse createFieldNameSelector with anartificials set of params just to get a selector that would be slower then what it uses right now.

          Show
          Hoss Man added a comment - updated patch that addresses a few of my previous comments... would be good to have a test of fieldNameMatchesSchemaField in an excludes i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required? even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward? ...lemme know what you think of the new logic in ConfigurableFieldNameSelector.shouldMutate .. i think it's easier to read, but i'm curious if that's just how my mind works I retract this suggestion... you should change getDefaultSelector in IgnoreFieldUpdateProcessorFactory to take advantage of your new logic (ie: call createFieldNameSelector(...)) ..no reason to make that method reuse createFieldNameSelector with anartificials set of params just to get a selector that would be slower then what it uses right now.
          Hide
          Steve Rowe added a comment -

          updated patch that addresses a few of my previous comments...

          * would be good to have a test of fieldNameMatchesSchemaField in an excludes

          * i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required?

          Thanks, these look good.

          even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward?

          ...lemme know what you think of the new logic in ConfigurableFieldNameSelector.shouldMutate .. i think it's easier to read, but i'm curious if that's just how my mind works

          It looks easier to read to me, thanks.

          • i would make getBooleanArg give an explict error if the key exists multiple times
          • to be nice to users who might not no any better, i would suggest that getBooleanArg should also accept a String and pass it to Boolean.valueOf

          The attached patch implements the above two suggestions. I think that covers your comments?

          Show
          Steve Rowe added a comment - updated patch that addresses a few of my previous comments... * would be good to have a test of fieldNameMatchesSchemaField in an excludes * i don't think your ignore-not-in-schema-and-foo-name-prefix is verifying that the foo prefix is required? Thanks, these look good. even with your added comments, the logic involving fieldNameMatchesSchemaField in shouldMutate seems kind of hairy (too hairy for me to be confident it's correct by reading it w/o looking at the test) .. i'm wondering if it can't be "unwound" a bit to make it more straight forward? ...lemme know what you think of the new logic in ConfigurableFieldNameSelector.shouldMutate .. i think it's easier to read, but i'm curious if that's just how my mind works It looks easier to read to me, thanks. i would make getBooleanArg give an explict error if the key exists multiple times to be nice to users who might not no any better, i would suggest that getBooleanArg should also accept a String and pass it to Boolean.valueOf The attached patch implements the above two suggestions. I think that covers your comments?
          Hide
          Steve Rowe added a comment -

          I think that covers your comments?

          Oops, I missed one:

          my preference would be to leave a deprecated version of the old createFieldNameSelector that delegates to your new version

          I went to do this and noticed two things: a) it's marked with @lucene.internal; and b) in order to support the new selector parameter fieldNameMatchesSchemaField, we'd have to change the signature on the original version anyway. So since it's going to break users' compilation either way, I don't it's worth it to keep the old version. (I guess we could keep the old version without the new selector parameter, and fill in null for it when passing to the new version, but I'd rather people have to deal with the new parameter.)

          Show
          Steve Rowe added a comment - I think that covers your comments? Oops, I missed one: my preference would be to leave a deprecated version of the old createFieldNameSelector that delegates to your new version I went to do this and noticed two things: a) it's marked with @lucene.internal; and b) in order to support the new selector parameter fieldNameMatchesSchemaField , we'd have to change the signature on the original version anyway. So since it's going to break users' compilation either way, I don't it's worth it to keep the old version. (I guess we could keep the old version without the new selector parameter, and fill in null for it when passing to the new version, but I'd rather people have to deal with the new parameter.)
          Hide
          Hoss Man added a comment -

          So since it's going to break users' compilation either way, I don't it's worth it to keep the old version.

          +1

          patch: +1

          Show
          Hoss Man added a comment - So since it's going to break users' compilation either way, I don't it's worth it to keep the old version. +1 patch: +1
          Hide
          Steve Rowe added a comment -

          Thanks Hoss!

          Committed:

          Show
          Steve Rowe added a comment - Thanks Hoss! Committed: trunk: r1491102 branch_4x: r1491109
          Hide
          Commit Tag Bot added a comment -

          [trunk commit] sarowe
          http://svn.apache.org/viewvc?view=revision&revision=1495574

          SOLR-4893: javadoc for the new fieldNameMatchesSchemaField selector

          Show
          Commit Tag Bot added a comment - [trunk commit] sarowe http://svn.apache.org/viewvc?view=revision&revision=1495574 SOLR-4893 : javadoc for the new fieldNameMatchesSchemaField selector
          Hide
          Commit Tag Bot added a comment -

          [branch_4x commit] sarowe
          http://svn.apache.org/viewvc?view=revision&revision=1495575

          SOLR-4893: javadoc for the new fieldNameMatchesSchemaField selector (merged trunk r1495574)

          Show
          Commit Tag Bot added a comment - [branch_4x commit] sarowe http://svn.apache.org/viewvc?view=revision&revision=1495575 SOLR-4893 : javadoc for the new fieldNameMatchesSchemaField selector (merged trunk r1495574)
          Hide
          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues

          Show
          Steve Rowe added a comment - Bulk close resolved 4.4 issues

            People

            • Assignee:
              Steve Rowe
              Reporter:
              Steve Rowe
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development