Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-10011

Refactor PointField & TrieField to share common code

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.5, 7.0
    • Component/s: None
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      We should eliminate PointTypes and TrieTypes enum to have a common enum for both. That would enable us to share a lot of code between the two field types.

      In the process, fix the bug:
      PointFields with indexed=false, docValues=true seem to be using (Int|Double|Float|Long)Point.newRangeQuery() for performing exact matches and range queries. However, they should instead be using DocValues based range query.

      1. SOLR-10011.patch
        42 kB
        Tomás Fernández Löbbe
      2. SOLR-10011.patch
        36 kB
        Ishan Chattopadhyaya
      3. SOLR-10011.patch
        36 kB
        Ishan Chattopadhyaya
      4. SOLR-10011.patch
        15 kB
        Ishan Chattopadhyaya
      5. SOLR-10011.patch
        14 kB
        Ishan Chattopadhyaya

        Issue Links

          Activity

          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          Here's a patch that introduces a new abstract method, getPointRangeQuery(). The getRangeQuery() method either delegates to this getPointRangeQuery method, i.e. in case of indexed=true or docValues=false, or delegates to another method for create a DocValuesRangeQuery. Modified the tests to check for the correctness of such fields.

          FWIW, this was necessary while working on SOLR-5944, since for fields where indexed=false, stored=false, docValues=true, the updates can do in-place modification to the docValues, but current code would've ignored the docValues altogether and performed queries on the points field.

          Tomás Fernández Löbbe, can you please review?

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - Here's a patch that introduces a new abstract method, getPointRangeQuery(). The getRangeQuery() method either delegates to this getPointRangeQuery method, i.e. in case of indexed=true or docValues=false, or delegates to another method for create a DocValuesRangeQuery. Modified the tests to check for the correctness of such fields. FWIW, this was necessary while working on SOLR-5944 , since for fields where indexed=false, stored=false, docValues=true, the updates can do in-place modification to the docValues, but current code would've ignored the docValues altogether and performed queries on the points field. Tomás Fernández Löbbe , can you please review?
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment - - edited

          Updating patch to master (after LUCENE-7643 commit).
          We might need to refactor a bit, either now or later, to better re-use DV code between TrieField and PointField classes.

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - - edited Updating patch to master (after LUCENE-7643 commit). We might need to refactor a bit, either now or later, to better re-use DV code between TrieField and PointField classes.
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment -

          LGTM.

          We might need to refactor a bit, either now or later, to better re-use DV code between TrieField and PointField classes.

          Either way is fine with me. I was thinking in may make sense to remove PointField.PointTypes and TrieField.TrieTypes and have a NumericField.Type that can be shared with both implementations. That would help this refactor, but may need to be master-only

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - LGTM. We might need to refactor a bit, either now or later, to better re-use DV code between TrieField and PointField classes. Either way is fine with me. I was thinking in may make sense to remove PointField.PointTypes and TrieField.TrieTypes and have a NumericField.Type that can be shared with both implementations. That would help this refactor, but may need to be master-only
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          Here's a stab at refactoring, by introducing a new common base class NumericFieldType.

          The tests pass after this refactoring. Tomás Fernández Löbbe, can you please review?

          That would help this refactor, but may need to be master-only

          I think the refactoring in this patch is backwards compatible. Had an offline discussion with Hoss Man, who suggested that such an approach (introducing a new base class on top of TrieField) is backwards compatible.

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - Here's a stab at refactoring, by introducing a new common base class NumericFieldType. The tests pass after this refactoring. Tomás Fernández Löbbe , can you please review? That would help this refactor, but may need to be master-only I think the refactoring in this patch is backwards compatible. Had an offline discussion with Hoss Man , who suggested that such an approach (introducing a new base class on top of TrieField) is backwards compatible.
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment -

          I like the refactor.

          I think the refactoring in this patch is backwards compatible.

          Kind of... Anyone using TrieFields.TrieTypes, TrieTypes.getType() or TrieTypes.type will have their code failing. So I don't know if we consider those kind of internal, or if we are OK with the change anyway since it's easy to fix.

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - I like the refactor. I think the refactoring in this patch is backwards compatible. Kind of... Anyone using TrieFields.TrieTypes , TrieTypes.getType() or TrieTypes.type will have their code failing. So I don't know if we consider those kind of internal, or if we are OK with the change anyway since it's easy to fix.
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          Kind of... Anyone using TrieFields.TrieTypes, TrieTypes.getType() or TrieTypes.type will have their code failing. So I don't know if we consider those kind of internal, or if we are OK with the change anyway since it's easy to fix.

          AFAIR, we've cared about such internal changes only on a best effort basis. Since it should be fairly apparent to a developer as to why their plugin fails, I suggest we go ahead with the refactoring even for 6x. But I shall defer to your judgement. Hoss Man, do you have any opinion?

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - Kind of... Anyone using TrieFields.TrieTypes, TrieTypes.getType() or TrieTypes.type will have their code failing. So I don't know if we consider those kind of internal, or if we are OK with the change anyway since it's easy to fix. AFAIR, we've cared about such internal changes only on a best effort basis. Since it should be fairly apparent to a developer as to why their plugin fails, I suggest we go ahead with the refactoring even for 6x. But I shall defer to your judgement. Hoss Man , do you have any opinion?
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 285a1013ad04dd1cd5e5e41ffa93a87fe862c152 in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=285a101 ]

          SOLR-10011: Refactor PointField & TrieField to now have a common base class, NumericFieldType.

          The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This
          refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for
          indexed=false, docValues=true fields.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 285a1013ad04dd1cd5e5e41ffa93a87fe862c152 in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=285a101 ] SOLR-10011 : Refactor PointField & TrieField to now have a common base class, NumericFieldType. The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for indexed=false, docValues=true fields.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 0f7990b2c8590d169add59354cc2678260f94e03 in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f7990b ]

          SOLR-10011: Fix exception log message

          Show
          jira-bot ASF subversion and git services added a comment - Commit 0f7990b2c8590d169add59354cc2678260f94e03 in lucene-solr's branch refs/heads/master from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f7990b ] SOLR-10011 : Fix exception log message
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 285a1013ad04dd1cd5e5e41ffa93a87fe862c152 in lucene-solr's branch refs/heads/apiv2 from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=285a101 ]

          SOLR-10011: Refactor PointField & TrieField to now have a common base class, NumericFieldType.

          The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This
          refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for
          indexed=false, docValues=true fields.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 285a1013ad04dd1cd5e5e41ffa93a87fe862c152 in lucene-solr's branch refs/heads/apiv2 from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=285a101 ] SOLR-10011 : Refactor PointField & TrieField to now have a common base class, NumericFieldType. The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for indexed=false, docValues=true fields.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 0f7990b2c8590d169add59354cc2678260f94e03 in lucene-solr's branch refs/heads/apiv2 from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f7990b ]

          SOLR-10011: Fix exception log message

          Show
          jira-bot ASF subversion and git services added a comment - Commit 0f7990b2c8590d169add59354cc2678260f94e03 in lucene-solr's branch refs/heads/apiv2 from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f7990b ] SOLR-10011 : Fix exception log message
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment - - edited

          Ishan Chattopadhyaya as part of this refactor, what do you think about:
          deprecating (and removing in master)

          FieldType.java
            public LegacyNumericType getNumericType() {
              return null;
            }
          

          and replacing it with

          FieldType.java
           public NumericFieldType.NumberType getNumberType() {
              return null;
            }
          

          ...also replacing the newly added

          NumericFieldType.java
            final public NumberType getType() {
              return type;
            }
          

          with

          NumericFieldType.java
            @Override
           final public NumericFieldType.NumberType getNumberType() {
              return type;
            }
          

          That way clients don't need to cast to NumericFieldType for getting the actual type. Also, if we do it, we should probably move NumberType out of NumericFieldType and make it top level

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - - edited Ishan Chattopadhyaya as part of this refactor, what do you think about: deprecating (and removing in master) FieldType.java public LegacyNumericType getNumericType() { return null ; } and replacing it with FieldType.java public NumericFieldType.NumberType getNumberType() { return null ; } ...also replacing the newly added NumericFieldType.java final public NumberType getType() { return type; } with NumericFieldType.java @Override final public NumericFieldType.NumberType getNumberType() { return type; } That way clients don't need to cast to NumericFieldType for getting the actual type. Also, if we do it, we should probably move NumberType out of NumericFieldType and make it top level
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          +1.

          I was myself wondering what to do about LegacyNumericType getType() and deprecating it sounds good, since LegacyNumericType is deprecated.

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - +1. I was myself wondering what to do about LegacyNumericType getType() and deprecating it sounds good, since LegacyNumericType is deprecated.
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment -

          I'll upload a patch for this shortly

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - I'll upload a patch for this shortly
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment -

          Sorry for the delay, here is the patch. I replaced all the uses of FieldType.getNumericType() with FieldType.getNumberType(). Note that since NumberType has a DATE value, TrieDateField.getNumberType() returns DATE (instead of LONG, like TrieDateField.getNumericType()). Let me know what you think Ishan Chattopadhyaya

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - Sorry for the delay, here is the patch. I replaced all the uses of FieldType.getNumericType() with FieldType.getNumberType() . Note that since NumberType has a DATE value, TrieDateField.getNumberType() returns DATE (instead of LONG, like TrieDateField.getNumericType()). Let me know what you think Ishan Chattopadhyaya
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          +1, LGTM! Sorry for the delay in reviewing; I missed your comment completely.

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - +1, LGTM! Sorry for the delay in reviewing; I missed your comment completely.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 59c41e2a6c685dd9ac943c69d12e9bfe2a7d380e in lucene-solr's branch refs/heads/master from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=59c41e2 ]

          SOLR-10011: Add NumberType getNumberType() to FieldType and deprecate LegacyNumericType getNumericType()

          Modify references to getNumericType() to use the new getNumberType(). NumberType is shared for the different numeric implementations supported in Solr (TrieFields and PointFields).
          CC SOLR-8396

          Show
          jira-bot ASF subversion and git services added a comment - Commit 59c41e2a6c685dd9ac943c69d12e9bfe2a7d380e in lucene-solr's branch refs/heads/master from Tomas Fernandez Lobbe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=59c41e2 ] SOLR-10011 : Add NumberType getNumberType() to FieldType and deprecate LegacyNumericType getNumericType() Modify references to getNumericType() to use the new getNumberType(). NumberType is shared for the different numeric implementations supported in Solr (TrieFields and PointFields). CC SOLR-8396
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 6a97952a6173298f457aebe869a53ba130512f6f in lucene-solr's branch refs/heads/branch_6x from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6a97952 ]

          SOLR-10011: Refactor PointField & TrieField to now have a common base class, NumericFieldType.

          The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This
          refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for
          indexed=false, docValues=true fields.

          Show
          jira-bot ASF subversion and git services added a comment - Commit 6a97952a6173298f457aebe869a53ba130512f6f in lucene-solr's branch refs/heads/branch_6x from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6a97952 ] SOLR-10011 : Refactor PointField & TrieField to now have a common base class, NumericFieldType. The TrieField.TrieTypes and PointField.PointTypes are now consolidated to NumericFieldType.NumberType. This refactoring also fixes a bug whereby PointFields were not using DocValues for range queries for indexed=false, docValues=true fields.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 5a7cdd89756baed3a7d49d923fa9f66cb2baff98 in lucene-solr's branch refs/heads/branch_6x from Ishan Chattopadhyaya
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5a7cdd8 ]

          SOLR-10011: Fix exception log message

          Show
          jira-bot ASF subversion and git services added a comment - Commit 5a7cdd89756baed3a7d49d923fa9f66cb2baff98 in lucene-solr's branch refs/heads/branch_6x from Ishan Chattopadhyaya [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5a7cdd8 ] SOLR-10011 : Fix exception log message
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit c27880e332722e992294e05749b63300d3eaab44 in lucene-solr's branch refs/heads/branch_6x from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c27880e ]

          SOLR-10011: Add NumberType getNumberType() to FieldType and deprecate LegacyNumericType getNumericType()

          Modify references to getNumericType() to use the new getNumberType(). NumberType is shared for the different numeric implementations supported in Solr (TrieFields and PointFields).
          CC SOLR-8396

          Show
          jira-bot ASF subversion and git services added a comment - Commit c27880e332722e992294e05749b63300d3eaab44 in lucene-solr's branch refs/heads/branch_6x from Tomas Fernandez Lobbe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c27880e ] SOLR-10011 : Add NumberType getNumberType() to FieldType and deprecate LegacyNumericType getNumericType() Modify references to getNumericType() to use the new getNumberType(). NumberType is shared for the different numeric implementations supported in Solr (TrieFields and PointFields). CC SOLR-8396
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 81b4288a2133dce87e0ac92da5f6e37dc28176f6 in lucene-solr's branch refs/heads/branch_6x from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=81b4288 ]

          SOLR-8396, SOLR-9987, SOLR-10011: Move CHANGES entries from 7.0 to 6.5

          Show
          jira-bot ASF subversion and git services added a comment - Commit 81b4288a2133dce87e0ac92da5f6e37dc28176f6 in lucene-solr's branch refs/heads/branch_6x from Tomas Fernandez Lobbe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=81b4288 ] SOLR-8396 , SOLR-9987 , SOLR-10011 : Move CHANGES entries from 7.0 to 6.5
          Hide
          tomasflobbe Tomás Fernández Löbbe added a comment -

          Backported the changes. resolving

          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - Backported the changes. resolving

            People

            • Assignee:
              ichattopadhyaya Ishan Chattopadhyaya
              Reporter:
              ichattopadhyaya Ishan Chattopadhyaya
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development