Lucene - Core
  1. Lucene - Core
  2. LUCENE-5192

FieldInfos.Builder failed to catch adding field with different DV type under some circumstances

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.5, 6.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      I found it while working on LUCENE-5189. I'll attach a patch with a simple testcase which reproduces the problem and a fix.

      1. LUCENE-5192.patch
        1 kB
        Michael McCandless
      2. LUCENE-5192.patch
        3 kB
        Shai Erera

        Activity

        Hide
        Shai Erera added a comment -

        Patch adds a testcase and fixes the bug. The bug only happens if you add same field name as indexable and DV, and then in another segment change its DV type. I'll commit it shortly.

        Show
        Shai Erera added a comment - Patch adds a testcase and fixes the bug. The bug only happens if you add same field name as indexable and DV, and then in another segment change its DV type. I'll commit it shortly.
        Hide
        Michael McCandless added a comment -

        +1, sneaky!

        Show
        Michael McCandless added a comment - +1, sneaky!
        Hide
        ASF subversion and git services added a comment -

        Commit 1518591 from Shai Erera in branch 'dev/trunk'
        [ https://svn.apache.org/r1518591 ]

        LUCENE-5192: FieldInfos.Builder failed to catch adding field with different DV type under some circumstances

        Show
        ASF subversion and git services added a comment - Commit 1518591 from Shai Erera in branch 'dev/trunk' [ https://svn.apache.org/r1518591 ] LUCENE-5192 : FieldInfos.Builder failed to catch adding field with different DV type under some circumstances
        Hide
        Adrien Grand added a comment -

        Wow, good catch!

        Show
        Adrien Grand added a comment - Wow, good catch!
        Hide
        ASF subversion and git services added a comment -

        Commit 1518616 from Shai Erera in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1518616 ]

        LUCENE-5192: FieldInfos.Builder failed to catch adding field with different DV type under some circumstances

        Show
        ASF subversion and git services added a comment - Commit 1518616 from Shai Erera in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1518616 ] LUCENE-5192 : FieldInfos.Builder failed to catch adding field with different DV type under some circumstances
        Hide
        Shai Erera added a comment -

        Committed to trunk and 4x. On 4x I had to also fix DocFieldProcessor to call FieldInfos.addOrUpdate even when the field has been encountered. That's because the logic has changed in trunk and now DV fields are processed as stored fields, therefore FIS.addOrUpdate is called for both the posting and NDV, but in 4x it's not, and only the FI was updated in case you added same field with two types (and FIS didn't know about it at all!).

        Show
        Shai Erera added a comment - Committed to trunk and 4x. On 4x I had to also fix DocFieldProcessor to call FieldInfos.addOrUpdate even when the field has been encountered. That's because the logic has changed in trunk and now DV fields are processed as stored fields, therefore FIS.addOrUpdate is called for both the posting and NDV, but in 4x it's not, and only the FI was updated in case you added same field with two types (and FIS didn't know about it at all!).
        Hide
        Michael McCandless added a comment -

        Hmm, that fix wasn't thread safe (the map inside FieldInfos.FieldNumbers is an ordinary HashMap).

        Show
        Michael McCandless added a comment - Hmm, that fix wasn't thread safe (the map inside FieldInfos.FieldNumbers is an ordinary HashMap).
        Hide
        Michael McCandless added a comment -

        Maybe something like this? (for trunk)

        Show
        Michael McCandless added a comment - Maybe something like this? (for trunk)
        Hide
        Shai Erera added a comment -

        Ahh, good catch. I didn't notice FieldNumbers is sync'd. But, I think this if is wrong/problematic:

        -        if (docValues != null) {
        +        if (!fi.hasDocValues() && docValues != null) {
        +          // First time we are seeing doc values type for
        +          // this field:
        

        With this fix, if somebody tries to add a field 'f' as NUMERIC and then BINARY, we won't catch it? This is caught today by FI.setDVType, but with this fix, that won't be called? Do I miss something? Perhaps you can add an 'else if' and compare the given type and fi.getDVType(), but that's just duplicating code from FI.setDVType.

        Show
        Shai Erera added a comment - Ahh, good catch. I didn't notice FieldNumbers is sync'd. But, I think this if is wrong/problematic: - if (docValues != null) { + if (!fi.hasDocValues() && docValues != null) { + // First time we are seeing doc values type for + // this field: With this fix, if somebody tries to add a field 'f' as NUMERIC and then BINARY, we won't catch it? This is caught today by FI.setDVType, but with this fix, that won't be called? Do I miss something? Perhaps you can add an 'else if' and compare the given type and fi.getDVType(), but that's just duplicating code from FI.setDVType.
        Hide
        Robert Muir added a comment -

        I don't care about code duplication here. We should not invoke the global synced fieldnumbers shit for every element, only when the setting actually changes

        Show
        Robert Muir added a comment - I don't care about code duplication here. We should not invoke the global synced fieldnumbers shit for every element, only when the setting actually changes
        Hide
        Shai Erera added a comment -

        In that case it should change to:

        if (docValues != null) {
          if (!fi.hasDocValues()) {
            // First time we are seeing doc values type for
            // this field:
            fi.setDocValuesType(docValues);
        
            // must also update docValuesType map so it's
            // aware of this field's DocValueType 
            globalFieldNumbers.setDocValuesType(fi.number, name, docValues);
          } else if (docValues != fi.getDocValuesType()) {
            // THROW EX
          }
        }
        

        Or, we do this:

        if (docValues != null) {
          // only pay the synchronization cost if fi does not already have a DVType
          boolean updateGlobal = !fi.hasDocValues();
          fi.setDocValuesType(docValues); // this will also perform the consistency check.
          if (updateGlobal) {
            globalFieldNumbers.set(...);
          }
        }
        

        Since FieldInfo.setDVType is also called from DocFieldsProcessor, I prefer to try and keep the consistency check in one place.

        Show
        Shai Erera added a comment - In that case it should change to: if (docValues != null ) { if (!fi.hasDocValues()) { // First time we are seeing doc values type for // this field: fi.setDocValuesType(docValues); // must also update docValuesType map so it's // aware of this field's DocValueType globalFieldNumbers.setDocValuesType(fi.number, name, docValues); } else if (docValues != fi.getDocValuesType()) { // THROW EX } } Or, we do this: if (docValues != null ) { // only pay the synchronization cost if fi does not already have a DVType boolean updateGlobal = !fi.hasDocValues(); fi.setDocValuesType(docValues); // this will also perform the consistency check. if (updateGlobal) { globalFieldNumbers.set(...); } } Since FieldInfo.setDVType is also called from DocFieldsProcessor, I prefer to try and keep the consistency check in one place.
        Hide
        Michael McCandless added a comment -

        With this fix, if somebody tries to add a field 'f' as NUMERIC and then BINARY, we won't catch it?

        Actually, we still catch it, because in DocValuesProcessor.addField we always call fieldInfo.setDocValuesType(), so the exc will be thrown from there.

        Still, I think addOrUpdate should fold in the docValues type ... so I'll just go with Shai's 2nd suggestion ...

        Show
        Michael McCandless added a comment - With this fix, if somebody tries to add a field 'f' as NUMERIC and then BINARY, we won't catch it? Actually, we still catch it, because in DocValuesProcessor.addField we always call fieldInfo.setDocValuesType(), so the exc will be thrown from there. Still, I think addOrUpdate should fold in the docValues type ... so I'll just go with Shai's 2nd suggestion ...
        Hide
        Shai Erera added a comment -

        Ahh that explains it. +1 to commit the synchronization fix!

        Show
        Shai Erera added a comment - Ahh that explains it. +1 to commit the synchronization fix!
        Hide
        ASF subversion and git services added a comment -

        Commit 1518936 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1518936 ]

        LUCENE-5192: use syn'd method to set field's DV type

        Show
        ASF subversion and git services added a comment - Commit 1518936 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1518936 ] LUCENE-5192 : use syn'd method to set field's DV type
        Hide
        ASF subversion and git services added a comment -

        Commit 1518937 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1518937 ]

        LUCENE-5192: use syn'd method to set field's DV type

        Show
        ASF subversion and git services added a comment - Commit 1518937 from Michael McCandless in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1518937 ] LUCENE-5192 : use syn'd method to set field's DV type
        Hide
        Shai Erera added a comment -

        Thanks Mike for catching and fixing!!

        Show
        Shai Erera added a comment - Thanks Mike for catching and fixing!!
        Hide
        Adrien Grand added a comment -

        4.5 release -> bulk close

        Show
        Adrien Grand added a comment - 4.5 release -> bulk close

          People

          • Assignee:
            Shai Erera
            Reporter:
            Shai Erera
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development