Lucene - Core / LUCENE-6826

java.lang.ClassCastException: org.apache.lucene.index.TermsEnum$2 cannot be cast to org.apache.lucene.index.MultiTermsEnum when adding indexes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.2.1
    • Fix Version/s: 5.4, 6.0
    • Component/s: core/index
    • Labels: None
    • Lucene Fields: New

      Description

      We are using addIndexes and FilterCodecReader tricks as part of index migration.

      Whether FilterCodecReader tricks are required to reproduce this is uncertain, but in any case, when migrating a particular index, I saw this exception:

      java.lang.ClassCastException: org.apache.lucene.index.TermsEnum$2 cannot be cast to org.apache.lucene.index.MultiTermsEnum
      	at org.apache.lucene.index.MappedMultiFields$MappedMultiTerms.iterator(MappedMultiFields.java:65)
      	at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:426)
      	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:198)
      	at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
      	at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193)
      	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95)
      	at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2519)
      

      TermsEnum$2 appears to be TermsEnum.EMPTY. It is returned here:

      MultiTermsEnum#reset:

          if (queue.size() == 0) {
            return TermsEnum.EMPTY;   // <- this is not a MultiTermsEnum
          } else {
            return this;
          }
      

      A quick hack would be for MappedMultiFields to check for TermsEnum.EMPTY specifically before casting, but there might be some way to avoid the cast entirely and that would obviously be a better idea.
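
      The quick hack described above can be sketched in plain Java without depending on Lucene. This is only an illustration of the pattern: SimpleEnum, MultiEnum and Wrapped are hypothetical stand-ins for TermsEnum, MultiTermsEnum and the wrapper built in MappedMultiFields, and the List-based reset() signature is an assumption made to keep the example self-contained. It is not the actual Lucene code or the committed fix.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class EmptySentinelCastSketch {

    /** Hypothetical stand-in for a terms enumerator; not the Lucene class. */
    public static class SimpleEnum {
        /** Shared "no terms" sentinel, playing the role of TermsEnum.EMPTY. */
        public static final SimpleEnum EMPTY = new SimpleEnum();
    }

    /** Hypothetical stand-in for MultiTermsEnum. */
    public static class MultiEnum extends SimpleEnum {
        private final Deque<String> queue = new ArrayDeque<>();

        /** Mirrors the reset() snippet: returns the sentinel when nothing is queued. */
        public SimpleEnum reset(List<String> terms) {
            queue.clear();
            queue.addAll(terms);
            if (queue.isEmpty()) {
                return SimpleEnum.EMPTY; // not a MultiEnum
            }
            return this;
        }
    }

    /** Hypothetical wrapper that needs a MultiEnum, like the mapped enum in the stack trace. */
    public static class Wrapped extends SimpleEnum {
        final MultiEnum in;

        Wrapped(MultiEnum in) {
            this.in = in;
        }
    }

    /** Casts unconditionally; throws ClassCastException when reset() returned EMPTY. */
    public static SimpleEnum unguarded(List<String> terms) {
        return new Wrapped((MultiEnum) new MultiEnum().reset(terms));
    }

    /** Checks for the sentinel first and passes it through instead of casting it. */
    public static SimpleEnum guarded(List<String> terms) {
        SimpleEnum e = new MultiEnum().reset(terms);
        if (e == SimpleEnum.EMPTY) {
            return SimpleEnum.EMPTY;
        }
        return new Wrapped((MultiEnum) e);
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        System.out.println("guarded, no terms, is EMPTY: " + (guarded(none) == SimpleEnum.EMPTY));
        System.out.println("guarded, one term, is Wrapped: " + (guarded(Arrays.asList("a")) instanceof Wrapped));
        try {
            unguarded(none);
        } catch (ClassCastException ex) {
            System.out.println("unguarded cast failed: " + ex.getMessage());
        }
    }
}
```

      The identity comparison works only because the sentinel is a single shared instance. Avoiding the cast entirely would require reset() to expose a more specific return type, which this sketch does not attempt.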

        Activity

        Michael McCandless added a comment -

        Hmm, no good ... I think we first need a small test case exposing this.

        I think it should only happen if you have a FilterCodecReader that filters a field by providing no terms in the TermsEnum?

        I.e. I think Lucene (at least the default codec) would normally not write a field if it has 0 terms.

        Trejkaz added a comment -

        For one of the fields, our test indexes all have the same value, which happens to be the value we filter out, and the contents of that filtered stream then get merged with another field. It might not be too hard to mock up a test case with similar behaviour; I will see what I can do tomorrow.

        Michael McCandless added a comment -

        Thank you Trejkaz!

        Trejkaz added a comment -

        This test creates an index with one document which contains a value which does not match the filter. It then migrates the index in a fashion that just filters out the values we don't want, which turns out to be all values in that field, and that triggers the error.

        The first half of the day I tried to reproduce the exact same thing from scratch with no success - it happily migrated. This version comes from working code, simplified as far as possible without removing the issue, so it could turn out that there is a subtle bug in my code as well.

        Michael McCandless added a comment -

        Thanks Trejkaz, I'll have a look...

        ASF subversion and git services added a comment -

        Commit 1707387 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1707387 ]

        LUCENE-6826: add explicit check for TermsEnum.EMPTY to avoid ClassCastException

        ASF subversion and git services added a comment -

        Commit 1707388 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1707388 ]

        LUCENE-6826: fix java7 compilation

        ASF subversion and git services added a comment -

        Commit 1707390 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1707390 ]

        LUCENE-6826: add explicit check for TermsEnum.EMPTY to avoid ClassCastException

        Michael McCandless added a comment -

        Thanks Trejkaz!


          People

          • Assignee: Unassigned
          • Reporter: Trejkaz
          • Votes: 0
          • Watchers: 3
