Lucene - Core
  1. Lucene - Core
  2. LUCENE-3616

Illegal Field Configurations should throw exceptions

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      When working on LUCENE-3615, I came across:

      java.lang.IllegalArgumentException: field field is stored but does not have binaryValue, stringValue nor numericValue
      at org.apache.lucene.index.codecs.DefaultStoredFieldsWriter.writeField(DefaultStoredFieldsWriter.java:177)
      at org.apache.lucene.index.StoredFieldsConsumer.finishDocument(StoredFieldsConsumer.java:119)
      at org.apache.lucene.index.DocFieldProcessor.finishDocument(DocFieldProcessor.java:295)
      at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:255)
      at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:380)
      at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1480)
      at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1242)
      at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1223)
      at org.apache.lucene.index.Test2BTerms.test2BTerms(Test2BTerms.java:194)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
      at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
      at org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:525)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
      at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:168)
      at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:47)
      at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
      at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
      at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
      at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
      at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
      at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
      at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:71)
      at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:199)
      at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:62)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

      which is due to the using Textfield.TYPE_STORED when using a TokenStream. Since this is an illegal combination, we should throw an exception upon construction of the Field, not later when actually trying to do the indexing.

        Activity

        Hide
        Robert Muir added a comment -

        Thanks Jan: i fixed these.

        Show
        Robert Muir added a comment - Thanks Jan: i fixed these.
        Hide
        Jan Høydahl added a comment -

        Thanks for fixing that one. Think there may be a few more, untested ones, only did a code search, no testing:
        SearchGroupsResultTransformer # 113,115
        TopGroupsResultTransformer # 255,257
        GroupedEndResultTransformer # 72

        Show
        Jan Høydahl added a comment - Thanks for fixing that one. Think there may be a few more, untested ones, only did a code search, no testing: SearchGroupsResultTransformer # 113,115 TopGroupsResultTransformer # 255,257 GroupedEndResultTransformer # 72
        Hide
        Robert Muir added a comment -

        The bug is in grouping, apply a 0.0f boost.

        Perhaps the setBoost() in Solr's FieldType should be conditional depending on field type:

        No, we should throw exception. Thats the whole point, to not silently discard users boosts when they will have no effect.

        Show
        Robert Muir added a comment - The bug is in grouping, apply a 0.0f boost. Perhaps the setBoost() in Solr's FieldType should be conditional depending on field type: No, we should throw exception. Thats the whole point, to not silently discard users boosts when they will have no effect.
        Hide
        Jan Høydahl added a comment - - edited

        The commit yesterday (1369196) causes the "Group By" tab of Solritas to stop working, where we try to group-by a string field.:

        INFO: [collection1] webapp=/solr path=/browse params={group=true&group.field=manu_exact&queryOpts=group} status=500 QTime=44 
        Aug 4, 2012 11:25:40 PM org.apache.solr.common.SolrException log
        SEVERE: null:java.lang.IllegalArgumentException: You cannot set an index-time boost on an unindexed field, or one that omits norms
        	at org.apache.lucene.document.Field.setBoost(Field.java:382)
        	at org.apache.solr.schema.FieldType.createField(FieldType.java:277)
        	at org.apache.solr.schema.FieldType.createField(FieldType.java:263)
        	at org.apache.solr.schema.SchemaField.createField(SchemaField.java:101)
        	at org.apache.solr.search.Grouping$CommandField.finish(Grouping.java:790)
        

        Perhaps the setBoost() in Solr's FieldType should be conditional depending on field type:

        FieldType.java line 275
          protected IndexableField createField(String name, String val, org.apache.lucene.document.FieldType type, float boost){
            Field f = new Field(name, val, type);
            f.setBoost(boost);
            return f;
          }
        
        Show
        Jan Høydahl added a comment - - edited The commit yesterday (1369196) causes the "Group By" tab of Solritas to stop working, where we try to group-by a string field.: INFO: [collection1] webapp=/solr path=/browse params={group=true&group.field=manu_exact&queryOpts=group} status=500 QTime=44 Aug 4, 2012 11:25:40 PM org.apache.solr.common.SolrException log SEVERE: null:java.lang.IllegalArgumentException: You cannot set an index-time boost on an unindexed field, or one that omits norms at org.apache.lucene.document.Field.setBoost(Field.java:382) at org.apache.solr.schema.FieldType.createField(FieldType.java:277) at org.apache.solr.schema.FieldType.createField(FieldType.java:263) at org.apache.solr.schema.SchemaField.createField(SchemaField.java:101) at org.apache.solr.search.Grouping$CommandField.finish(Grouping.java:790) Perhaps the setBoost() in Solr's FieldType should be conditional depending on field type: FieldType.java line 275 protected IndexableField createField( String name, String val, org.apache.lucene.document.FieldType type, float boost){ Field f = new Field(name, val, type); f.setBoost(boost); return f; }
        Hide
        Chris Male added a comment -

        In my opinion if i have a ShortDocValuesField, it shouldnt have a setReader method

        Agreed. The setABC() methods are extremely confusing and add another level of validation (using your example, we have to validate that you're not setting a Reader on a NumericField).

        Perhaps we can re-arrange this a little. If we genuinely feel there there are use cases out there that we haven't covered with the typed impls and that we don't want to cover, then why not make a GenericField or something, which is abstract and accepts just name, FieldType and maybe an Object value. We can then emphasis in documentation that it is expert only, should only be subclassed in the extremely rare situations that our typed impls are insufficient, and won't be validated so buyer-beware kind of thing.

        We can then gut Field down to a very simple abstract class / interface, and promote our typed impls to being 1st class and the recommended entry points for users.

        Of course if we feel we have provided adequate support through the typed impls, then we can skip straight to the gutting.

        Show
        Chris Male added a comment - In my opinion if i have a ShortDocValuesField, it shouldnt have a setReader method Agreed. The setABC() methods are extremely confusing and add another level of validation (using your example, we have to validate that you're not setting a Reader on a NumericField). Perhaps we can re-arrange this a little. If we genuinely feel there there are use cases out there that we haven't covered with the typed impls and that we don't want to cover, then why not make a GenericField or something, which is abstract and accepts just name, FieldType and maybe an Object value. We can then emphasis in documentation that it is expert only, should only be subclassed in the extremely rare situations that our typed impls are insufficient, and won't be validated so buyer-beware kind of thing. We can then gut Field down to a very simple abstract class / interface, and promote our typed impls to being 1st class and the recommended entry points for users. Of course if we feel we have provided adequate support through the typed impls, then we can skip straight to the gutting.
        Hide
        Robert Muir added a comment -

        Chris: that's a good point.

        The current design seems to be that Field can do "everything" and the others are simply sugar on top.

        Personally I think this is confusing and error-prone.
        thats why i wrote such a huge test, but its silly.

        In my opinion if i have a ShortDocValuesField, it shouldnt have a setReader method

        Show
        Robert Muir added a comment - Chris: that's a good point. The current design seems to be that Field can do "everything" and the others are simply sugar on top. Personally I think this is confusing and error-prone. thats why i wrote such a huge test, but its silly. In my opinion if i have a ShortDocValuesField, it shouldnt have a setReader method
        Hide
        Chris Male added a comment -

        With all the various typed XYZField implementations we have now, what do we see as the role of Field? Is it just serving as a parent class to the implementations or do we expect users will be using it too?

        Show
        Chris Male added a comment - With all the various typed XYZField implementations we have now, what do we see as the role of Field? Is it just serving as a parent class to the implementations or do we expect users will be using it too?
        Hide
        Robert Muir added a comment -

        I committed this patch. But we should still look at Field+FieldType, it
        looks like its possible for all kinds of wierd things to happen.

        The fact that Field allows this ctor:

          protected Field(String name, FieldType type)
        

        means it really cannot check against anything.

        Show
        Robert Muir added a comment - I committed this patch. But we should still look at Field+FieldType, it looks like its possible for all kinds of wierd things to happen. The fact that Field allows this ctor: protected Field(String name, FieldType type) means it really cannot check against anything.
        rmuir committed 1369199 (22 files)
        Reviews: none

        LUCENE-3616: throw exception on some illegal field configurations (applying boost to omitNorms or unindexed field), fix TextField(Reader) ctor as it cannot Store

        Lucene branch_4x
        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Robert Muir made changes -
        Attachment LUCENE-3616.patch [ 12539089 ]
        Hide
        Robert Muir added a comment -

        heres some tests and 2 improvements for the pre-supplied types (I didnt mess with custom Field/FieldType, thats more complicated)

        • TextField(Reader, Field.Store) is silly because superclass (Field) will only throw exception if Store = true for a Reader. so I removed the store argument
        • throw exception if boost is set for unindexed or omitNorms fields.
        Show
        Robert Muir added a comment - heres some tests and 2 improvements for the pre-supplied types (I didnt mess with custom Field/FieldType, thats more complicated) TextField(Reader, Field.Store) is silly because superclass (Field) will only throw exception if Store = true for a Reader. so I removed the store argument throw exception if boost is set for unindexed or omitNorms fields.
        Hide
        Michael McCandless added a comment -

        I have a tentative fix for this on the LUCENE-3453 branch...

        Show
        Michael McCandless added a comment - I have a tentative fix for this on the LUCENE-3453 branch...
        Michael McCandless made changes -
        Field Original Value New Value
        Assignee Michael McCandless [ mikemccand ]
        Hide
        Robert Muir added a comment -

        +1, we shouldn't rely upon codec assertions/extra safety, they should get IAE immediately.

        if you had run Test2BTerms with SimpleText you would have gotten NPE (it did not even have
        this low-level check until now, i just added it).

        Show
        Robert Muir added a comment - +1, we shouldn't rely upon codec assertions/extra safety, they should get IAE immediately. if you had run Test2BTerms with SimpleText you would have gotten NPE (it did not even have this low-level check until now, i just added it).
        rmuir committed 1209639 (1 file)
        Grant Ingersoll created issue -

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Grant Ingersoll
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:

              Development