
Key: LUCENE-1217
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Trivial
Assignee: Michael McCandless
Reporter: Eks Dev
Votes: 0
Watchers: 0
Lucene - Java

use isBinary cached variable instead of instanceof in Field

Created: 11/Mar/08 09:23 AM   Updated: 12/Mar/08 04:57 PM
Component/s: Other
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
Lucene-1217-take1.patch (2 kB, 2008-03-11 09:09 PM, Eks Dev; licensed for inclusion in ASF works)
LUCENE-1217.patch (1 kB, 2008-03-11 09:25 AM, Eks Dev; licensed for inclusion in ASF works)
Issue Links:
Blocker: this issue blocks LUCENE-1219

Lucene Fields: Patch Available, New
Resolution Date: 12/Mar/08 10:10 AM


Description
The Field class can hold three types of values.
See AbstractField.java: protected Object fieldsData = null;

Currently, RTTI (instanceof) is mainly used to determine the type of the value stored in a particular instance of Field, but for binary values we have a mix of RTTI and the cached variable "boolean isBinary".

This patch makes consistent use of the cached variable isBinary.

Benefit: a single, consistent way to determine the run-time type in the binary case (which reduces the chance of the cached variable getting out of sync). It should be slightly faster as well.
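
A minimal sketch of the idea behind the patch, assuming a simplified stand-in class. SketchField and its constructors are hypothetical and are not the actual Lucene Field source; only fieldsData and isBinary are names taken from this issue. The type is recorded once, where the value is assigned, and the binary accessor consults the flag instead of calling instanceof.

```java
// Hypothetical, simplified stand-in for Lucene's Field; not the real source.
class SketchField {
    // Holds a String, a Reader, or a byte[] (mirrors AbstractField.fieldsData).
    protected Object fieldsData = null;
    // Cached flag, assigned in the same place as fieldsData.
    protected boolean isBinary = false;

    SketchField(String value) {
        this.fieldsData = value;
        this.isBinary = false;
    }

    SketchField(byte[] value) {
        this.fieldsData = value;
        this.isBinary = true;
    }

    // Before: return fieldsData instanceof byte[] ? (byte[]) fieldsData : null;
    // After: rely on the cached flag only.
    byte[] binaryValue() {
        return isBinary ? (byte[]) fieldsData : null;
    }

    // Non-binary types are still distinguished with instanceof.
    String stringValue() {
        return fieldsData instanceof String ? (String) fieldsData : null;
    }

    boolean isBinary() {
        return isBinary;
    }
}
```

The flag is only trustworthy if every code path that assigns fieldsData also assigns isBinary.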

Thinking aloud:
Would it not make sense to maintain the type with some integer/byte "poor man's enum" (an interface with a couple of constants)?
{code:java}
// Nested inside Field; an interface cannot be declared final.
public static interface Type {
  public static final byte BINARY = 0;
  public static final byte STRING = 1;
  public static final byte READER = 2;
  ....
}
{code}

and use that instead of isBinary + instanceof?



Eks Dev made changes - 11/Mar/08 09:25 AM
Field Original Value New Value
Attachment LUCENE-1217.patch [ 12377589 ]
Michael McCandless made changes - 11/Mar/08 01:37 PM
Assignee Michael McCandless [ mikemccand ]
Michael McCandless added a comment - 11/Mar/08 06:26 PM
Patch looks good. I will commit shortly. Thanks Eks Dev.

>>Would it not make sense to maintain type with some integer/byte "poor man's enum" (Interface with a couple of constants)

Or we could wait until Java 5 (3.0) and use real enums?
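
For comparison, a rough sketch of what the Java 5 enum variant might look like. ValueType and TypedField are hypothetical names used only for illustration; nothing like this is part of the attached patches.

```java
// Hypothetical sketch of the "real enum" alternative; not Lucene code.
enum ValueType { BINARY, STRING, READER }

class TypedField {
    private final Object fieldsData;
    private final ValueType type; // single source of truth for the value type

    TypedField(String value) {
        this.fieldsData = value;
        this.type = ValueType.STRING;
    }

    TypedField(byte[] value) {
        this.fieldsData = value;
        this.type = ValueType.BINARY;
    }

    TypedField(java.io.Reader value) {
        this.fieldsData = value;
        this.type = ValueType.READER;
    }

    ValueType type() {
        return type;
    }

    // Each accessor checks the tag instead of isBinary or instanceof.
    String stringValue() {
        return type == ValueType.STRING ? (String) fieldsData : null;
    }

    byte[] binaryValue() {
        return type == ValueType.BINARY ? (byte[]) fieldsData : null;
    }

    java.io.Reader readerValue() {
        return type == ValueType.READER ? (java.io.Reader) fieldsData : null;
    }
}
```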

Or ... maybe we should have subclasses of Field (TextField, BinaryField,
ReaderField, TokenStreamField) which override the corresponding method
(and the base Field.java would still implement these methods but
return null)? Though this would be a rather large change...
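
A sketch of the subclassing idea, under the assumption that the base class keeps every accessor and returns null by default. BaseField, TextValueField and BinaryValueField are hypothetical names chosen here so as not to suggest any real Lucene classes.

```java
// Hypothetical sketch of per-type subclasses; not an actual Lucene design.
class BaseField {
    private final String name;

    BaseField(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    // The base class still implements the accessors, returning null.
    String stringValue() {
        return null;
    }

    byte[] binaryValue() {
        return null;
    }
}

class TextValueField extends BaseField {
    private final String value;

    TextValueField(String name, String value) {
        super(name);
        this.value = value;
    }

    @Override
    String stringValue() {
        return value;
    }
}

class BinaryValueField extends BaseField {
    private final byte[] value;

    BinaryValueField(String name, byte[] value) {
        super(name);
        this.value = value;
    }

    @Override
    byte[] binaryValue() {
        return value;
    }
}
```

With this layout the type is carried by the class itself, so there is no flag or tag to keep in sync.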


Eks Dev added a comment - 11/Mar/08 08:35 PM
Thanks for looking into it!
Subclassing now, while keeping backwards compatibility, would be clumsy. I was thinking about it but could not find a clean way to do it.

>>Or we could wait until Java 5 (3.0) and use real enums?
Yes, that is the ultimate solution, but my line of thought was that a "poor man's enum" -> Java 5 enum migration would be trivial later... but "do not change working code" kicks in here.


Michael McCandless added a comment - 11/Mar/08 08:48 PM
Actually, I'm seeing a test failure with this:

[junit] Testcase: testLazyFields(org.apache.lucene.index.TestFieldsReader): FAILED
[junit] bytes is null and it shouldn't be
[junit] junit.framework.AssertionFailedError: bytes is null and it shouldn't be
[junit] at org.apache.lucene.index.TestFieldsReader.testLazyFields(TestFieldsReader.java:132)


Eks Dev added a comment - 11/Mar/08 09:08 PM
Hah, this bug just justified this patch.
Sorry, I should have run the tests before... nothing is trivial enough.
The problem was indeed isBinary going out of sync in LazyField; a new patch follows.
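
The thread does not show the LazyField code, so the following is only a guess at the mechanism, written as a hypothetical DriftingLazyField class: if the lazily loaded value is a byte[] but the cached flag was never set, an instanceof check still works while a flag-based check returns null, which would match the "bytes is null" failure above.

```java
// Hypothetical illustration of a stale cached flag; not the real LazyField.
class DriftingLazyField {
    private Object fieldsData;         // filled on first access
    private boolean isBinary = false;  // cached flag, defaults to false
    private final byte[] stored;       // stands in for the bytes in the index

    // syncFlag == false models a constructor that forgets to set the flag.
    DriftingLazyField(byte[] stored, boolean syncFlag) {
        this.stored = stored;
        if (syncFlag) {
            this.isBinary = true;
        }
    }

    private void load() {
        if (fieldsData == null) {
            fieldsData = stored.clone();
        }
    }

    // instanceof-style check: unaffected by the stale flag.
    byte[] binaryValueByInstanceof() {
        load();
        return fieldsData instanceof byte[] ? (byte[]) fieldsData : null;
    }

    // Flag-based check: returns null when the flag was never set.
    byte[] binaryValueByFlag() {
        load();
        return isBinary ? (byte[]) fieldsData : null;
    }
}
```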

Eks Dev added a comment - 11/Mar/08 09:09 PM
New patch; fixes the isBinary status in LazyField.

Eks Dev made changes - 11/Mar/08 09:09 PM
Attachment Lucene-1217-take1.patch [ 12377639 ]
Eks Dev made changes - 11/Mar/08 09:59 PM
Link This issue blocks LUCENE-1219 [ LUCENE-1219 ]
Michael McCandless added a comment - 11/Mar/08 10:49 PM
OK the new patch passes all tests – thanks!

One unrelated thing I noticed: it looks like you can get a binary LazyField and then ask for its stringValue(), and vice versa. I.e., we are failing to check in binaryValue() that the field is in fact binary, even though we know whether it is when we create the LazyField. I'll open a separate issue for this.
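
A sketch of the kind of check described here, assuming the lazy field is told at construction time whether it is binary. GuardedLazyField is a hypothetical class, and returning null for the wrong accessor is just one possible policy; the follow-up issue may have chosen differently.

```java
// Hypothetical sketch of guarded accessors; not the real LazyField.
class GuardedLazyField {
    private final byte[] raw;        // stands in for the stored bytes
    private final boolean isBinary;  // known when the lazy field is created

    GuardedLazyField(byte[] raw, boolean isBinary) {
        this.raw = raw;
        this.isBinary = isBinary;
    }

    // Returns the bytes only for a binary field.
    byte[] binaryValue() {
        return isBinary ? raw : null;
    }

    // Decodes the bytes only for a non-binary (text) field.
    String stringValue() {
        return isBinary
                ? null
                : new String(raw, java.nio.charset.StandardCharsets.UTF_8);
    }
}
```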


Repository Revision Date User Message
ASF #636139 Tue Mar 11 22:53:50 UTC 2008 mikemccand LUCENE-1217: use Field.isBinary instead of 'instanceof'
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
MODIFY /lucene/java/trunk/CHANGES.txt
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/document/Field.java

Michael McCandless made changes - 12/Mar/08 10:10 AM
Lucene Fields [Patch Available, New] [New, Patch Available]
Resolution Fixed [ 1 ]
Status Open [ 1 ] Resolved [ 5 ]
Doug Cutting added a comment - 12/Mar/08 04:57 PM
fix typo that's been bugging me

Doug Cutting made changes - 12/Mar/08 04:57 PM
Description: typo fix only ("Filed class can hold three types of values," changed to "Field class can hold three types of values,"); the rest of the text is identical to the description at the top of this issue.
Lucene Fields [Patch Available, New] [New, Patch Available]
Summary: "use isBinary cached variable instead of instanceof in Filed" changed to "use isBinary cached variable instead of instanceof in Field"