|
[
Permlink
| « Hide
]
Otis Gospodnetic added a comment - 20/Dec/06 10:33 PM
Committed. I'll also remove the old version of this code (+ its unit test), the one that still lives in contrib/miscellaneous/src/java/org/apache/lucene/misc/ .
The norm-removing functionality was bogus - it simply "normalized the norms" to be 1 for the given field, but did not completely remove norms for a field, and did not flip the omitNorms bit for the given field, so it was never a true NO_NORMS field.
I'll upload a new patch that does this, but it does it only for Lucene 2.0.0 and Lucene 2.1-dev before the new .nrm changes from I was looking at what it would take to make this work with .nrm file as well.
I expected there will be a test that fails currently, but there is none. So I looked into the tests and the implementation and have a few questions: (1) under contrib, FieldNormModifier and LengthNormModifier seem quite similar, right?
(2) TestFieldNormModifier.testFieldWithNoNorm() calls resetNorms() for a field that does not exist. Some work is done by the modifier to collect the term frequencies, and then reader.setNorm is called but it does nothing, because there are no norms. And indeed the test verifies that there are still no norms for this field. Confusing I think. For some reason I assumed that calling resetNorms() for a field that has none, would implicitly set omitNorms to false for that field and compute it - the inverse of killNorms(). Since this is not the case, perhaps resetNorms should throw an exception in this case? (3) I would feel safer about this feature if the test was more strict - something like TestNorms - have several fields, modify some, each in a unique way, remove some others, then at the end verify that all the values of each field norms are exactly as expected. (4) For killNorms to work, you can first revert the index to not use .nrm, and then "kill" as before. The code knows to read .fN files, for both backwards compatibility, and for reading segments created be DocumentWriter. The following steps will do this:
(5) It would have been more efficient to optimize (and remove the .nrm file) once in the application, so perhaps modify the public API to take an array of fields and operate on all? Attached for.nrm.patch was very noisy - so I replaced it with one created with
svn diff -x --ignore-eol-style contrib/miscellaneous It is relative to trunk. A test is added to TestFieldNormModifier.java - testModifiedNormValuesCombinedWithKill - that verifies exactly what are the values of norms after modification. FieldNormModifier modified to handle .nrm file as outlined above. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||