Doğacan Güney added a comment - 23/Nov/06 01:27 PM
A simple patch that writes nulls as empty strings.
Null value is not equivalent to an empty String - perhaps we should simply skip such values.
How about something like this then?
Hi Andrzej, Doğacan,
+1. I think it makes a lot of sense to just not include the null key in the Met container. Doğacan, in the future, when you attach a new version of a patch for a JIRA issue, please indicate the change by renaming the patch. Not a big deal, but good style points. I'll commit this patch shortly.
Cheers,

Erhm, -1 from me. This code checks only whether the first value is null, and then discards all other values (which may be non-null), so we could lose valuable data if only the first value happens to be null ...
I think we should indeed check whether the first value is null, but if it is, then loop over all other values, count the non-nulls, and if the count > 0, write out the <key, <non-null values>> set.

Hi Doğacan,
Looking at your latest patch, I'm not sure that it completely does the right thing. For example, what happens if there are 3 met values for a key k, and one of them is null but the other 2 are not? Specifically, what if the first value is null, but the other 2 are not? In that case, your patch would skip writing all of the values for that key. Wouldn't it be easier to do something like this?
Index: src/java/org/apache/nutch/metadata/Metadata.java
Hi Andrzej,
Yup, you caught the same thing as me. +1 for your solution. I will extend my above patch by writing getNumNonNullValues(values) instead of values.length.
Cheers,

Fix applied and tested in trunk.
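
For readers following along, the approach agreed on above (count the non-null values, skip a key only when that count is zero, and write only the non-null values) can be sketched roughly as follows. This is not the committed patch: the class name NullSafeMetadataWriter, the Map<String, String[]> input, and the on-disk layout (key count, then key, value count, values) are assumptions made for illustration, and writeUTF stands in for whatever string serialization Metadata.java actually uses. Only the helper name getNumNonNullValues comes from the discussion.

```java
import java.io.DataOutput;
import java.io.IOException;
import java.util.Map;

// Hypothetical stand-in for the write logic discussed in this issue.
class NullSafeMetadataWriter {

  /** Counts the entries of values that are not null. */
  static int getNumNonNullValues(String[] values) {
    int count = 0;
    for (String value : values) {
      if (value != null) {
        count++;
      }
    }
    return count;
  }

  /** Writes each key that has at least one non-null value; null values are skipped. */
  static void write(Map<String, String[]> metadata, DataOutput out) throws IOException {
    // Count the keys that will really be written so the header matches the body.
    int keysToWrite = 0;
    for (String[] values : metadata.values()) {
      if (getNumNonNullValues(values) > 0) {
        keysToWrite++;
      }
    }
    out.writeInt(keysToWrite);

    for (Map.Entry<String, String[]> entry : metadata.entrySet()) {
      String[] values = entry.getValue();
      int nonNull = getNumNonNullValues(values);
      if (nonNull == 0) {
        continue; // every value is null: skip the key entirely
      }
      out.writeUTF(entry.getKey());
      out.writeInt(nonNull); // not values.length, which would also count nulls
      for (String value : values) {
        if (value != null) {
          out.writeUTF(value);
        }
      }
    }
  }
}
```

With this shape, a key whose values are {null, "a", "b"} is written with a count of 2 followed by "a" and "b", so a null in the first position no longer causes the remaining values to be dropped.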
Patch applied to trunk: