Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Resolved
Description
When using the "sort" streaming expression on values computed from a float field (pfloat), I get casting errors. Whether the error occurs depends on which values are calculated from the underlying values in the field.
Example:
Docs: (paste each into "Documents" pane in Solr Admin UI as type:"json")
{"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
{"id": "2", "name":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}
Streaming Expression:
sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as sim, id), by="sim desc")
Response:
{
  "result-set": {
    "docs": [
      {
        "EXCEPTION": "class java.lang.Double cannot be cast to class java.lang.Long (java.lang.Double and java.lang.Long are in module java.base of loader 'bootstrap')",
        "EOF": true,
        "RESPONSE_TIME": 13
      }
    ]
  }
}
This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, there is a line which examines a numeric (BigDecimal) value and - regardless of the type of the field the value originated from - converts it to a Long if it looks like a whole number. This is the code in question from that class:
protected Object normalizeOutputType(Object value) {
  if(null == value){
    return null;
  } else if (value instanceof VectorFunction) {
    return value;
  } else if(value instanceof BigDecimal){
    BigDecimal bd = (BigDecimal)value;
    if(bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0){
      try{
        return bd.longValueExact();
      } catch(ArithmeticException e){
        // value was too big for a long, so use a double which can handle scientific notation
      }
    }
    return bd.doubleValue();
  }
  ... [other type conversions]
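To see the effect in isolation, here is a minimal standalone sketch that copies only the BigDecimal branch quoted above. The class name NormalizeDemo and the method name normalize are made up for illustration and are not part of Solr; the rest of the real method is omitted.

```java
import java.math.BigDecimal;

class NormalizeDemo {
    // Simplified copy of the BigDecimal branch from normalizeOutputType.
    // Hypothetical helper for illustration; not the Solr method itself.
    static Object normalize(BigDecimal bd) {
        if (bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0) {
            try {
                return bd.longValueExact(); // boxed to java.lang.Long
            } catch (ArithmeticException e) {
                // too big for a long, fall through to double
            }
        }
        return bd.doubleValue(); // boxed to java.lang.Double
    }

    public static void main(String[] args) {
        Object whole = normalize(new BigDecimal("1.0"));
        Object fractional = normalize(new BigDecimal("0.88938313"));
        // The two results end up with different runtime types.
        System.out.println(whole.getClass().getName());      // java.lang.Long
        System.out.println(fractional.getClass().getName()); // java.lang.Double
    }
}
```

Two values that start out as the same kind of number leave this branch as a Long and a Double, depending only on whether the value happens to be whole.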
Because of the return bd.longValueExact(); line, the calculated value for "sim" in doc 1 is "Long(1)", whereas the calculated value for "sim" in doc 2 is "Double(0.88938313)". These come back as incompatible data types, even though the source data is all of the same type and should be comparable.
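The two similarity values can be checked by hand outside of Solr. The sketch below uses a plain textbook cosine formula, dot(a, b) / sqrt(|a|^2 * |b|^2); this is an assumption made for illustration and is not necessarily how Solr's cosineSimilarity evaluator computes it internally. CosineCheck is a hypothetical class name.

```java
class CosineCheck {
    // Textbook cosine similarity; assumes both arrays have the same length.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA2 = 0.0, normB2 = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA2 += a[i] * a[i];
            normB2 += b[i] * b[i];
        }
        return dot / Math.sqrt(normA2 * normB2);
    }

    public static void main(String[] args) {
        double[] query = {5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0};
        double[] doc1 = {5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0};
        double[] doc2 = {5.0, 0.0, 4.0, 4.0, 0.0, 1.0, 5.0, 2.0};
        // doc1 is identical to the query vector, so its similarity is a whole number.
        System.out.println(cosine(doc1, query));
        // doc2 differs, so its similarity is fractional (80 / sqrt(93 * 87)).
        System.out.println(cosine(doc2, query));
    }
}
```

Doc 1 matches the query vector exactly, which is why its similarity is a whole number and gets converted to a Long, while doc 2 stays fractional.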
Thus, when the sort streaming expression (and probably others) runs on these calculated values, the sorted list should contain ["0.88938313", "1.0"]. Instead, an exception is thrown because it is trying to compare incompatible data types [Double(0.88938313), Long(1)].
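The failure mode, and one possible way to make the comparison tolerant of mixed numeric types, can be sketched in standalone Java. This is only an illustration under assumptions: MixedNumberSort, naiveCompareFails, and sortDesc are hypothetical names, and comparing by double value is not the fix applied in Solr (it also loses precision for longs beyond 2^53).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class MixedNumberSort {
    // Hypothetical comparator: treats every value as a Number and compares by double.
    static final Comparator<Object> BY_DOUBLE =
        Comparator.comparingDouble(o -> ((Number) o).doubleValue());

    // Mimics a direct Comparable comparison between boxed values of different types.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean naiveCompareFails(Object left, Object right) {
        try {
            ((Comparable) left).compareTo(right);
            return false;
        } catch (ClassCastException e) {
            return true; // e.g. Double cannot be cast to Long
        }
    }

    // Sorts a copy of the input in descending order using the numeric comparator.
    static List<Object> sortDesc(List<Object> values) {
        List<Object> copy = new ArrayList<>(values);
        copy.sort(BY_DOUBLE.reversed());
        return copy;
    }

    public static void main(String[] args) {
        List<Object> sims = new ArrayList<>();
        sims.add(0.88938313d); // Double, as produced for doc 2
        sims.add(1L);          // Long, as produced for doc 1
        System.out.println(naiveCompareFails(sims.get(1), sims.get(0)));
        System.out.println(sortDesc(sims));
    }
}
```

The direct comparison throws for a Long and a Double, while a comparator that widens both sides to a common numeric type orders them as expected.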
This bug is occurring on master currently, but has probably existed in the codebase since at least August 2017.