Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
The problem can be reproduced with the following commands and the example configuration shipped with Solr:
cd exampledocs
curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=true&literal.content_type=mytype"
The added document contains both the literal value and the value extracted by Tika for content_type:
http://localhost:8983/solr/collection1/select?q=id%3Amyid&wt=xml&indent=true
<arr name="content_type"> <str>mytype</str> <str>application/xml</str> </arr>
If the corresponding field is not multi-valued, the request raises an org.apache.solr.common.SolrException: "ERROR: multiple values encountered for non multiValued field content_type: ...".
While debugging the code (Solr 4.4.0), I found that the "lowernames" parameter is not taken into account in several places in org.apache.solr.handler.extraction.SolrContentHandler that look like this:
if (literalsOverride && literalFieldNames.contains(name)) continue;
The same problem occurs with the following command, which maps the field explicitly via fmap instead of lowernames (though whether that request is correct is open to discussion):
curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=false&fmap.Content-Type=content_type&literal.content_type=mytype"
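The mismatch described above can be illustrated with a small standalone sketch. It is not the actual SolrContentHandler code: the class LiteralOverrideSketch and its helpers are hypothetical, and the name resolution (lowercase the name and replace non-alphanumeric characters with underscores when lowernames is set, then apply any fmap mapping) is an assumption about how the handler derives the target field. The sketch only shows why a check against the raw Tika metadata name ("Content-Type") misses a literal registered as "content_type", while a check against the resolved name would catch it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LiteralOverrideSketch {

    // Assumed approximation of lowernames: lowercase letters and digits,
    // every other character becomes an underscore.
    static String lowerName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            sb.append(Character.isLetterOrDigit(c) ? Character.toLowerCase(c) : '_');
        }
        return sb.toString();
    }

    // Hypothetical helper: resolve the field a metadata name ends up in,
    // assuming lowernames is applied first and fmap afterwards.
    static String targetField(String name, boolean lowernames, Map<String, String> fmap) {
        String mapped = lowernames ? lowerName(name) : name;
        return fmap.getOrDefault(mapped, mapped);
    }

    // Mirrors the check quoted in the report: compares the raw metadata name.
    static boolean skippedByRawCheck(String name, boolean literalsOverride,
                                     Set<String> literalFieldNames) {
        return literalsOverride && literalFieldNames.contains(name);
    }

    // Possible alternative (an assumption, not a committed fix):
    // compare the resolved target field name instead.
    static boolean skippedByResolvedCheck(String name, boolean literalsOverride,
                                          Set<String> literalFieldNames,
                                          boolean lowernames, Map<String, String> fmap) {
        return literalsOverride
                && literalFieldNames.contains(targetField(name, lowernames, fmap));
    }

    public static void main(String[] args) {
        Set<String> literals = new HashSet<>();
        literals.add("id");
        literals.add("content_type");

        // Case 1: lowernames=true, no fmap.
        Map<String, String> noMap = new HashMap<>();
        System.out.println(skippedByRawCheck("Content-Type", true, literals));
        System.out.println(skippedByResolvedCheck("Content-Type", true, literals, true, noMap));

        // Case 2: lowernames=false, fmap.Content-Type=content_type.
        Map<String, String> fmap = new HashMap<>();
        fmap.put("Content-Type", "content_type");
        System.out.println(skippedByRawCheck("Content-Type", true, literals));
        System.out.println(skippedByResolvedCheck("Content-Type", true, literals, false, fmap));
    }
}
```

In both cases the raw check does not skip the Tika value, so it is added next to the literal, which matches the duplicate values seen in the query result. The resolved check would skip it in both cases.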
Attachments
Issue Links
- breaks: SOLR-1856 In Solr Cell, literals should override Tika-parsed values (Closed)