1. Solr

contrib - Solr Cell (Tika extraction)



Extract content from rich documents using Tika

Issues: Unresolved

Type Key Summary Due Date
Wish SOLR-1605 ExtractingRequestHandler does not embed original document
Improvement SOLR-1645 Add human content-type
Bug SOLR-1847 Solrj doesn't know if PDF was actually parsed by Tika

Issues: Updated recently

Type Key Summary Updated
New Feature SOLR-7445 Solr Cell field default value parameter
Bug SOLR-7430 Encrypted pptx/xlsx causes a ClassNotFoundException
Improvement SOLR-7027 ExtractingRequestHandler indiscriminately dumps all source HTML attributes into the catch-all field when captureAttr=false; it should be more selective, e.g. only href, title, alt, and similar attributes

Versions: Unreleased

Name Release date
Unreleased 4.10.5  
Unreleased Trunk  
Unreleased 5.0.1  
Unreleased 5.2  
Unreleased 5.3  
Unreleased 6  