[SOLR-1786] Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()] fixed in PDFbox 1.0? - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.5
Fix Version/s: 3.1
Component/s: contrib - Solr Cell (Tika extraction)
Labels:
- PDFbox
Environment:

Ubuntu 9.10, 32-bit

Description

I tried to index several thousand PDF documents but could not finish because Solr fell into an endless loop on some of them, for instance: http://cdsweb.cern.ch/record/702585/files/sl-note-2000-019.pdf (the PDF itself seems OK).
Can Solr start using PDFBox 1.0?

People

Assignee: Unassigned

Reporter: Jan Iversen

Votes: 0

Watchers: 3

Dates

Created: 22/Feb/10 09:12

Updated: 10/May/13 10:40

Resolved: 07/Jun/12 18:11