Details
Description
Parsing Error, Skipping Object
java.io.IOException: expected='endstream' actual='' org.apache.pdfbox.io.PushBackInputStream@38011d45
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:439)
at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:552)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1088)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:74)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
at org.apache.tika.Tika.parseToString(Tika.java:357)
at edu.uci.ics.crawler4j.crawler.BinaryParser.parse(BinaryParser.java:37)
at edu.uci.ics.crawler4j.crawler.WebCrawler.handleBinary(WebCrawler.java:223)
at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:462)
at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:129)
at java.lang.Thread.run(Thread.java:662)
Did not found XRef object at specified startxref position 0
This is the sample URL where I am facing this problem:-
http://www.qualcomm.com/documents/files/rev-b-enhanced-mobile-broadband-for-all.pdf
Any suggestions why is it happening...!! Or its a bug??