Details
Description
workaround changes needed for PDGraphicsState constructor as reproduced below:
attached pdf file can be used to reproduce the exception
public PDGraphicsState(PDRectangle page) {
/*
- TB - changes made here are a workaround which creates a default
- GeneralPath assigned to currentClippingPath if the constructor
- argument page is null. Probably a better remedy would be to ensure
- that the page argument is not null or use a dedicated constructor if
- page is null
*/
if (page != null) {
Dimension dimension = page.createDimension();
Rectangle rectangle = new Rectangle(dimension);
currentClippingPath = new GeneralPath(rectangle);
currentClippingPath = new GeneralPath(new Rectangle(page.createDimension()));
if (page.getLowerLeftX() != 0 || page.getLowerLeftY() != 0)
} else
{ currentClippingPath = new GeneralPath(); }}
Also, as a side effect of above workaround, made following change within PDFStreamEngine.processEncodedText:
/*
- TB - needed to make change here, as we encounter here a knock on
- effect of allowing null page arguments through in PDGraphicsState
- constructor which creates a default GeneralPath assigned to
- currentClippingPath. That workaround causes findMediaBox to return
- null, so in that case we assign default values to pageHeight and
- pageWidth here. Everything else seems to work as far as text
- extraction is concerned.
*/
if (page.findMediaBox() != null) { pageHeight = page.findMediaBox().getHeight(); pageWidth = page.findMediaBox().getWidth(); }else
{ pageHeight = 0; pageWidth = 0; }