Details
Bug
Status: Closed
Trivial
Resolution: Fixed
2.0.13, 2.0.20
None
None
Description
I've noticed that the following NullPointerException can occur on scanned documents when we check whether the document has a layer with a given name:
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.graphics.optionalcontent.PDOptionalContentProperties.getGroupNames(PDOptionalContentProperties.java:227)
    at org.apache.pdfbox.pdmodel.graphics.optionalcontent.PDOptionalContentProperties.hasGroup(PDOptionalContentProperties.java:245)
Looking into this, it happens because, for these particular documents, the following line returns null, and the result is then dereferenced: https://github.com/apache/pdfbox/blob/da378798e5f2c8a1394725518db478f7ffaf5177/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/optionalcontent/PDOptionalContentProperties.java#L227
It may be that these files are simply generated in an invalid way (unfortunately I cannot provide them, as they contain customer information). Looking at the code, though, other methods such as getGroup(name) call a private method, getOCGs(), that guards against the null case by falling back to a default; see here: https://github.com/apache/pdfbox/blob/da378798e5f2c8a1394725518db478f7ffaf5177/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/optionalcontent/PDOptionalContentProperties.java#L118
So maybe getGroupNames() (which hasGroup() uses) could use getOCGs() as the same safeguard? In the meantime, our workaround was to replace the hasGroup(name) check with getGroup(name) != null.
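To illustrate the difference between the two code paths, here is a minimal, self-contained sketch. It is not the PDFBox source: the class OcgNullGuardSketch, the map that stands in for the underlying dictionary, and the method getGroupNamesUnguarded() are all hypothetical, and only the names getOCGs, getGroupNames and hasGroup are borrowed from the report above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an optional content properties object; not PDFBox code. */
public class OcgNullGuardSketch {
    // Models the underlying dictionary. The "OCGs" entry may be missing entirely.
    private final Map<String, List<String>> dict = new HashMap<>();

    /** Guarded accessor: stores and returns an empty list when the entry is missing. */
    private List<String> getOCGs() {
        return dict.computeIfAbsent("OCGs", key -> new ArrayList<>());
    }

    /** Unguarded variant, mirroring the failure described in the report. */
    public String[] getGroupNamesUnguarded() {
        List<String> ocgs = dict.get("OCGs"); // null when the entry is absent
        return ocgs.toArray(new String[0]);   // throws NullPointerException on null
    }

    /** Guarded variant: goes through getOCGs(), so it never sees null. */
    public String[] getGroupNames() {
        return getOCGs().toArray(new String[0]);
    }

    /** Returns true if a group with the given name exists. */
    public boolean hasGroup(String name) {
        for (String groupName : getGroupNames()) {
            if (groupName.equals(name)) {
                return true;
            }
        }
        return false;
    }

    /** Adds a group name (simplified; a real group is an object, not a string). */
    public void addGroup(String name) {
        getOCGs().add(name);
    }

    public static void main(String[] args) {
        // Guarded path: a missing entry simply means "no such group".
        OcgNullGuardSketch guarded = new OcgNullGuardSketch();
        System.out.println(guarded.hasGroup("Watermark"));

        // Unguarded path on a fresh instance: the missing entry is dereferenced.
        OcgNullGuardSketch unguarded = new OcgNullGuardSketch();
        try {
            unguarded.getGroupNamesUnguarded();
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from unguarded access");
        }
    }
}
```

Until a fix is available, the workaround from the report stays the same on the caller's side: test getGroup(name) != null instead of calling hasGroup(name).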