The way PDFBox handles the page tree needs to be rewritten, preferably from scratch. Currently the document catalog returns the raw objects from the page tree, each wrapped in either a PDPage or a PDPageNode.
We need to abstract over the page tree and get rid of PDPageNode. The new abstraction should expose only methods that add and remove PDPage objects; the existing low-level access to the page tree is not needed at the PD level.
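To make the proposal concrete, here is a rough sketch of what a flat view over the page tree could look like. It is not PDFBox code: RawNode and FlatPageTree are hypothetical names invented for illustration, and RawNode merely stands in for a /Pages or /Page dictionary. The point is only that callers see leaf pages in document order and never touch intermediate nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a raw page tree dictionary (not a PDFBox class).
final class RawNode {
    final boolean isPage;                 // true for /Page leaves, false for /Pages nodes
    final String label;                   // illustrative only
    RawNode parent;                       // null for the root or a detached node
    final List<RawNode> kids = new ArrayList<>();

    RawNode(boolean isPage, String label) {
        this.isPage = isPage;
        this.label = label;
    }

    RawNode addKid(RawNode kid) {
        kid.parent = this;
        kids.add(kid);
        return this;
    }
}

// Hypothetical flat view: only leaf pages are visible to callers.
final class FlatPageTree {
    private final RawNode root;

    FlatPageTree(RawNode root) {
        this.root = root;
    }

    // Depth-first traversal yields the pages in document order.
    List<RawNode> pages() {
        List<RawNode> out = new ArrayList<>();
        collect(root, out);
        return Collections.unmodifiableList(out);
    }

    private static void collect(RawNode node, List<RawNode> out) {
        if (node.isPage) {
            out.add(node);
            return;
        }
        for (RawNode kid : node.kids) {
            collect(kid, out);
        }
    }

    int getCount() {
        return pages().size();
    }

    // Appends a page directly under the root; intermediate nodes are rejected.
    void add(RawNode page) {
        if (!page.isPage) {
            throw new IllegalArgumentException("only pages can be added");
        }
        root.addKid(page);
    }

    // Detaches a page from whichever node currently holds it.
    // For brevity this sketch does not verify that the page belongs to this
    // tree; a real implementation would also have to update /Count on the
    // ancestors.
    boolean remove(RawNode page) {
        if (!page.isPage || page.parent == null) {
            return false;
        }
        boolean removed = page.parent.kids.remove(page);
        if (removed) {
            page.parent = null;
        }
        return removed;
    }
}
```

With something like this in place, a page nested under an intermediate node and a page directly under the root would both show up in pages(), and removing either would not require the caller to know where it sits in the tree.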
Inheritance of page properties such as crop box, resources, and rotation should be reimplemented on top of whatever new page tree abstraction we invent. We can then finally remove the old, broken methods that did not walk up the inheritance tree when retrieving these values.
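The inheritance lookup itself is simple, and a sketch may help pin down the expected behaviour. Again, this is not PDFBox code: DictNode and InheritedAttributes are hypothetical names, and a plain map stands in for a COS dictionary. The lookup starts at the page and walks the parent chain until it finds the key; for rotation, the sketch falls back to 0 when no ancestor defines /Rotate, which is the default in the PDF specification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a page or page tree node dictionary.
final class DictNode {
    final DictNode parent;                          // null for the root
    final Map<String, Object> entries = new HashMap<>();

    DictNode(DictNode parent) {
        this.parent = parent;
    }

    DictNode put(String key, Object value) {
        entries.put(key, value);
        return this;
    }
}

// Hypothetical helper that resolves inheritable page attributes.
final class InheritedAttributes {
    private InheritedAttributes() {
    }

    // Returns the nearest value for key, checking the page first and then
    // each ancestor in turn.
    static Optional<Object> lookup(DictNode page, String key) {
        for (DictNode node = page; node != null; node = node.parent) {
            Object value = node.entries.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    // Rotation in degrees; 0 if neither the page nor any ancestor sets it.
    // Assumes the stored value is an Integer.
    static int rotation(DictNode page) {
        return lookup(page, "Rotate").map(v -> (Integer) v).orElse(0);
    }
}
```

A value set on the page itself wins over one set on an ancestor, so a page with its own /Rotate of 180 under a root with /Rotate 90 would resolve to 180, while a sibling without the entry would resolve to 90.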