Description
I have a compressed PDF from which I extract pages (each page becomes an individual PDF file). The extracted pages are clipped incorrectly (text is cut off), whereas the original PDF is not clipped. I traced this to a missing MediaBox attribute in the extracted pages; in the original file the attribute is present on every page. With the same file in uncompressed form, the extracted pages are not cut off and the MediaBox attribute is present.
The main code (without initializations and checks) used to load and extract pages is the following:
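// Scratch file handed to the non-sequential parser
File temp = new File("e:/temp.tmp");
// org.apache.pdfbox.io.RandomAccessFile, not java.io.RandomAccessFile
RandomAccessFile rand = new RandomAccessFile(temp, "rw");
PDDocument doc = PDDocument.loadNonSeq(file, rand);
// Pick the page to extract and import it into a new, empty document
PDPage page = (PDPage) doc.getPrintable(pageIndex);
PDDocument newDoc = new PDDocument();
newDoc.importPage(page);
// Clean up: close both documents, then release and delete the scratch file
newDoc.close();
doc.close();
rand.close();
temp.delete();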
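
As a rough way to check whether a saved, extracted page file carries the attribute, the raw bytes can be scanned for the `/MediaBox` key. This is only a sketch using the plain JDK: the class name `MediaBoxCheck` and the helper `containsMediaBox` are made up for illustration and are not part of PDFBox. It also assumes the page dictionary is stored as plain text in the file; if the dictionaries sit inside compressed object streams, the key will not be visible this way. A hit anywhere in the file counts, so the result is only a useful indicator for single-page files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MediaBoxCheck {
    // Hypothetical helper: true if the raw bytes contain the literal key "/MediaBox".
    static boolean containsMediaBox(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is not mangled.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/MediaBox");
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of one PDF file to inspect.
        for (String name : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(name));
            String result = containsMediaBox(bytes) ? "found" : "not found";
            System.out.println(name + ": /MediaBox " + result);
        }
    }
}
```

Running it against one page extracted from the compressed file and the same page extracted from the uncompressed file should make the difference described above visible, provided the assumption about uncompressed page dictionaries holds.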