[PDFBOX-5788] ID references change when saving PDFs - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.0.1 PDFBox, 3.0.2 PDFBox
Fix Version/s: None
Component/s: None
Labels: None

Description

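// Imports assumed by the snippet; encodeHexString is presumably
// org.apache.commons.codec.binary.Hex.encodeHexString (static import).
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;

private static void runPDF(String name) throws IOException, NoSuchAlgorithmException {
    // Load once, then save the same in-memory document twice.
    try (PDDocument doc = Loader.loadPDF(new File(name))) {
        File tmpFile = File.createTempFile("tmp", ".pdf");
        doc.save(tmpFile);
        byte[] data = Files.readAllBytes(tmpFile.toPath());
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
        System.out.println(encodeHexString(hash));

        // Second save of the unchanged document; the hash differs on 3.0.1 and 3.0.2.
        File tmpFile2 = File.createTempFile("tmp", ".pdf");
        doc.save(tmpFile2);
        byte[] data2 = Files.readAllBytes(tmpFile2.toPath());
        byte[] hash2 = MessageDigest.getInstance("SHA-256").digest(data2);
        System.out.println(encodeHexString(hash2));
    }
}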

This might be expected behavior, but it makes my testing framework less robust, so I thought I'd report it here. In versions 3.0.1 and 3.0.2, saving the same document a second time does not reuse the reference IDs from the first save: they continue incrementing, so the file written by the second save is not byte-identical to the file written by the first.

In my test case, the output depends on which thread executes first, so the actual result can differ from the expected one between runs.

I have not seen this with 3.0.0 or earlier versions of PDFBox.
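
To see where the two saved files start to diverge, a small byte-level comparison can help. The following is only a sketch using the Java standard library (Java 9+ for `Arrays.mismatch`); the class `SaveDiff` and its methods are hypothetical helpers, not part of PDFBox, and the printed context is only readable where the difference falls in uncompressed PDF syntax rather than inside a compressed stream.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic helper, not part of PDFBox.
final class SaveDiff {

    /** Returns the index of the first differing byte, or -1 if both arrays are identical. */
    static int firstDifference(byte[] first, byte[] second) {
        // Arrays.mismatch returns the shorter length when one array is a prefix of the other.
        return Arrays.mismatch(first, second);
    }

    /**
     * Returns up to radius bytes on each side of offset, decoded as ISO-8859-1
     * so that raw PDF syntax such as "12 0 obj" stays readable.
     */
    static String context(byte[] data, int offset, int radius) {
        int from = Math.max(0, offset - radius);
        int to = Math.min(data.length, offset + radius);
        return new String(data, from, to - from, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: SaveDiff <first.pdf> <second.pdf>");
            return;
        }
        byte[] a = Files.readAllBytes(Paths.get(args[0]));
        byte[] b = Files.readAllBytes(Paths.get(args[1]));
        int offset = firstDifference(a, b);
        if (offset < 0) {
            System.out.println("files are byte-identical");
            return;
        }
        System.out.println("first difference at byte " + offset);
        System.out.println("first save:  " + context(a, offset, 20));
        System.out.println("second save: " + context(b, offset, 20));
    }
}
```

Running it on the two temporary files written by `runPDF` should show whether the first difference is an object number, which would match the behavior described above.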

People

Assignee: Unassigned

Reporter: Daniel Persson

Votes: 0

Watchers: 2

Dates

Created: 19/Mar/24 07:11

Updated: 20/Mar/24 14:50