PDFBox / PDFBOX-361

NullPointerException in PDPageNode.getAllKids

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0-incubator
    • Component/s: Parsing
    • Labels:
      None

      Description

      [Issue from SourceForge]
      http://sourceforge.net/tracker/index.php?func=detail&aid=2008371&group_id=78314&atid=552832

      The parser cannot seem to find the Pages object in files created with
      Acrobat Pro 9. A sample file is attached.

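      import java.util.ArrayList;
      import org.pdfbox.pdmodel.PDDocument;
      import org.pdfbox.pdmodel.PDPage;
      import org.pdfbox.pdmodel.PDPageNode;

      public class Repro { // class name is a placeholder
          public static void main(String[] argv) throws Exception {
              String name = "./test.pdf";
              PDDocument doc = PDDocument.load(name);
              PDPageNode root = doc.getDocumentCatalog().getPages();
              ArrayList<PDPage> pages = new ArrayList<PDPage>();
              root.getAllKids(pages); // the NullPointerException is thrown here
              System.out.println("pages.size() == " + pages.size());
              doc.close(); // close only after the page tree has been read
          }
      }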

      Exception in thread "main" java.lang.NullPointerException
      at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
      at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)

      http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&file_id=283367&aid=2008371

      [Comment on SourceForge]
      Date: 2008-07-02 00:57
      Sender: foundart
      Logged In: YES
      user_id=1693709
      Originator: YES

      This happens with the latest code from CVS and also in older versions.

      [Comment on SourceForge]
      Date: 2008-07-14 17:25
      Sender: orthello
      Logged In: YES
      user_id=853566
      Originator: NO

      We are experiencing the same problem. The offending PDF is available if
      any of you need it (jwilson@nmcourt.fed.us). It looks like PDFBox does not
      support some new feature introduced in Acrobat 9.

      [Comment on SourceForge]
      Date: 2008-07-14 23:20
      Sender: foundart
      Logged In: YES
      user_id=1693709
      Originator: YES

      In Acrobat 8, the default was to generate PDFs following version 1.4 of
      the PDF specification. In Acrobat 9, the default is to generate PDFs
      following version 1.5 of the PDF specification. PDF 1.5 has objects known
      as cross-reference streams, and it turns out that PDFBox does not parse
      them correctly.
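
The cross-reference streams mentioned above replace the classic `xref` table in PDF 1.5 files. Below is a minimal Java sketch of how the rows of such a stream are laid out. It assumes the stream bytes have already been decompressed and that the three `/W` field widths have been read from the stream dictionary. The class `XrefStreamSketch`, its `Entry` type, and the sample bytes are illustrative only; they are not part of PDFBox and are not the fix that was applied for this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: decode the rows of an already-decompressed PDF 1.5 cross-reference stream. */
public class XrefStreamSketch {

    /** One decoded row. Type 0 = free, 1 = in use (uncompressed), 2 = inside an object stream. */
    public static final class Entry {
        public final int type;
        public final long field2; // type 1: byte offset; type 2: object stream number
        public final long field3; // type 1: generation; type 2: index within the object stream

        Entry(int type, long field2, long field3) {
            this.type = type;
            this.field2 = field2;
            this.field3 = field3;
        }
    }

    /** Reads a big-endian unsigned integer of the given width; a width of 0 yields 0. */
    private static long readField(byte[] data, int pos, int width) {
        long value = 0;
        for (int i = 0; i < width; i++) {
            value = (value << 8) | (data[pos + i] & 0xFF);
        }
        return value;
    }

    /**
     * @param data   decoded stream bytes (after any /Filter has been applied)
     * @param widths the three values of the /W array
     */
    public static List<Entry> decode(byte[] data, int[] widths) {
        if (widths.length != 3) {
            throw new IllegalArgumentException("/W must have exactly three entries");
        }
        int rowSize = widths[0] + widths[1] + widths[2];
        if (rowSize == 0 || data.length % rowSize != 0) {
            throw new IllegalArgumentException("data length does not match /W");
        }
        List<Entry> entries = new ArrayList<Entry>();
        for (int pos = 0; pos < data.length; pos += rowSize) {
            // A zero-width type field means every row is treated as type 1.
            int type = widths[0] == 0 ? 1 : (int) readField(data, pos, widths[0]);
            long f2 = readField(data, pos + widths[0], widths[1]);
            long f3 = readField(data, pos + widths[0] + widths[1], widths[2]);
            entries.add(new Entry(type, f2, f3));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Three hand-made rows using /W [1 2 1]; the values are made up for illustration.
        byte[] data = {
            0, 0, 0, (byte) 0xFF, // free entry
            1, 0x01, 0x2C, 0,     // in use at byte offset 300, generation 0
            2, 0, 0x05, 3         // stored in object stream 5 at index 3
        };
        for (Entry e : decode(data, new int[] {1, 2, 1})) {
            System.out.println(e.type + " " + e.field2 + " " + e.field3);
        }
    }
}
```

A parser that only understands the classic `xref` table never sees the type 2 rows. Objects stored inside object streams, which can include the Pages node, then cannot be located, which is consistent with the symptom reported here.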

        Attachments

        1. Long_9.pdf (2.03 MB, James Wilson)
        2. Long_9.pdf.txt (48 kB, Justin LeFebvre)
        3. Long_9.pdf-sorted.txt (48 kB, Justin LeFebvre)
        4. PDFParser.diff (14 kB, Justin LeFebvre)
        5. PDFParser.java (20 kB, James Wilson)

          People

          • Assignee: Unassigned
          • Reporter: Jukka Zitting (jukkaz)
          • Votes: 6
          • Watchers: 6