[PDFBOX-1607] StringIndexOutOfBoundsException in PDFParser

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.1
    • Fix Version/s: 1.8.3, 2.0.0
    • Component/s: Parsing
    • Labels:
      None
    • Environment:
      Windows 7, JRE 1.7.0_15-b03

      Description

      I have a few test files that parse fine in PDFBox 1.7.1 but fail in 1.8.1 with:

      java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
      at java.lang.AbstractStringBuilder.deleteCharAt(AbstractStringBuilder.java:762)
      at java.lang.StringBuilder.deleteCharAt(StringBuilder.java:258)
      at org.apache.pdfbox.pdfparser.BaseParser.parseCOSHexString(BaseParser.java:1000)
      at org.apache.pdfbox.pdfparser.BaseParser.parseCOSString(BaseParser.java:808)
      at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1241)
      at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:558)
      at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:188)
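
      The top frames show `StringBuilder.deleteCharAt` rejecting index 2047, which means the index passed in was not smaller than the buffer length at that moment. The following is a minimal sketch of that general failure mode and of a bounds-checked alternative. It is not PDFBox's code and not the attached patch; the class `DeleteCharAtSketch` and both helper methods are hypothetical names used only for illustration.

```java
public class DeleteCharAtSketch {

    // Hypothetical helper: remove the last character only when the buffer
    // is non-empty, so the index passed to deleteCharAt is always valid.
    static void dropLastIfPresent(StringBuilder sb) {
        if (sb.length() > 0) {
            sb.deleteCharAt(sb.length() - 1);
        }
    }

    // Hypothetical helper: reports whether deleteCharAt(index) throws
    // StringIndexOutOfBoundsException for the given buffer.
    static boolean deleteThrows(StringBuilder sb, int index) {
        try {
            sb.deleteCharAt(index);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("4A6");

        // Valid indices are 0..length-1, so index == length is out of range.
        System.out.println(deleteThrows(sb, sb.length())); // prints "true"

        // The guarded version removes the dangling last character safely.
        dropLastIfPresent(sb);
        System.out.println(sb); // prints "4A"
    }
}
```

      Checking the length before deleting is only one way to avoid the exception; whether it matches the actual fix depends on how the parser computes the index.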

        Attachments

        1. pdf-govdocs-036902.pdf
          123 kB
          Alex Alishevskikh
        2. pdf-govdocs-107566.pdf
          80 kB
          Alex Alishevskikh
        3. pdfbox-1607-fix.patch
          1 kB
          Arjohn Kampman

              People

              • Assignee: Andreas Lehmkühler (lehmi)
              • Reporter: Alex Alishevskikh (alexeya)
              • Votes: 0
              • Watchers: 3