PDFBox / PDFBOX-1694

Bug in org.apache.pdfbox.io.Ascii85InputStream

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1
    • Fix Version/s: 1.8.3, 2.0.0
    • Component/s: None
    • Labels:
    • Environment: Any

      Description

      The method 'org.apache.pdfbox.io.Ascii85InputStream.read()' has a bug when the final group of encoded characters is shorter than a full 5-character group (that is, the decoded length is not a multiple of 4 bytes).
      Test file: "www.mzweb.com.br/grupobimbo/web/arquivos/Bimbo_Historia_20070409_Esp.pdf".
      On page 0 there is a dictionary "323 0 obj << /Length 1492 /Filter [/Ascii85Decode /FlateDecode]>>".
      The last set of characters to decode is "%f" (0x25, 0x66).
      Ascii85InputStream pads this to "%f~!!" and, in this case, still generates the correct final byte 0x0f.
      However, including the '~' end-of-data character in the padding is a major bug: '~' lies outside the valid '!' to 'u' digit range.
      If the final padding were "%f!!!", the final byte decoded would be 0x0e, which is wrong.
      The correct padding character is 'u', giving "%fuuu" (see http://en.wikipedia.org/wiki/Ascii85).
      This is a quick fix.
      The PDF files on the "Grupo Bimbo" corporate website include many examples using Ascii85Decode/
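
      To make the arithmetic above easy to check, here is a minimal standalone sketch of decoding one padded 5-character group. It is not the PDFBox implementation or the committed fix; the class name Ascii85TailSketch and the method decodeGroup are hypothetical and exist only for illustration. The method deliberately does not validate the digit range, so that the "%f~!!" case from this report can be reproduced.

```java
final class Ascii85TailSketch {

    /**
     * Decodes one 5-character Ascii85 group (already padded by the caller)
     * and returns only the first byteCount bytes of the 4-byte result.
     * Hypothetical helper for illustration, not part of PDFBox.
     */
    static byte[] decodeGroup(String group, int byteCount) {
        if (group.length() != 5) {
            throw new IllegalArgumentException("group must have exactly 5 characters");
        }
        if (byteCount < 1 || byteCount > 4) {
            throw new IllegalArgumentException("byteCount must be between 1 and 4");
        }
        // Interpret the 5 characters as base-85 digits, where '!' is digit 0.
        // No range check on purpose: '~' is accepted so the buggy padding can be shown.
        long value = 0;
        for (int i = 0; i < 5; i++) {
            value = value * 85 + (group.charAt(i) - '!');
        }
        // Emit the most significant bytes first (big-endian).
        byte[] out = new byte[byteCount];
        for (int i = 0; i < byteCount; i++) {
            out[i] = (byte) (value >>> (24 - 8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        // A 2-character tail such as "%f" carries exactly 1 decoded byte.
        for (String padded : new String[] {"%fuuu", "%f!!!", "%f~!!"}) {
            byte b = decodeGroup(padded, 1)[0];
            System.out.println(padded + " -> 0x" + String.format("%02x", b));
        }
    }
}
```

      With these inputs, "%fuuu" and "%f~!!" both yield 0x0f while "%f!!!" yields 0x0e, which matches the values quoted in the description. Padding with 'u' is the documented approach because it sets the missing digits to their maximum, so truncating the result gives back the original bytes.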

        Attachments

        1. test.java
          4 kB
          Tilman Hausherr

              People

              • Assignee: Andreas Lehmkühler (lehmi)
              • Reporter: Peter Costello (peterwcostello)
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  • Estimated: 0.5h
                  • Remaining: 0.5h
                  • Logged: Not Specified