  1. PDFBox
  2. PDFBOX-1086

Error when decoding CCITT compressed data that contains EOLs, fill bits etc.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: Parsing

      Description

      The TIFFFaxDecoder class (originally from JAI, via XML Graphics Commons) does not handle cases such as EOL codes between lines or before the first line. However, the PDF CCITTFaxDecode filter must accept many different variants of the encoding. TIFF apparently encodes CCITT data in a relatively restricted way, so TIFFFaxDecoder was not written to be as flexible as PDFBox needs. Ideally, PDFBox should handle anything that gets thrown at it.

      It appears that retrofitting TIFFFaxDecoder with the necessary flexibility would be rather difficult, so new decoders for T.4 and T.6 should probably be written.
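
      To illustrate the kind of tolerance described above, here is a minimal sketch of skipping optional EOL codes and fill bits before decoding a line. It is not the decoder that was written for this issue. The class EolSkipper and its methods are hypothetical names, not PDFBox API. The sketch assumes MSB-first bit order and relies on the T.4 convention that an EOL is at least eleven 0 bits followed by a single 1 bit, with any extra leading zeros being fill bits. It ignores the tag bit that follows an EOL in two-dimensional T.4 data.

```java
/** Hypothetical helper, not part of PDFBox: skips fill bits and EOL codes. */
final class EolSkipper {
    private final byte[] data;
    private int bitPos; // absolute bit index, MSB-first within each byte (assumption)

    EolSkipper(byte[] data) {
        this.data = data;
    }

    /** Current read position in bits. */
    int position() {
        return bitPos;
    }

    private int bitAt(int pos) {
        return (data[pos >> 3] >> (7 - (pos & 7))) & 1;
    }

    /**
     * Consumes every EOL found at the current position, each optionally
     * preceded by fill bits. Returns the number of EOLs consumed. If no EOL
     * is present, the position is left unchanged so that normal run-length
     * decoding can continue from the same bit.
     */
    int skipEols() {
        int totalBits = data.length * 8;
        int count = 0;
        while (true) {
            int pos = bitPos;
            int zeros = 0;
            // Count the run of zero bits starting at the current position.
            while (pos < totalBits && bitAt(pos) == 0) {
                zeros++;
                pos++;
            }
            // EOL: at least 11 zeros terminated by a 1; extra zeros are fill bits.
            if (pos < totalBits && zeros >= 11) {
                bitPos = pos + 1;
                count++;
            } else {
                return count;
            }
        }
    }
}
```

      A tolerant decoder could call skipEols() before each line and simply proceed when it returns 0, instead of requiring EOLs to be either always present or always absent.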

            People

            • Assignee: Jeremias Maerki (jeremias@apache.org)
            • Reporter: Jeremias Maerki (jeremias@apache.org)