PDFBox / PDFBOX-3769

Cannot read JBIG2 image when JBIG2-Image-Decoder is in path

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.5
    • Fix Version/s: 2.0.6, 3.0.0 PDFBox
    • Component/s: None
    • Environment: Windows 10

      Description

      I understand from the "Dependencies" page (https://pdfbox.apache.org/2.0/dependencies.html) that "JBIG2 ImageIO" and "JBIG2-Image-Decoder" should be interchangeable but I get the following error message when using JBIG2-Image-Decoder:

      GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed

      Command line is:

      java -cp "JBIG2-Image-Decoder.jar;bcmail-jdk15on-1.54.jar;bcpkix-jdk15on-1.54.jar;bcprov-jdk15on-1.54.jar;commons-logging-1.2.jar;diffutils-1.3.0.jar;fontbox-2.0.5.jar;hamcrest-core-1.3.jar;jai-imageio-core-1.3.1.jar;jai-imageio-jpeg2000-1.3.0.jar;junit-4.12.jar;JBIG2-Image-Decoder.jar;pdfbox-app-2.0.5.jar" org.apache.pdfbox.tools.PDFToImage jbig2.pdf

      If I replace JBIG2-Image-Decoder.jar with levigo-jbig2-imageio-1.6.5.jar, the conversion is successful:

      java -cp "levigo-jbig2-imageio-1.6.5.jar;bcmail-jdk15on-1.54.jar;bcpkix-jdk15on-1.54.jar;bcprov-jdk15on-1.54.jar;commons-logging-1.2.jar;diffutils-1.3.0.jar;fontbox-2.0.5.jar;hamcrest-core-1.3.jar;jai-imageio-core-1.3.1.jar;jai-imageio-jpeg2000-1.3.0.jar;junit-4.12.jar;JBIG2-Image-Decoder.jar;pdfbox-app-2.0.5.jar" org.apache.pdfbox.tools.PDFToImage jbig2.pdf

      Attachments

      1. jbig2.jar (85 kB, Tilman Hausherr)
      2. jbig2.pdf (1 kB, Esteban Nicolas Ruiz)
      3. JBIG2-Image-Decoder.jar (250 kB, Esteban Nicolas Ruiz)

          Activity

          tilman Tilman Hausherr added a comment - - edited

          Where did you get the executable of the non-levigo decoder? Did you build the project yourself?

          Please build a project with this code:

          System.out.println(Arrays.toString(ImageIO.getReaderFormatNames()));
          

          and attach only one or the other JBIG2 decoder, or both. What is the output you get each time?
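The one-line check above can be wrapped in a small standalone class. This is only a sketch: the class name `ListImageReaders` and the helper `hasReaderFor` are made up for illustration, and only standard `javax.imageio.ImageIO` calls are used.

```java
import java.util.Arrays;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of PDFBox.
public class ListImageReaders {

    /** Returns true if at least one registered ImageIO reader claims the given format name. */
    static boolean hasReaderFor(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Pick up any ImageIO plugins found on the class path.
        ImageIO.scanForPlugins();

        // The check requested in the comment above.
        System.out.println(Arrays.toString(ImageIO.getReaderFormatNames()));

        // Convenience: state directly whether a JBIG2 reader was registered.
        System.out.println("JBIG2 reader available: " + hasReaderFor("JBIG2"));
    }
}
```

Run it once per decoder, for example with `java -cp "levigo-jbig2-imageio-1.6.5.jar;." ListImageReaders`, and compare the printed lists.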

          By the way, your command lines don't need the bc (Bouncy Castle), commons-logging, diffutils, fontbox and hamcrest jars. They are either needed for testing only or already contained in the pdfbox-app jar.

          eruiz0 Esteban Nicolas Ruiz added a comment -

          My compiled JBIG2-Image-Decoder (sources from https://github.com/Borisvl/JBIG2-Image-Decoder)

          eruiz0 Esteban Nicolas Ruiz added a comment -

          When attaching levigo-jbig2-imageio-1.6.5.jar (and also when attaching both):
          [JPG, jpg, bmp, BMP, gif, GIF, WBMP, png, PNG, JPEG, JBIG2, jpeg, wbmp, jbig2]

          When attaching only JBIG2-Image-Decoder.jar:
          [JPG, jpg, bmp, BMP, gif, GIF, WBMP, png, PNG, jpeg, wbmp, JPEG]

          For JBIG2-Image-Decoder.jar I downloaded the zipped sources from https://github.com/Borisvl/JBIG2-Image-Decoder (where Dependencies.html points), imported them into NetBeans and compiled them. Maybe I'm missing something?

          I have just attached my compiled JBIG2-Image-Decoder.jar.

          tilman Tilman Hausherr added a comment -

          I was able to make it work:

          • Open the jar with a zip utility, e.g. 7-Zip.
          • Create the directory "services" inside "META-INF" and add a file named "javax.imageio.spi.ImageReaderSpi" whose content is the single line "org.jpedal.jbig2.jai.JBIG2ImageReaderSpi".
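The two manual steps above can also be done programmatically. The following is a sketch, not an official tool: the class name `AddImageReaderSpiEntry` and its method are hypothetical, and it relies on the JDK's built-in zip file system provider. It modifies the jar in place, so it is safer to run it on a copy.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;

// Hypothetical helper class, not part of PDFBox or JBIG2-Image-Decoder.
public class AddImageReaderSpiEntry {

    /** Adds META-INF/services/javax.imageio.spi.ImageReaderSpi naming the given class. */
    static void addServiceEntry(Path jar, String spiClassName) throws IOException {
        // Open the existing jar as a file system via the "jar:" URI scheme.
        URI uri = URI.create("jar:" + jar.toUri());
        try (FileSystem zipFs = FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap())) {
            Path services = zipFs.getPath("META-INF", "services");
            Files.createDirectories(services);
            // The file name is the SPI interface; the content is the implementing class.
            Files.write(services.resolve("javax.imageio.spi.ImageReaderSpi"),
                    (spiClassName + "\n").getBytes(StandardCharsets.UTF_8));
        } // closing the file system writes the change back to the jar
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to (a copy of) the locally built JBIG2-Image-Decoder.jar
        addServiceEntry(Paths.get(args[0]), "org.jpedal.jbig2.jai.JBIG2ImageReaderSpi");
    }
}
```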

          I don't know why this is so; I just looked at some jar files.
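A likely explanation is that ImageIO finds plugins through Java's service-provider mechanism, which looks for files under META-INF/services named after the SPI class. A jar without that file is never registered, even if the reader classes are present. The sketch below checks whether a given jar declares an ImageReaderSpi; the class name `InspectSpiEntry` and its method are hypothetical, and `readAllBytes` assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper class for inspecting ImageIO plugin jars.
public class InspectSpiEntry {

    static final String ENTRY = "META-INF/services/javax.imageio.spi.ImageReaderSpi";

    /** Returns the trimmed content of the service file, or null if the jar has none. */
    static String readerSpiDeclaration(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(ENTRY);
            if (entry == null) {
                return null; // ImageIO will not discover a reader in this jar
            }
            try (InputStream in = jar.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8).trim();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more jar paths, e.g. the levigo jar and the locally built jar.
        for (String jarPath : args) {
            String declaration = readerSpiDeclaration(jarPath);
            System.out.println(jarPath + ": "
                    + (declaration == null ? "no ImageReaderSpi entry" : declaration));
        }
    }
}
```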

          I was able to display one file, but I don't know whether it works with all files. Years ago we observed that only the levigo plugin worked.

          tilman Tilman Hausherr added a comment -

          The attached jbig2.jar is the library that works. I don't know why the sizes are different.

          tilman Tilman Hausherr added a comment -

          Please give some feedback on which library worked best for you.

          eruiz0 Esteban Nicolas Ruiz added a comment -

          Great! I have tried both the provided jar and my own jar modified as suggested, and it works fine in both cases.

          The difference in the size of the jars is mostly due to some sample files in the res folder (org\jpedal\jbig2\examples\viewer\res) that are not included in jbig2.jar.

          I have taken some performance measurements and compared the output for some sample files (three small ones totalling 8 pages, and a big one with 982 pages):

          Output files are exactly the same regardless of the jar used (i.e. my own compiled and modified jar, levigo-jbig2-imageio-1.6.5.jar and the provided jbig2.jar).

          Performance seems to be better when using JBIG2-Image-Decoder instead of levigo-jbig2-imageio (results for the 982-page PDF):

          time.jbig2 real 5m56,164s (provided jbig2.jar)
          time.levigo real 7m3,032s (levigo, but I was watching a video)
          time.levigo2 real 6m41,841s (levigo, again, avoided using the computer)
          time.myjbig2 real 5m52,687s (my compiled jar)
          time.no.jbig real 1m6,441s (no jars i.e.: jbig2 not supported)
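To put the figures above on a common scale, the "real" times can be converted to seconds and compared against one run. This is a rough sketch with a hypothetical class name, `CompareTimings`; it only re-uses the numbers posted above and says nothing about run-to-run variance. On these numbers, the undisturbed levigo run takes roughly 13% longer than the provided jbig2.jar.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for comparing the timings quoted in this issue.
public class CompareTimings {

    // Matches the "real" value printed by `time`, with comma or dot as decimal separator.
    private static final Pattern REAL = Pattern.compile("(\\d+)m(\\d+)[,.](\\d+)s");

    /** Converts a value such as "5m56,164s" to seconds. */
    static double toSeconds(String real) {
        Matcher m = REAL.matcher(real);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected time format: " + real);
        }
        return Integer.parseInt(m.group(1)) * 60
                + Double.parseDouble(m.group(2) + "." + m.group(3));
    }

    public static void main(String[] args) {
        // Values copied from the comment above.
        Map<String, String> runs = new LinkedHashMap<>();
        runs.put("provided jbig2.jar", "5m56,164s");
        runs.put("levigo (video playing)", "7m3,032s");
        runs.put("levigo (idle machine)", "6m41,841s");
        runs.put("my compiled jar", "5m52,687s");
        runs.put("no JBIG2 support", "1m6,441s");

        double reference = toSeconds(runs.get("provided jbig2.jar"));
        for (Map.Entry<String, String> run : runs.entrySet()) {
            double seconds = toSeconds(run.getValue());
            // Print absolute seconds and the ratio relative to the provided jbig2.jar.
            System.out.printf(Locale.ROOT, "%-24s %8.3f s  (x%.2f)%n",
                    run.getKey(), seconds, seconds / reference);
        }
    }
}
```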

          tilman Tilman Hausherr added a comment -

          It is weird that the IDR solution is faster, given that it has several disadvantages:

          • It is not available from Maven.
          • The jar is broken, as described. (I had a look at the IDR website through archive.org. The original source code download had neither a build.xml nor a pom.xml.)
          • It is not maintained. The last commit is from 2012.
          • I can't test it because I would need to put it in the local Maven repository.

          Maruan Sahyoun, there are two possibilities for the documentation:

          • remove it from the dependencies page, or
          • keep it but mention that it must be built locally and that one file must be added, as described here or in the issue on GitHub I have linked to.
          jira-bot ASF subversion and git services added a comment -

          Commit f2e44e4e65d745f5d70750606b1079252637962b in pdfbox-docs's branch refs/heads/master from Maruan Sahyoun
          [ https://git-wip-us.apache.org/repos/asf?p=pdfbox-docs.git;h=f2e44e4 ]

          PDFBOX-3769: add description to documentation to overcome JBIG2-Image-Decoder issue

          tilman Tilman Hausherr added a comment -

          Thanks!


            People

            • Assignee:
              Unassigned
              Reporter:
              eruiz0 Esteban Nicolas Ruiz