Bug 40676 - [PATCH] png graphics are expanded/uncompressed in pdf causing massive file size increase
Status: RESOLVED FIXED
Alias: None
Product: Fop - Now in Jira
Classification: Unclassified
Component: images
Version: trunk
Hardware: PC other
Importance: P3 enhancement
Target Milestone: ---
Assignee: fop-dev
URL:
Keywords:
Depends on: 53408
Blocks:
Reported: 2006-10-04 03:44 UTC by Paul A. Bristow
Modified: 2012-10-20 23:53 UTC

Attachments
output PDF of test case after fix (162.34 KB, application/pdf)
2012-05-13 22:05 UTC, Luis Bernardo
patch (51.04 KB, patch)
2012-06-13 13:06 UTC, Luis Bernardo
test examples (101.08 KB, application/x-gzip)
2012-06-13 13:13 UTC, Luis Bernardo
another example (760.08 KB, application/x-gzip)
2012-06-13 15:02 UTC, Luis Bernardo
patch for findbugs (2.41 KB, patch)
2012-06-15 11:42 UTC, Luis Bernardo
patch for documentation (2.90 KB, patch)
2012-06-17 06:39 UTC, Luis Bernardo
patch to disallow multiple image filters (5.22 KB, patch)
2012-06-24 14:11 UTC, Luis Bernardo
update to previous test examples (100.54 KB, application/x-gzip)
2012-06-24 14:13 UTC, Luis Bernardo
update to previous example (760.25 KB, application/x-gzip)
2012-06-24 14:14 UTC, Luis Bernardo
support for sRGB and iCCP chunks (10.79 KB, patch)
2012-07-28 23:42 UTC, Luis Bernardo
example with ICC profiles (582.99 KB, application/pdf)
2012-07-28 23:45 UTC, Luis Bernardo
example images with ICC profiles (582.62 KB, application/x-gzip)
2012-07-28 23:47 UTC, Luis Bernardo
example with sRGB profile (sRGB chunk) (930.62 KB, application/pdf)
2012-07-28 23:48 UTC, Luis Bernardo
example images with sRGB chunk (924.67 KB, application/x-gzip)
2012-07-28 23:49 UTC, Luis Bernardo

Description Paul A. Bristow 2006-10-04 03:44:48 UTC
Graphics supplied as PNG files (total file size about 2 MB) are expanded to
some bitmap(?) representation in the PDF, leading to a massive increase in file
size (from 0.6 MB using version 2 to 10 MB using version 0.92).

Is there some way of specifying that compressed graphics should be used in the PDF?

(The graphics are actually math equations - fairly complicated, with integrals
etc. - and some simple graphs of functions like the normal distribution. About
30 of each, so quite a lot of work to change representation ;-) Is there some
other workaround? A vector format?)
Comment 1 Jeremias Maerki 2006-10-04 14:17:54 UTC
PNG images are currently decompressed, normalized to RGBA (32bit) and then
recompressed for PDF. Of course, that's suboptimal.
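
To give a rough sense of why the RGBA normalization described above inflates output, the following sketch compares raw (pre-compression) sample sizes. It is only arithmetic under stated assumptions: the class name, method names, and the 1000x1000 dimensions are hypothetical and are not taken from FOP.

```java
// Sketch only: compares raw sample sizes before compression; this is not FOP code.
final class RgbaExpansionSketch {

    /** Raw sample bytes for an image, with each row padded to a whole byte. */
    static long rawBytes(int width, int height, int channels, int bitsPerChannel) {
        long bitsPerRow = (long) width * channels * bitsPerChannel;
        long bytesPerRow = (bitsPerRow + 7) / 8;
        return bytesPerRow * height;
    }

    /** Raw sample bytes after normalizing to 8-bit RGBA (32 bits per pixel). */
    static long rgbaBytes(int width, int height) {
        return rawBytes(width, height, 4, 8);
    }

    public static void main(String[] args) {
        int width = 1000;  // hypothetical image dimensions
        int height = 1000;
        long bilevel = rawBytes(width, height, 1, 1); // 1-bit grayscale
        long gray8 = rawBytes(width, height, 1, 8);   // 8-bit grayscale
        long rgba = rgbaBytes(width, height);         // normalized to RGBA
        System.out.println("1-bit grayscale: " + bilevel + " bytes");
        System.out.println("8-bit grayscale: " + gray8 + " bytes");
        System.out.println("8-bit RGBA:      " + rgba + " bytes");
        System.out.println("RGBA vs 1-bit:   " + (rgba / bilevel) + "x");
    }
}
```

The compressed sizes depend on image content, so these numbers only bound the amount of data the recompression step has to work with; they do not predict the final PDF size.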

Yes, it's theoretically possible to embed the compressed PNG data directly in
the PDF. This is something that is on my task list for the next three months as
part of a general redesign of the image adapter package which is also
responsible for the implicit conversion to RGBA of the image.

Work-arounds for you: Yes, please try a vector format, preferably SVG because we
can handle this natively. Your equations will look much nicer.
Comment 2 Paul A. Bristow 2006-10-10 09:36:42 UTC
Thanks for helpful comment and promises ;-) But meanwhile:  Do you any
recommendations on MathML to SVG conversion.  A quick Google turns up some tools
but some seem not to work well yet.

We also have a problem with simple x y graphs to show simple functions as
coloured curves, for example normal distribution 'bell' curve.  Do you have
suggestions for doing these (from tables of data as decimal digits).

Thanks

Paul
Comment 3 Jeremias Maerki 2006-10-11 01:12:12 UTC
We use JEuclid.
See the FO extension in examples/mathml
Comment 4 Glenn Adams 2012-04-07 01:43:32 UTC
resetting P2 open bugs to P3 pending further review
Comment 5 Glenn Adams 2012-04-11 06:17:49 UTC
change status from ASSIGNED to NEW for consistency
Comment 6 Luis Bernardo 2012-05-13 22:05:42 UTC
Created attachment 28771 [details]
output PDF of test case after fix
Comment 7 Luis Bernardo 2012-05-13 22:10:28 UTC
sorry, above attachment, 28771, is for bug 40699. stupid bugzilla...
Comment 8 Luis Bernardo 2012-06-13 13:06:12 UTC
Created attachment 28929 [details]
patch

this patch includes new PNG image handlers for PDF and PS output. these image handlers can handle the raw IDAT chunks and lead to smaller files.

to use these handlers, the ImageLoaderRawPNG needs to be enabled in the fop.xconf file.
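
As background on what "raw IDAT" means here, the sketch below walks the chunks of a PNG stream and concatenates the IDAT payloads, which together form a single zlib stream. It is an illustration of the PNG chunk layout only, not the code from the patch: the class and method names are hypothetical, CRCs are not verified, and the sample image is produced with the JDK's `ImageIO`.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;
import javax.imageio.ImageIO;

// Sketch only: illustrates the PNG chunk layout; this is not the FOP image loader.
final class RawIdatSketch {

    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};

    /** Concatenates the payloads of all IDAT chunks. CRCs are skipped, not checked. */
    static byte[] collectIdat(byte[] png) {
        ByteBuffer in = ByteBuffer.wrap(png); // big-endian by default, as in PNG
        byte[] signature = new byte[8];
        in.get(signature);
        if (!Arrays.equals(signature, PNG_SIGNATURE)) {
            throw new IllegalArgumentException("not a PNG stream");
        }
        ByteArrayOutputStream idat = new ByteArrayOutputStream();
        while (in.remaining() >= 12) { // length + type + CRC
            int length = in.getInt();
            byte[] type = new byte[4];
            in.get(type);
            if (length < 0 || length > in.remaining() - 4) {
                throw new IllegalArgumentException("bad chunk length");
            }
            byte[] data = new byte[length];
            in.get(data);
            in.getInt(); // skip the CRC
            String name = new String(type, StandardCharsets.US_ASCII);
            if (name.equals("IDAT")) {
                idat.write(data, 0, data.length);
            } else if (name.equals("IEND")) {
                break;
            }
        }
        return idat.toByteArray();
    }

    /** Inflates a zlib stream; used here only to inspect the filtered scanlines. */
    static byte[] inflate(byte[] zlib) {
        Inflater inflater = new Inflater();
        inflater.setInput(zlib);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Writes a blank RGB image as PNG in memory, to have something to parse. */
    static byte[] samplePng(int width, int height) {
        ImageIO.setUseCache(false); // avoid temp files for this tiny image
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(image, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] idat = collectIdat(samplePng(4, 3));
        System.out.println("compressed IDAT bytes: " + idat.length);
        System.out.println("inflated scanline bytes: " + inflate(idat).length);
    }
}
```

The inflated data is not plain pixels: each scanline starts with a filter-type byte, which is why a consumer of the raw IDAT data has to be told about PNG row filtering rather than being handed ordinary deflated samples.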
Comment 9 Luis Bernardo 2012-06-13 13:13:18 UTC
Created attachment 28930 [details]
test examples

the attachment includes a test.fo file, some test images, a fop.xconf, and PDF and PS output of the test file. note that the image loader in the fop.xconf needs to be enabled/disabled.
Comment 10 Luis Bernardo 2012-06-13 15:02:04 UTC
Created attachment 28932 [details]
another example

this example uses a larger image to better show the differences in file size of the resulting PDF.
Comment 11 Glenn Adams 2012-06-15 03:00:32 UTC
patch applied at http://svn.apache.org/viewvc?rev=1350455&view=rev

thanks luis and matthias!

luis, please provide an additional patch (against this rev) that fixes the 13 findbugs warnings you introduced (I fixed the 2 checkstyle warnings introduced); also update the relevant documentation to describe the configuration/use of these features

in the meantime, I will transition this bug to NEEDINFO
Comment 12 Luis Bernardo 2012-06-15 11:42:58 UTC
Created attachment 28944 [details]
patch for findbugs

fixes/ignores findbugs
Comment 13 Luis Bernardo 2012-06-15 14:18:32 UTC
Wiki updated: http://wiki.apache.org/xmlgraphics-fop/HowTo/ImageLoaderRawPNG
Comment 14 Glenn Adams 2012-06-15 20:45:14 UTC
findbugs fix patch applied at http://svn.apache.org/viewvc?rev=1350790&view=rev

thanks luis! please review and close if satisfied
Comment 15 Glenn Adams 2012-06-15 21:53:27 UTC
(In reply to comment #13)
> Wiki updated: http://wiki.apache.org/xmlgraphics-fop/HowTo/ImageLoaderRawPNG

you may also want to update [1] and [2]

[1] http://xmlgraphics.apache.org/fop/trunk/graphics.html#png
[2] http://xmlgraphics.apache.org/fop/trunk/configuration.html#image-loading

for example, adding info to [1] about the new codecs, and adding info to [2] about negative penalties
Comment 16 Luis Bernardo 2012-06-17 06:39:40 UTC
Created attachment 28950 [details]
patch for documentation

I was not aware the web pages content was part of the source too... The patch updates the documentation.
Comment 17 Glenn Adams 2012-06-19 00:21:02 UTC
(In reply to comment #16)
> Created attachment 28950 [details]
> patch for documentation
> 
> I was not aware the web pages content was part of the source too... The
> patch updates the documentation.

Applied final docs patch at http://svn.apache.org/viewvc?rev=1351540&view=rev.

Thanks luis! Please close this bug when you finish review.
Comment 18 Luis Bernardo 2012-06-24 14:11:00 UTC
Created attachment 28988 [details]
patch to disallow multiple image filters

Adobe Reader fails to display the submitted examples when the raw png image loader is used, even though they seem valid according to the spec, and are correctly displayed by evince, ghostscript and Preview (Mac OS X PDF viewer). The Adobe Reader issue only happens if the filter flate is turned on and seems to be due to the fact that Adobe Reader may not like more than one filter applied to an image. Multiple filters applied to a stream are valid, but are also used by PDF malware exploits (like zip bombs). I suspect this is the reason why Adobe Reader does not like them.

The patch fixes this problem by only using one image filter even when the default flate filter is turned on.
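
To make the single-filter versus multiple-filter distinction concrete, here is a sketch that prints two illustrative PDF image dictionaries. It assumes the raw PNG data is declared with `/FlateDecode` plus PNG predictor parameters (`/Predictor 15`), and that the problematic case looked like an extra Flate layer wrapped around it; the exact dictionaries FOP wrote before and after the patch are not shown in this thread, so treat both strings, and the class and method names, as assumptions.

```java
// Sketch only: builds illustrative PDF image dictionaries; this is not FOP's serializer.
final class ImageFilterSketch {

    static String imageDictionary(int width, int height, int colors,
                                  int bitsPerComponent, boolean extraFlate) {
        // Predictor 15 tells the consumer that each row starts with a PNG filter-type byte.
        String pngParms = "<< /Predictor 15 /Colors " + colors
                + " /BitsPerComponent " + bitsPerComponent
                + " /Columns " + width + " >>";
        String filter;
        String decodeParms;
        if (extraFlate) {
            // Filters are listed in decoding order: the outer Flate layer is undone first.
            filter = "[ /FlateDecode /FlateDecode ]";
            decodeParms = "[ null " + pngParms + " ]";
        } else {
            filter = "/FlateDecode";
            decodeParms = pngParms;
        }
        String colorSpace = (colors == 1) ? "/DeviceGray" : "/DeviceRGB";
        return "<< /Type /XObject /Subtype /Image"
                + " /Width " + width + " /Height " + height
                + " /ColorSpace " + colorSpace
                + " /BitsPerComponent " + bitsPerComponent
                + " /Filter " + filter
                + " /DecodeParms " + decodeParms + " >>";
    }

    public static void main(String[] args) {
        System.out.println("single filter:");
        System.out.println(imageDictionary(640, 480, 3, 8, false));
        System.out.println("two filters:");
        System.out.println(imageDictionary(640, 480, 3, 8, true));
    }
}
```

Since the IDAT data is already deflated, a second Flate pass is unlikely to save much space, so keeping a single filter costs little.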
Comment 19 Luis Bernardo 2012-06-24 14:13:17 UTC
Created attachment 28989 [details]
update to previous test examples
Comment 20 Luis Bernardo 2012-06-24 14:14:34 UTC
Created attachment 28990 [details]
update to previous example
Comment 21 Glenn Adams 2012-06-24 18:12:26 UTC
(In reply to comment #18)
> Created attachment 28988 [details]
> patch to disallow multiple image filters
> 
> Adobe Reader fails to display the submitted examples when the raw png image
> loader is used, even though they seem valid according to the spec, and are
> correctly displayed by evince, ghostscript and Preview (Mac OS X PDF
> viewer). The Adobe Reader issue only happens if the filter flate is turned
> on and seems to be due to the fact that Adobe Reader may not like more than
> one filter applied to an image. Multiple filters applied to a stream are
> valid, but are also used by PDF malware exploits (like zip bombs). I suspect
> this is the reason why Adobe Reader does not like them.
> 
> The patch fixes this problem by only using one image filter even when the
> default flate filter is turned on.

patch applied at http://svn.apache.org/viewvc?rev=1353303&view=rev

thanks luis!
Comment 22 Glenn Adams 2012-06-24 18:13:18 UTC
(In reply to comment #19)
> Created attachment 28989 [details]
> update to previous test examples

not sure what to do with these... do you wish to commit to trunk somewhere? if so, you need to organize into standard test cases, etc
Comment 23 Luis Bernardo 2012-06-24 22:22:11 UTC
no, the idea was never to make these part of the source. they are just provided to show what type of png images are currently supported by the new png image loader, and that the resulting output files are smaller.
Comment 24 Glenn Adams 2012-06-24 23:40:02 UTC
(In reply to comment #23)
> no, the idea was never to make these part of the source. they are just
> provided to show what type of png images are currently supported by the new
> png image loader, and that the resulting output files are smaller.

ok, if you don't think any additional test cases are useful for this bug, then please close this when you're ready
Comment 25 Luis Bernardo 2012-07-28 23:42:15 UTC
Created attachment 29132 [details]
support for sRGB and iCCP chunks

this new patch improves the previous one and adds support for color spaces to the PDFImageHandlerRawPNG.
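
For readers unfamiliar with the two chunks, the sketch below decodes their payloads as defined by the PNG specification: sRGB carries a single rendering-intent byte, and iCCP carries a profile name, a NUL separator, a compression-method byte, and a zlib-compressed ICC profile. The color space selection at the end is a plausible policy chosen for illustration, not necessarily what PDFImageHandlerRawPNG does; all class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch only: decodes sRGB and iCCP chunk payloads; this is not FOP code.
final class PngColorChunkSketch {

    private static final String[] INTENTS =
            {"/Perceptual", "/RelativeColorimetric", "/Saturation", "/AbsoluteColorimetric"};

    /** Maps the single byte of an sRGB chunk to the matching PDF rendering intent name. */
    static String renderingIntent(byte[] srgbData) {
        if (srgbData.length != 1 || srgbData[0] < 0 || srgbData[0] > 3) {
            throw new IllegalArgumentException("invalid sRGB chunk");
        }
        return INTENTS[srgbData[0]];
    }

    private static int nameEnd(byte[] iccpData) {
        for (int i = 0; i < iccpData.length; i++) {
            if (iccpData[i] == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("iCCP name is not NUL-terminated");
    }

    static String profileName(byte[] iccpData) {
        return new String(iccpData, 0, nameEnd(iccpData), StandardCharsets.ISO_8859_1);
    }

    /** Inflates the ICC profile that follows the name, NUL, and compression-method byte. */
    static byte[] profileBytes(byte[] iccpData) {
        int start = nameEnd(iccpData) + 2;
        if (start > iccpData.length || iccpData[start - 1] != 0) {
            throw new IllegalArgumentException("unsupported iCCP compression method");
        }
        Inflater inflater = new Inflater();
        inflater.setInput(iccpData, start, iccpData.length - start);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt iCCP profile data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Assumed policy: embedded profile first, then sRGB, then a device color space. */
    static String chooseColorSpace(boolean hasIccp, boolean hasSrgb, int pngColorType) {
        if (hasIccp) {
            return "ICCBased (profile from iCCP chunk)";
        }
        if (hasSrgb) {
            return "ICCBased (standard sRGB profile)";
        }
        if (pngColorType == 3) {
            return "Indexed";
        }
        return (pngColorType == 0 || pngColorType == 4) ? "DeviceGray" : "DeviceRGB";
    }

    /** Demo helper: builds an iCCP payload from a name and raw profile bytes. */
    static byte[] buildIccp(String name, byte[] profile) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] nameBytes = name.getBytes(StandardCharsets.ISO_8859_1);
        out.write(nameBytes, 0, nameBytes.length);
        out.write(0); // NUL separator
        out.write(0); // compression method 0 = zlib
        Deflater deflater = new Deflater();
        deflater.setInput(profile);
        deflater.finish();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] iccp = buildIccp("demo profile", new byte[] {1, 2, 3, 4, 5});
        System.out.println("profile name: " + profileName(iccp));
        System.out.println("profile bytes: " + profileBytes(iccp).length);
        System.out.println("intent: " + renderingIntent(new byte[] {0}));
        System.out.println("color space: " + chooseColorSpace(true, false, 2));
    }
}
```

The demo payload is not a real ICC profile; it only exercises the chunk layout.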
Comment 26 Luis Bernardo 2012-07-28 23:45:59 UTC
Created attachment 29133 [details]
example with ICC profiles
Comment 27 Luis Bernardo 2012-07-28 23:47:06 UTC
Created attachment 29134 [details]
example images with ICC profiles
Comment 28 Luis Bernardo 2012-07-28 23:48:47 UTC
Created attachment 29135 [details]
example with sRGB profile (sRGB chunk)
Comment 29 Luis Bernardo 2012-07-28 23:49:55 UTC
Created attachment 29136 [details]
example images with sRGB chunk
Comment 30 Ognjen Blagojevic 2012-10-08 07:44:53 UTC
Although file size increase is now solved, I am reopening this issue as a reminder to apply patches starting with attachment 29132 [details]. 

Those new patches perform better, and they also solve issue 51149.
Comment 31 Luis Bernardo 2012-10-20 23:53:38 UTC
applied patch 29132 with changes: http://svn.apache.org/viewvc?view=revision&revision=1400536