Details
-
Improvement
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
1.0-alpha1
-
None
Description
TIFF files support many different formats, some of them legacy or specialty formats, others widely used. DataReaderStrips and DataReaderTiled were originally written with a single block of code that collected the raw data (samples) for each pixel and passed it to a single method that branched on the format. As a result, the reader loops incurred, for every pixel, the overhead of a method call that evaluated multiple conditionals. In 2012, Commons Imaging was enhanced to execute dedicated blocks of code for a few commonly used formats, most notably 3-byte RGB. At present, however, that dedicated code does not handle the case where the RGB data is stored with a differencing predictor. Predictors improve compression ratios (often significantly) for RGB images, so I propose to enhance the dedicated RGB code to support them.
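To illustrate what supporting a predictor in the dedicated RGB path involves, here is a minimal sketch of undoing horizontal differencing (TIFF Predictor value 2) for one row of interleaved 3-byte RGB. It is not the code in DataReaderStrips or DataReaderTiled; the class name RgbPredictorSketch and the method decodeRow are hypothetical, and the sketch assumes 8-bit samples stored in R, G, B order with the row already decompressed.

```java
// Hypothetical illustration only; not taken from Commons Imaging.
final class RgbPredictorSketch {

    /**
     * Converts one decompressed row of differenced 3-byte RGB samples
     * into packed ARGB pixels. Each stored sample is the difference from
     * the same channel of the previous pixel, so decoding keeps a running
     * sum per channel, wrapped to 8 bits.
     */
    static int[] decodeRow(byte[] row, int width) {
        int[] argb = new int[width];
        int r = 0;
        int g = 0;
        int b = 0;
        int k = 0;
        for (int i = 0; i < width; i++) {
            // Bytes are signed in Java; masking with 0xff gives modulo-256 sums.
            r = (r + row[k++]) & 0xff;
            g = (g + row[k++]) & 0xff;
            b = (b + row[k++]) & 0xff;
            argb[i] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
        return argb;
    }
}
```

Because the running sums live in local variables inside the loop, a reader written this way avoids both the per-pixel method call and the per-pixel format branching described above.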
Here is an example of performance testing on a large, compressed image. The load times were collected using ApacheImagingSpeedAndMemoryTest.java, which is included in the examples directory of the Commons Imaging code distribution. With the changes, the average load time drops from roughly 895 ms to roughly 398 ms.
Processing file: CONUS_LandWaterMask_LZW_RGB.tif (original)
image size: 6000 by 4000
 time to load image    --         memory
 time ms      avg ms   --    used mb   total mb
 971.817       0.000   --    213.592    252.000
 921.690       0.000   --    143.229    260.000
 895.587     895.587   --     96.234    174.000
 899.227     897.407   --    117.259    154.000
 899.078     897.964   --    134.200    184.000
 889.602     895.873   --    143.226    180.000
 896.170     895.933   --    128.183    188.000
 894.250     895.652   --     97.187    178.000
 896.436     895.764   --    103.226    186.000
 891.540     895.236   --    119.185    171.000

Processing file: CONUS_LandWaterMask_LZW_RGB.tif (with changes)
image size: 6000 by 4000
 time to load image    --         memory
 time ms      avg ms   --    used mb   total mb
 498.123       0.000   --    212.589    252.000
 423.136       0.000   --    110.733    237.000
 396.021     396.021   --    100.735    164.000
 400.435     398.228   --    115.725    160.000
 400.901     399.119   --    114.726    162.000
 395.092     398.112   --    118.711    159.000
 394.106     397.311   --    118.710    159.000
 400.866     397.903   --    118.710    159.000
 400.972     398.342   --    115.710    160.000
 397.218     398.201   --    109.691    164.000
Additionally, the special-purpose RGB block of code included extra logic to support non-RGB formats in which the image samples were organized as three one-byte samples but the photometric interpretation was not RGB. According to Coveralls, this logic is not exercised by any of our test images, so that part of the code is not covered by testing. I will therefore remove it to improve the code-coverage scores. I believe this change is appropriate because, even if there are TIFF files "in the wild" that use this configuration, the Commons Imaging library will still work properly: in such a case, the image samples would be handled by the original, non-specialized block of code. Furthermore, I went through the TIFF specification and did not see any obvious examples of a case where that configuration would be likely.