What happens if you compress a tarball of those files instead?
Here are my results, on a directory with 4.1GB of ~64MB files. The content is mixed binary/text (key/value data, binary keys, mixed binary/text values).
This is on CentOS 5.5, with the 'xz' and 'bzip2' packages installed via yum.
Compression / decompression speed. The disk is capable of 200 MB/sec read/write; the machine has 16 GB RAM and a Nehalem-based processor (Xeon E5620, 2.4 GHz).
Tests were confirmed to be CPU bound with no iowait. Measurements are in MB/sec of uncompressed data.
Source tarball, 4130 MB (100%)
|| method || compressed size || compressed size as percent || time to compress || compression rate || time to decompress || decompression rate ||
|| gzip -1 || || || 105 s || 39.3 MB/sec || 42 s || 98.3 MB/sec ||
|| gzip -6 || || || 251 s || 16.5 MB/sec || || 99.5 MB/sec ||
|| bzip2 -2 || || || 656 s || 6.3 MB/sec || 168 s || 24.6 MB/sec ||
|| bzip2 -6 || || || 725 s || 5.7 MB/sec || 176 s || 23.5 MB/sec ||
|| bzip2 -9 || || || 763 s || 5.4 MB/sec || 181 s || 22.8 MB/sec ||
|| xz -2 || 991.1 MiB || 24.0% || 429 s || 9.63 MB/sec || || 43.5 MB/sec ||
|| xz -6 || 792.6 MiB || 19.2% || 2861 s || 1.44 MB/sec || || 49.7 MB/sec ||
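The rate figures follow from the 4130 MB source size divided by the wall-clock time. A minimal sketch of that arithmetic, assuming awk is available; the helper name `rate_mb_per_sec` is made up for illustration and is not part of the original tests:

```shell
# Hypothetical helper: throughput in MB/sec of uncompressed data.
# $1 = uncompressed size in MB, $2 = elapsed seconds.
rate_mb_per_sec() {
    LC_ALL=C awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

rate_mb_per_sec 4130 105   # 105 s compress time   -> prints 39.3
rate_mb_per_sec 4130 42    # 42 s decompress time  -> prints 98.3
```

The same call can be used to sanity-check any of the other time/rate pairs above.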
Note that on today's newest processors, gzip decompresses at gigabit ethernet speeds. xz is half that, and bzip2 about half that again. gzip and xz decompress faster at higher compression ratios, while bzip2 decompresses slower at higher ratios. All compress slower the higher the ratio, but bzip2 only slows down by ~20% or so from the fast to slow settings, while gzip and xz slow down by a factor of 10+ (I did not do -9 tests here for those; they are very slow).
IMO, since xz -2 is almost 2x as fast at compression and decompression as bzip2, and similar in compression ratio, it leaves little room for bzip2's use.
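The "almost 2x" figure can be checked against the measurements. This is only a sketch: it assumes the 656 s / 24.6 MB/sec figures belong to bzip2 -2 and the 429 s / 43.5 MB/sec figures to xz -2, and the helper name `speedup` is hypothetical:

```shell
# Hypothetical helper: ratio of two numbers, to two decimal places.
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 656 429     # compress time, bzip2 -2 over xz -2      -> prints 1.53
speedup 43.5 24.6   # decompress rate, xz -2 over bzip2 -2    -> prints 1.77
```

On this data set the advantage is closer to 1.5x for compression and 1.8x for decompression.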
At higher compression levels, xz is very slow to compress, but it achieves compression ratios significantly better than anything else and still decompresses very fast, so it's great for archival storage.
For faster compression, the only options are gzip -1, or lzo and other compression types without an entropy coder.
The link I provided above has several cases where xz is 3 or more times faster than bzip2 at decompression, but my data doesn't behave that way.
Tests of compression:
$ time cat packed.tar | gzip -c1 > packed.gz1
$ time cat packed.tar | gzip -c6 > packed.gz6
$ time cat packed.tar | bzip2 -2 > packed.bz2-2
$ time cat packed.tar | bzip2 -6 > packed.bz2-6
$ time cat packed.tar | bzip2 -9 > packed.bz2-9
$ time cat packed.tar | xz -zv -2 - > packed.xz
100.0 % 991.1 MiB / 4,125.0 MiB = 0.240 9.6 MiB/s 7:09
$ time cat packed.tar | xz -zv -6 - > packed.xz6
100.0 % 792.6 MiB / 4,125.0 MiB = 0.192 1.4 MiB/s 47:41
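xz -v reports elapsed time as M:SS at the end of its progress line, while the timings elsewhere are in seconds. A small sketch of the conversion, assuming the M:SS form shown above (longer runs that print hours are not handled); `mmss_to_seconds` is a made-up name:

```shell
# Hypothetical helper: convert an M:SS elapsed time to seconds.
mmss_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

mmss_to_seconds 7:09    # xz -2 run  -> prints 429
mmss_to_seconds 47:41   # xz -6 run  -> prints 2861
```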
Tests of decompression:
$ time cat packed.gz1 | gunzip > /dev/null
$ time cat packed.gz6 | gunzip > /dev/null
$ time cat packed.bz2-2 | bunzip2 > /dev/null
$ time cat packed.bz2-6 | bunzip2 > /dev/null
$ time cat packed.bz2-9 | bunzip2 > /dev/null
$ time cat packed.xz | xz -dc > /dev/null
$ time cat packed.xz6 | xz -dc > /dev/null
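The decompression runs above can also be scripted as one loop. This is a sketch rather than what was actually run: the helper `decompressor_for` is hypothetical, it picks the tool from the file names used above, and it measures whole seconds with `date +%s` (widely available, though not strictly POSIX) instead of `time`:

```shell
# Hypothetical helper: map an archive name from the tests above to the
# command that decompresses it from stdin to stdout.
decompressor_for() {
    case "$1" in
        *.gz*)  echo "gunzip" ;;
        *.bz2*) echo "bunzip2" ;;
        *.xz*)  echo "xz -dc" ;;
        *)      return 1 ;;
    esac
}

for f in packed.gz1 packed.gz6 packed.bz2-2 packed.bz2-6 packed.bz2-9 packed.xz packed.xz6; do
    [ -f "$f" ] || continue                  # skip archives that do not exist
    cmd=$(decompressor_for "$f") || continue
    start=$(date +%s)
    $cmd < "$f" > /dev/null                  # unquoted on purpose: command plus flags
    end=$(date +%s)
    echo "$f: $((end - start)) s"
done
```

Reading the archive via redirection avoids the extra `cat` process; for a CPU-bound test the difference should be small.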