Description
When deserializing in PHP a DEFLATE-compressed file that contains a large number of records (see the attached example file, 20,000 records), the `$decoder` instance advances through the file incorrectly and eventually yields an empty string, which is passed into `gzinflate(...)` on this line: https://github.com/apache/avro/blob/a6f13b269a359d3839e55a75e0662d834d76992c/lang/php/lib/DataFile/AvroDataIOReader.php#L176
This results in a PHP error being raised. Notably, not all records have been deserialized at the point where this happens, so the problem appears to be related to the file containing multiple "blocks".
I've attached a file that meets this condition, and also a quick Kotlin project using the official Java library that I used to generate the file.
The PHP code that reproduces this behavior is pretty standard, lifted directly from the provided `examples/write_read.php` file:
<?php
if (count($argv) < 2) {
    echo "USAGE: php main.php FILENAME";
    exit(1);
}
$filename = $argv[1];
require_once __DIR__ . '/../vendor/avro-php-1.11.0/lib/autoload.php';
use Apache\Avro\DataFile\AvroDataIO;
$data_reader = AvroDataIO::openFile($filename);
echo "Reading from $filename:\n";
foreach ($data_reader->data() as $datum) {
    echo var_export($datum, true) . "\n";
}
$data_reader->close();
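
The failing call can also be observed in isolation. The following is a minimal sketch, assuming only PHP's zlib extension and no Avro code; the payload string and variable names are illustrative and not taken from the library. It shows `gzinflate` round-tripping a DEFLATE-compressed payload and then failing when handed an empty string, which is what the reader appears to pass once the decoder has advanced incorrectly.

```php
<?php
// Turn PHP warnings into exceptions so the failure is easy to catch.
set_error_handler(function (int $severity, string $message): bool {
    throw new ErrorException($message, 0, $severity);
});

// Hypothetical payload standing in for one compressed Avro block.
$payload = 'example block payload';
$roundTrip = gzinflate(gzdeflate($payload));
echo "round trip ok: " . var_export($roundTrip === $payload, true) . "\n";

// An empty string is not a valid DEFLATE stream, so inflating it fails.
$failed = false;
try {
    gzinflate('');
} catch (ErrorException $e) {
    $failed = true;
    echo "gzinflate('') failed: " . $e->getMessage() . "\n";
}

restore_error_handler();
```

Checking for an empty block before calling `gzinflate` would probably only hide the symptom; the open question is still why the decoder ends up with an empty string before all records have been read.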