Description
The Avro Python library fails to write some blocks when Snappy compression is used.
The error is:
Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code
From my investigation, crc32(bytes) sometimes returns a negative integer, which the unsigned 'I' format code cannot pack. Masking the result to 32 bits seems to fix the issue.
This fix appears to work from my limited testing:
--- io.old.py 2011-09-21 14:32:38.992544680 +1000
+++ io.py 2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer
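To illustrate why the mask helps, here is a minimal standalone sketch (not the Avro code itself). It assumes STRUCT_CRC32 is a big-endian unsigned 32-bit struct ('>I'), which matches the docstring and the error message. Python 2's crc32 returns a signed 32-bit integer, while Python 3's always returns an unsigned one, so the hypothetical helper as_signed32 is used here only to reproduce the Python 2 behaviour on any interpreter. The name pack_crc32 is also made up for this example.

```python
import struct
import zlib

# Assumption: mirrors STRUCT_CRC32 in avro/io.py (big-endian unsigned 32-bit).
STRUCT_CRC32 = struct.Struct('>I')


def as_signed32(value):
    """Hypothetical helper: reinterpret a 32-bit value as a signed integer,
    the way Python 2's crc32 reports checksums with the top bit set."""
    value &= 0xffffffff
    if value & 0x80000000:
        return value - 0x100000000
    return value


def pack_crc32(data):
    """Pack the CRC32 of data as 4 big-endian bytes, masking to 32 bits first."""
    checksum = as_signed32(zlib.crc32(data))  # may be negative, as on Python 2
    return STRUCT_CRC32.pack(checksum & 0xffffffff)


if __name__ == '__main__':
    data = b'123456789'  # CRC32 of this input has its top bit set
    signed = as_signed32(zlib.crc32(data))
    try:
        STRUCT_CRC32.pack(signed)  # unmasked: rejected by the 'I' format
    except struct.error as exc:
        print('unmasked value %d fails: %s' % (signed, exc))
    print('masked value packs to %r' % pack_crc32(data))
```

The mask leaves non-negative checksums unchanged and maps negative ones to the equivalent unsigned value, so the bytes written are the same 32 bits either way.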
Issue Links
is cloned by: AVRO-2227 CLONE - Python snappy error: "integer out of range for 'I' format code" (Resolved)