The Python tool to decode Avro files is currently missing support for bzip2 compression.
Got it, sorry
This is not a HADOOP project issue, but an AVRO one. As my comment noted, you have filed it with the correct project at AVRO-1527. Therefore, I've marked this as Invalid.
Sorry for any ambiguity. You had marked a similar mistake invalid yesterday BTW: HADOOP-10708
Example stack trace:
Traceback (most recent call last):
  File "../tools/avro2json.py", line 10, in <module>
    df_reader = datafile.DataFileReader(avrofile, io.DatumReader())
  File "/Users/eustache/anaconda/lib/python2.7/site-packages/avro/datafile.py", line 240, in __init__
    raise DataFileException('Unknown codec: %s.' % self.codec)
avro.datafile.DataFileException: Unknown codec: bzip2.
How is it invalid? Could you elaborate, please?
As of 1.7.6 the Python tool doesn't support bzip2, while snappy, for instance, is supported.
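For illustration only, here is a rough sketch of what per-codec block decompression with a bzip2 branch could look like, using just the standard library. The helper name decompress_block is hypothetical and is not part of the avro package. The sketch assumes each block's payload is a plain bzip2 stream and that deflate blocks are raw deflate data. It leaves out block framing (object counts, sync markers) and the snappy codec, which needs a third-party module.

```python
import bz2
import zlib


def decompress_block(codec, data):
    """Return the decompressed bytes of one data block.

    Hypothetical helper for illustration; not the avro library's API.
    """
    if codec == "null":
        # No compression: the block is stored as-is.
        return data
    if codec == "deflate":
        # wbits=-15 selects raw deflate, with no zlib header or checksum.
        return zlib.decompress(data, -15)
    if codec == "bzip2":
        # Assumption: the block payload is a single plain bzip2 stream.
        return bz2.decompress(data)
    # Same message shape as the exception in the stack trace.
    raise ValueError("Unknown codec: %s." % codec)


if __name__ == "__main__":
    payload = b"example block bytes" * 4
    assert decompress_block("bzip2", bz2.compress(payload)) == payload
```

A dispatch on the codec name keeps the unsupported case explicit, so an unknown codec still fails loudly instead of returning undecoded bytes.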
Moved to AVRO-1527