Details
- Improvement
- Status: Open
- Minor
- Resolution: Unresolved
- 2.0.0-alpha
- None
- None
Description
At present, block-compressed SequenceFiles cannot be written or read with the GZIP codec unless the native libraries are available.
The SequenceFile.Writer code checks for the availability of the native libraries and throws a useful exception, but SequenceFile.Reader performs no such check and instead fails with an EOFException:
Exception in thread "main" java.io.EOFException
    at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249)
    at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239)
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:67)
    at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:95)
    at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:104)
    at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173)
    at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1493)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1480)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
    at test.SequenceReader.read(SequenceReader.java:23)
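The innermost frames of the trace are all inside the JDK's java.util.zip.GZIPInputStream constructor, which reads the gzip header as soon as the stream is created. The sketch below is a standalone illustration of that JDK behaviour only; it does not use Hadoop, and the class and method names (GzipHeaderDemo, constructorHitsEof, gzip) are hypothetical. That the Reader hands the codec a stream with no bytes available yet is an assumption inferred from the trace, not something this report confirms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of Hadoop.
public class GzipHeaderDemo {

    // Returns true if merely constructing a GZIPInputStream over the given
    // bytes fails with EOFException (the constructor reads the gzip header).
    static boolean constructorHitsEof(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return false; // header was read successfully
        } catch (EOFException e) {
            return true;  // ran out of bytes while reading the header
        }
    }

    // Helper: gzip-compress a payload in memory.
    static byte[] gzip(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(payload);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: an empty array stands in for a stream with no data yet.
        System.out.println("empty input, EOF in constructor: "
                + constructorHitsEof(new byte[0]));
        System.out.println("valid gzip, EOF in constructor: "
                + constructorHitsEof(gzip("hello".getBytes())));
    }
}
```

If this reading of the trace is right, a check in the Reader along the lines of the one the Writer already performs would turn the opaque EOFException into a clear message about the missing native libraries.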
Issue Links
- is duplicated by
- HADOOP-6817 SequenceFile.Reader can't read gzip format compressed sequence file, which produce by a mapreduce job, without native compression library
- Resolved