Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Duplicate
- Affects Version/s: 2.6.0
- Fix Version/s: None
- Component/s: None
Description
We created an HDFS encryption zone (EZ) for HBase under /apps/hbase and passed some basic testing, including creating tables, listing them, adding a few rows, and scanning them. However, when bulk loading hundreds of thousands of rows, after 10 minutes or so we get the following error on the RegionServer that owns the table.
2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
java.io.IOException: java.nio.BufferOverflowException
	at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
	at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
	at org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162)
	at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232)
	at org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267)
	at org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262)
	at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123)
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
	at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.nio.BufferOverflowException
	at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357)
	at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823)
	at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546)
	at javax.crypto.Cipher.update(Cipher.java:1760)
	at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
	... 9 more
It looks like the HBase WAL (Write Ahead Log) use case is broken on CryptoOutputStream. In this use case, one flusher thread keeps calling hflush() on the WAL file while other threads write concurrently to the same file handle.
As the class comment mentions, "CryptoOutputStream encrypts data. It is not thread-safe." I checked the code and it seems the buffer overflow comes from a race between CryptoOutputStream#write() and CryptoOutputStream#flush(), since both can call CryptoOutputStream#encrypt(). The inBuffer/outBuffer of the CryptoOutputStream are not thread-safe; they can be mutated by encrypt() on behalf of flush() while write() is coming in from other threads.
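To illustrate the hazard (this is only a simplified sketch, not the actual CryptoOutputStream code): two threads share a ByteBuffer whose position/limit bookkeeping is not thread-safe, with one thread appending and the other draining and resetting it, analogous to write() and flush() both driving encrypt() on the shared inBuffer/outBuffer. Run long enough, the put() randomly hits BufferOverflowException because the other thread moved the limit between the remaining() check and the put.
{code:java}
import java.nio.ByteBuffer;

/**
 * Simplified sketch (not the actual CryptoOutputStream code) of two threads
 * racing on one ByteBuffer: a "writer" appends while a "flusher" flips and
 * resets the same buffer, mimicking write() and flush() both touching the
 * shared inBuffer without synchronization.
 */
public class SharedBufferRaceSketch {
  public static void main(String[] args) throws Exception {
    final ByteBuffer inBuffer = ByteBuffer.allocateDirect(8 * 1024);

    Thread writer = new Thread(() -> {
      byte[] chunk = new byte[512];
      try {
        while (!Thread.currentThread().isInterrupted()) {
          if (inBuffer.remaining() >= chunk.length) {
            inBuffer.put(chunk);   // analogous to write() filling inBuffer
          }
        }
      } catch (RuntimeException e) {
        // BufferOverflowException fires when the flusher changes
        // position/limit between the remaining() check and put()
        e.printStackTrace();
      }
    });

    Thread flusher = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        inBuffer.flip();           // analogous to flush() draining inBuffer
        inBuffer.clear();          // then resetting it for more writes
      }
    });

    writer.start();
    flusher.start();
    Thread.sleep(2000);
    writer.interrupt();
    flusher.interrupt();
    writer.join();
    flusher.join();
  }
}
{code}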
I have validated this with multi-threaded unit tests that mimic the HBase WAL use case. For a file not under an encryption zone (DFSOutputStream), the multi-threaded flusher/writer works fine. For a file under an encryption zone (CryptoOutputStream), the multi-threaded flusher/writer randomly fails with buffer overflow/underflow.
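A rough sketch of that kind of flusher/writer test is below. It is illustrative only: it assumes the caller already has a FileSystem and a Path inside an encryption zone (e.g. from a MiniDFSCluster + KMS test setup, which is omitted), and the class and method names are made up for this example.
{code:java}
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of a multi-threaded flusher/writer test mimicking the HBase WAL
 * pattern: several threads write to one output stream while another thread
 * keeps calling hflush(), like the WAL AsyncSyncer calling sync().
 */
public class ConcurrentHflushSketch {

  /** Returns true if no thread hit an exception. */
  public static boolean hammer(FileSystem fs, Path file) throws Exception {
    final FSDataOutputStream out = fs.create(file);
    final AtomicBoolean failed = new AtomicBoolean(false);
    final AtomicInteger writersLeft = new AtomicInteger(4);
    final byte[] record = new byte[1024];

    ExecutorService pool = Executors.newFixedThreadPool(5);

    // Writer threads appending records, like the WAL append path.
    for (int i = 0; i < 4; i++) {
      pool.submit(() -> {
        try {
          for (int n = 0; n < 10_000 && !failed.get(); n++) {
            out.write(record);
          }
        } catch (Exception e) {
          failed.set(true);
          e.printStackTrace();
        } finally {
          writersLeft.decrementAndGet();
        }
      });
    }

    // One flusher thread, like the WAL AsyncSyncer calling sync()/hflush().
    pool.submit(() -> {
      try {
        while (writersLeft.get() > 0 && !failed.get()) {
          out.hflush();
        }
      } catch (Exception e) {
        failed.set(true);
        e.printStackTrace();
      }
    });

    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.MINUTES);
    try {
      out.close();
    } catch (IOException ignored) {
      // close may also fail once the stream is corrupted
    }

    // Observed behavior: passes on a plain DFSOutputStream, randomly fails
    // with BufferOverflowException/BufferUnderflowException on a
    // CryptoOutputStream (file inside an encryption zone).
    return !failed.get();
  }
}
{code}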
Attachments
Issue Links
- duplicates
  - HADOOP-11708 CryptoOutputStream synchronization differences from DFSOutputStream break HBase (Open)
- is required by
  - HADOOP-11708 CryptoOutputStream synchronization differences from DFSOutputStream break HBase (Open)