Hadoop HDFS / HDFS-7911

Buffer Overflow when running HBase on HDFS Encryption Zone

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: encryption
    • Labels: None

    Description

      We created an HDFS encryption zone (EZ) for HBase under /apps/hbase. Basic testing passed, including creating tables, listing them, adding a few rows, and scanning them. However, when bulk loading hundreds of thousands of rows, after 10 minutes or so we get the following error on the Region Server that owns the table.

      2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog 
      java.io.IOException: java.nio.BufferOverflowException 
      at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
      at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
      at org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162) 
      at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232) 
      at org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267) 
      at org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262) 
      at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123) 
      at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165) 
      at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241) 
      at java.lang.Thread.run(Thread.java:744) 
      Caused by: java.nio.BufferOverflowException 
      at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357) 
      at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823) 
      at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546) 
      at javax.crypto.Cipher.update(Cipher.java:1760) 
      at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
      ... 9 more 
      

      It looks like the HBase WAL (Write Ahead Log) use case is broken on CryptoOutputStream. In this use case, one flusher thread keeps calling hflush() on the WAL file while other roller threads concurrently write to the same file handle.

      As the class comment says, "CryptoOutputStream encrypts data. It is not thread-safe." I checked the code, and the buffer overflow appears to come from a race between CryptoOutputStream#write() and CryptoOutputStream#flush(), both of which can call CryptoOutputStream#encrypt(). The inBuffer and outBuffer of CryptoOutputStream are shared without synchronization, so a write() from another thread can modify them while flush() is in the middle of encrypt().
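
To make the race concrete, here is a minimal single-threaded sketch that replays one hypothetical interleaving by hand. It does not use Hadoop code: `InterleavingSketch`, `transform`, and `replayBadInterleaving` are illustrative names, the 8-byte buffers are arbitrary, and the "cipher" simply copies bytes. It assumes encrypt() roughly flips the input buffer, clears the output buffer, transforms, and then flips the output buffer for draining. The interleaving shown is only one of several possible ones and is not necessarily the one behind the stack trace above.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class InterleavingSketch {

    // Stand-in for the cipher (assumption): copies input to output unchanged.
    // ByteBuffer.put(ByteBuffer) throws BufferOverflowException when the
    // destination has less room than the source has remaining.
    static void transform(ByteBuffer in, ByteBuffer out) {
        out.put(in);
    }

    // Replays the steps of two "threads" in a fixed order on shared buffers
    // and reports whether the writer's transform step overflowed.
    static boolean replayBadInterleaving() {
        ByteBuffer inBuffer = ByteBuffer.allocate(8);
        ByteBuffer outBuffer = ByteBuffer.allocate(8);

        // Writer: buffers 6 bytes.
        inBuffer.put(new byte[6]);

        // Flusher: enters encrypt() and runs up to the transform.
        inBuffer.flip();
        outBuffer.clear();
        transform(inBuffer, outBuffer);   // outBuffer.position() is now 6
        inBuffer.clear();

        // Writer: fills the input buffer and enters encrypt() itself.
        inBuffer.put(new byte[8]);
        inBuffer.flip();
        outBuffer.clear();                // discards the flusher's 6 bytes

        // Flusher: resumes and prepares outBuffer for draining.
        outBuffer.flip();                 // limit = position = 0

        // Writer: resumes; outBuffer now has no room for its 8 bytes.
        try {
            transform(inBuffer, outBuffer);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow reproduced: " + replayBadInterleaving());
    }
}
```

In this replay the flusher's flip() lands between the writer's clear() and the writer's transform step, which leaves the output buffer with a limit of zero. Holding a single lock across the whole of write() and flush() would rule this ordering out.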

      I have validated this with multi-threaded unit tests that mimic the HBase WAL use case. For a file outside an encryption zone (DFSOutputStream), the multi-threaded flusher/writer works fine. For a file inside an encryption zone (CryptoOutputStream), the multi-threaded flusher/writer randomly fails with Buffer Overflow/Underflow errors.
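
The unit tests themselves are not attached here, so the following is only a rough sketch of what a flusher/writer harness of that shape might look like. It runs against a toy stream rather than Hadoop: `FlusherWriterSketch`, `ToyStream`, and `run` are hypothetical names, the XOR step stands in for encryption, and the 64-byte buffer size is arbitrary. The toy stream is synchronized, which is one possible mitigation and not necessarily the change made in the issue this report duplicates.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

public class FlusherWriterSketch {

    /** Toy buffered stream (assumption); NOT Hadoop's CryptoOutputStream. */
    static class ToyStream {
        private final ByteBuffer inBuffer = ByteBuffer.allocate(64);
        private final ByteBuffer outBuffer = ByteBuffer.allocate(64);
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // write() and flush() share one lock, so encrypt() never interleaves.
        synchronized void write(byte[] data) {
            int off = 0;
            while (off < data.length) {
                int n = Math.min(inBuffer.remaining(), data.length - off);
                inBuffer.put(data, off, n);
                off += n;
                if (!inBuffer.hasRemaining()) {
                    encrypt();
                }
            }
        }

        synchronized void flush() {
            encrypt();
        }

        synchronized int bytesWritten() {
            return sink.size();
        }

        // Caller must hold the lock. XOR is a placeholder for the cipher.
        private void encrypt() {
            inBuffer.flip();
            outBuffer.clear();
            while (inBuffer.hasRemaining()) {
                outBuffer.put((byte) (inBuffer.get() ^ 0x5A));
            }
            inBuffer.clear();
            outBuffer.flip();
            byte[] tmp = new byte[outBuffer.remaining()];
            outBuffer.get(tmp);
            sink.write(tmp, 0, tmp.length);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs one flusher and several writers; returns total bytes in the sink. */
    static int run(int writers, int writesPerThread, int chunk) {
        ToyStream stream = new ToyStream();
        AtomicBoolean done = new AtomicBoolean(false);
        AtomicReference<Throwable> failure = new AtomicReference<>();

        Thread flusher = new Thread(() -> {
            try {
                while (!done.get()) {
                    stream.flush();
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t);
            }
        });
        flusher.start();

        Thread[] threads = new Thread[writers];
        for (int w = 0; w < writers; w++) {
            threads[w] = new Thread(() -> {
                try {
                    for (int i = 0; i < writesPerThread; i++) {
                        stream.write(new byte[chunk]);
                    }
                } catch (Throwable t) {
                    failure.compareAndSet(null, t);
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            join(t);
        }
        done.set(true);
        join(flusher);
        stream.flush(); // drain whatever the writers left buffered

        if (failure.get() != null) {
            throw new IllegalStateException(failure.get());
        }
        return stream.bytesWritten();
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + run(4, 1000, 100));
    }
}
```

With the synchronized methods in place, every byte handed to write() reaches the sink, so `run(4, 1000, 100)` returns 400000. Removing the `synchronized` keywords would reintroduce the race described above; failures would then be intermittent rather than guaranteed on every run.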

            People

              hitliuyi Yi Liu
              xyao Xiaoyu Yao
              Votes: 0
              Watchers: 9
