Description
Discovered in IMPALA-5909.
Opening an encrypted HDFS file returns a chain of wrapped input streams:
HdfsDataInputStream -> CryptoInputStream -> DFSInputStream
If an application such as Impala or HBase calls HdfsDataInputStream#unbuffer, FSDataInputStream#unbuffer will be called:
    try {
      ((CanUnbuffer)in).unbuffer();
    } catch (ClassCastException e) {
      throw new UnsupportedOperationException("this stream does not " +
          "support unbuffering.");
    }
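If the class of the wrapped stream in does not implement CanUnbuffer, an UnsupportedOperationException (UOE) is thrown. For an encrypted file, in is the CryptoInputStream, which does not implement CanUnbuffer, so every unbuffer call throws. If the application does not handle this carefully, large numbers of UOEs show up in the logs.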
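The failure mode can be reproduced without Hadoop on the classpath. The sketch below is only an illustration: every type in it (CanUnbufferSketch, InnerStreamSketch, PlainWrapperSketch, OuterStreamSketch, UnbufferFailureDemo) is a hypothetical stand-in for the corresponding Hadoop class, and tryUnbuffer is just one way an application might guard the call, not what Impala or HBase actually do.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in for Hadoop's CanUnbuffer interface.
interface CanUnbufferSketch {
    void unbuffer();
}

// Stand-in for the innermost stream (the DFSInputStream role): supports unbuffer.
class InnerStreamSketch extends ByteArrayInputStream implements CanUnbufferSketch {
    boolean unbuffered = false;

    InnerStreamSketch(byte[] data) {
        super(data);
    }

    @Override
    public void unbuffer() {
        unbuffered = true;
    }
}

// Stand-in for a wrapper that does not implement the interface
// (the CryptoInputStream role as described in this issue).
class PlainWrapperSketch extends FilterInputStream {
    PlainWrapperSketch(InputStream in) {
        super(in);
    }
}

// Stand-in for the outermost stream (the FSDataInputStream role).
class OuterStreamSketch extends FilterInputStream {
    OuterStreamSketch(InputStream in) {
        super(in);
    }

    // Same cast-and-translate pattern as the snippet quoted above.
    void unbuffer() {
        try {
            ((CanUnbufferSketch) in).unbuffer();
        } catch (ClassCastException e) {
            throw new UnsupportedOperationException(
                "this stream does not support unbuffering.");
        }
    }
}

class UnbufferFailureDemo {
    // Assumed application-side guard: treat the UOE as "unbuffer is
    // unavailable" and return false instead of letting it propagate.
    static boolean tryUnbuffer(OuterStreamSketch stream) {
        try {
            stream.unbuffer();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        OuterStreamSketch plain =
            new OuterStreamSketch(new InnerStreamSketch(new byte[] {1}));
        OuterStreamSketch encrypted = new OuterStreamSketch(
            new PlainWrapperSketch(new InnerStreamSketch(new byte[] {1})));
        System.out.println("non-encrypted chain unbuffered: " + tryUnbuffer(plain));
        System.out.println("encrypted chain unbuffered: " + tryUnbuffer(encrypted));
    }
}
```

The cast only inspects the stream directly wrapped by the outer stream, so an unbuffer-capable stream deeper in the chain does not help unless the intermediate wrapper implements the interface as well.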
In comparison, opening a non-encrypted HDFS file returns this chain:
HdfsDataInputStream -> DFSInputStream
DFSInputStream implements CanUnbuffer.
CryptoInputStream should implement CanUnbuffer for three reasons:
- Release its buffers, caches, and any other resources when instructed
- Forward the unbuffer call to its wrapped DFSInputStream
- Avoid the UOE described above, which applications may not handle well
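The three reasons above can be sketched with a wrapper that implements the interface itself. This is a minimal illustration under stated assumptions, not the actual CryptoInputStream change: Unbufferable, LeafStream, BufferingWrapper and UnbufferForwardingDemo are hypothetical stand-ins, and the 8192-byte buffer is an arbitrary placeholder for whatever a decrypting stream holds on to.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for Hadoop's CanUnbuffer interface.
interface Unbufferable {
    void unbuffer();
}

// Stand-in for the innermost stream (the DFSInputStream role).
class LeafStream extends ByteArrayInputStream implements Unbufferable {
    int unbufferCalls = 0;

    LeafStream(byte[] data) {
        super(data);
    }

    @Override
    public void unbuffer() {
        unbufferCalls++;
    }
}

// Stand-in for a wrapper that holds its own buffer (the CryptoInputStream role).
class BufferingWrapper extends FilterInputStream implements Unbufferable {
    private static final int BUFFER_SIZE = 8192; // arbitrary size for the sketch
    private byte[] buffer; // allocated lazily; never filled in this sketch

    BufferingWrapper(InputStream in) {
        super(in);
    }

    boolean hasBuffer() {
        return buffer != null;
    }

    @Override
    public int read() throws IOException {
        if (buffer == null) {
            buffer = new byte[BUFFER_SIZE];
        }
        return in.read();
    }

    @Override
    public void unbuffer() {
        // Reason 1: release the wrapper's own resources.
        buffer = null;
        // Reason 2: forward the call if the wrapped stream supports it.
        if (in instanceof Unbufferable) {
            ((Unbufferable) in).unbuffer();
        }
        // Reason 3: returning normally means the caller never sees a UOE.
    }
}

class UnbufferForwardingDemo {
    public static void main(String[] args) throws IOException {
        LeafStream leaf = new LeafStream(new byte[] {42});
        BufferingWrapper wrapper = new BufferingWrapper(leaf);
        wrapper.read();
        wrapper.unbuffer();
        System.out.println("buffer held after unbuffer: " + wrapper.hasBuffer());
        System.out.println("leaf unbuffer calls: " + leaf.unbufferCalls);
    }
}
```

Using an instanceof check for the forwarded call, rather than a bare cast, is a design choice of this sketch: it lets the wrapper sit on top of streams that cannot unbuffer without raising an exception of its own.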
Attachments
Issue Links
- depends upon
  - HADOOP-15012 Add readahead, dropbehind, and unbuffer to StreamCapabilities (Resolved)
- relates to
  - HADOOP-13327 Add OutputStream + Syncable to the Filesystem Specification (Resolved)
  - HADOOP-14747 S3AInputStream to implement CanUnbuffer (Resolved)
  - HADOOP-14748 Wasb input streams to implement CanUnbuffer (Resolved)
  - IMPALA-5909 File handle cache causes HDFS to log excessive errors when trying to unbuffer files (Resolved)
  - HDFS-12604 StreamCapability enums are not displayed in javadoc (Resolved)
  - HADOOP-12805 Annotate CanUnbuffer with @InterfaceAudience.Public (Closed)