Description
Server-side transcoding, predicate-pushdown filtering in storage, and client-side decryption can all produce an input stream that is shorter or longer than the file size reported by getFileStatus/listFiles/listStatus.
Assuming the length is known by the time open() returns, the FSDataInputStream could return the length of the data, which callers can then use for accurate seeks within that data.
- Requires the streams to know their length. This is easy to probe for (StreamCapabilities.hasCapability), or the new method could return an Optional<Long>; either way the caller needs to (a) look for the length if it is present and (b) fall back if it is not.
- Could be rolled out for all our own clients/connectors; it would take time for others to adopt.
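
The caller-side pattern described above (look for a stream-supplied length, otherwise fall back) might look roughly like the following. This is only a sketch in plain Java, not Hadoop code: the `LengthAware` interface, its `getStreamLength()` method, the `DecryptedStream` class and the `effectiveLength` helper are all hypothetical names invented for illustration, and falling back to the length from the file status is an assumption about what "fall back" would mean in practice.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Optional;

class StreamLengthSketch {

    /** Hypothetical interface: a stream that may know the length of the data it will return. */
    interface LengthAware {
        Optional<Long> getStreamLength();
    }

    /** Stand-in for a stream whose data length differs from the stored object size. */
    static final class DecryptedStream extends ByteArrayInputStream implements LengthAware {
        private final long length;

        DecryptedStream(byte[] plaintext) {
            super(plaintext);
            this.length = plaintext.length;
        }

        @Override
        public Optional<Long> getStreamLength() {
            return Optional.of(length);
        }
    }

    /**
     * (a) Use the stream-reported length if the stream offers one;
     * (b) otherwise fall back to the length taken from the file status (assumed fallback).
     */
    static long effectiveLength(InputStream in, long fileStatusLength) {
        if (in instanceof LengthAware) {
            return ((LengthAware) in).getStreamLength().orElse(fileStatusLength);
        }
        return fileStatusLength;
    }

    public static void main(String[] args) {
        long storedLength = 48L; // e.g. the ciphertext size a file status call would report
        InputStream aware = new DecryptedStream(new byte[32]);
        InputStream unaware = new ByteArrayInputStream(new byte[48]);
        System.out.println(effectiveLength(aware, storedLength));   // prints 32
        System.out.println(effectiveLength(unaware, storedLength)); // prints 48
    }
}
```

Returning an Optional keeps a single code path for streams that implement the interface but cannot determine their length in a particular case, at the cost of every caller having to handle the empty result.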
Issue Links
- blocks
  - HADOOP-15006 Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS (Open)
- is depended upon by
  - HADOOP-15364 Add support for S3 Select to S3A (Resolved)