Details
- Type: Umbrella
- Status: Resolved
- Priority: Major
- Resolution: Implemented
Description
We can reasonably well support use cases on non-HDFS filesystems, like S3, where an external writer has loaded (and continues to load) HFiles via the bulk load mechanism, and we then serve a read-only workload at the HBase API.
Mixed or write-heavy workloads won't fare as well. In fact, data loss seems certain. The details depend on the specific filesystem, but all of the S3-backed Hadoop filesystems suffer from a couple of obvious problems, notably the lack of an atomic rename.
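To make the atomic-rename problem concrete, here is a minimal sketch of the commit-by-rename pattern HBase relies on (flushes and compactions write to a temporary location, then rename into place). The paths and class name are illustrative, not HBase's actual layout; the sketch runs against a local POSIX filesystem, where the move really is atomic:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: the commit-by-rename pattern, using java.nio on a
// local filesystem. Names (RenameCommit, region.tmp, hfile) are made up.
public class RenameCommit {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-commit");
        Path tmp = dir.resolve("region.tmp");
        Path committed = dir.resolve("hfile");

        // Step 1: write the full contents to a temporary file.
        Files.write(tmp, "row data".getBytes());

        // Step 2: commit with an atomic rename. On HDFS and POSIX
        // filesystems readers see either the old state or the new file,
        // never a partial one. S3-backed Hadoop filesystems implement
        // rename as a non-atomic copy-then-delete, so a failure mid-rename
        // can leave a partial copy, or both copies, visible.
        Files.move(tmp, committed, StandardCopyOption.ATOMIC_MOVE);

        System.out.println("committed=" + Files.exists(committed)
                + " tmpGone=" + Files.notExists(tmp));
    }
}
```

On S3 the equivalent "rename" is an object-by-object copy followed by deletes, which is exactly the window in which a crash corrupts the commit.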
This umbrella will serve to collect some related ideas for consideration.
Issue Links
- relates to HBASE-21070 SnapshotFileCache won't update for snapshots stored in S3 (Resolved)
1. Store commit transaction for filesystems that do not support an atomic rename (Closed, Unassigned)
Let me just state this to get it out of the way. As you can imagine, reading between the lines, the motivation to look at this where I work is the good probability that our storage stack will either utilize Amazon's S3 service "where applicable" or a compatible API analogue. Please don't take this to imply anything about business relationships, or not. Really, I would personally have no idea one way or the other.