[HBASE-22149] HBOSS: A FileSystem implementation to provide HBase's required semantics on object stores - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: hbase-filesystem-1.0.0-alpha1
Component/s: Filesystem Integration
Labels:
- HBOSS

Release Note:

Initial implementation of the hbase-oss module. Defines a wrapper implementation of Apache Hadoop's FileSystem interface that bridges the gap between Apache HBase, which assumes that many operations are atomic, and object-store implementations of FileSystem (such as s3a), which cannot natively provide atomic semantics for those operations.

The implementation can be used, e.g., with the s3a filesystem by using a root fs like `s3a://bucket/` and defining:

* `fs.s3a.impl` set to `org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics`
* `fs.hboss.fs.s3a.impl` set to `org.apache.hadoop.fs.s3a.S3AFileSystem`

More details are in the module's README.md.
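
As a rough illustration of the two settings above, the following Java sketch renders them in the `<configuration>`/`<property>` layout that Hadoop site files use. The class and method names are hypothetical and not part of hbase-oss, and which site file the properties belong in depends on the deployment, so treat the README as the authority.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of hbase-oss): renders the two properties as site-file XML. */
final class HbossConfigSketch {

    /** The two settings named in the release note, kept in insertion order. */
    static Map<String, String> hbossProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Route the s3a:// scheme through the HBOSS wrapper.
        props.put("fs.s3a.impl", "org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics");
        // Tell the wrapper which FileSystem implementation it should delegate to.
        props.put("fs.hboss.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
        return props;
    }

    /** Formats the properties as XML; no escaping is done because these values need none. */
    static String toSiteXml(Map<String, String> props) {
        StringBuilder xml = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(e.getKey()).append("</name>\n")
               .append("    <value>").append(e.getValue()).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toSiteXml(hbossProperties()));
    }
}
```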

NOTE: This module is labeled with an ALPHA version. It is not considered production ready and makes no promises about compatibility between versions.


Description

(Have been using the name HBOSS for HBase / Object Store Semantics)

I've had some thoughts about how to solve the problem of running HBase on object stores. There has been some thought in the past about adding the required semantics to S3Guard, but I have some concerns about that. First, it's mixing complicated solutions to different problems (bridging the gap between a flat namespace and a hierarchical namespace vs. solving inconsistency). Second, it's S3-specific, whereas other object stores could use virtually identical solutions. And third, we can't do things like atomic renames in a true sense. There would have to be some trade-offs specific to HBase's needs, and it's better if we can solve that in an HBase-specific module without mixing all that logic in with the rest of S3A.

Ideas to solve this above the FileSystem layer have been proposed and considered (HBASE-20431, for one), and maybe that's the right way forward long-term, but it certainly seems to be a hard problem and hasn't been done yet. I don't know enough about all the internal considerations to make much of a judgment on that myself.

I propose a FileSystem implementation that wraps another FileSystem instance and provides locking of FileSystem operations to ensure correct semantics. Locking could quite possibly be done on the same ZooKeeper ensemble as an HBase cluster already uses (I'm sure there are some performance considerations here that deserve more attention). I've put together a proof-of-concept on which I've tested some aspects of atomic renames and atomic file creates. Both of these tests fail reliably on a naked s3a instance. I've also done a small YCSB run against a small cluster to sanity check other functionality and was successful. I will post the patch, and my laundry list of things that still need work. The WAL is still placed on HDFS, but the HBase root directory is otherwise on S3.
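
To make the wrapping idea concrete, here is a minimal in-process sketch; it is not the proof-of-concept patch itself. `SimpleStore`, `FlatStore`, and `LockingStore` are hypothetical names, and a single `ReentrantReadWriteLock` stands in for the ZooKeeper-based locking suggested above, so it only coordinates callers inside one JVM.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the small slice of a file system API used here. */
interface SimpleStore {
    boolean create(String key, String data); // returns false if the key already exists
    String read(String key);                 // returns null if the key is absent
    void rename(String src, String dst);
}

/** Flat key/value store whose create and rename are multi-step, like an object store. */
final class FlatStore implements SimpleStore {
    private final Map<String, String> objects = new ConcurrentHashMap<>();

    public boolean create(String key, String data) {
        if (objects.containsKey(key)) { // check ...
            return false;
        }
        objects.put(key, data);         // ... then write: another writer could slip in between
        return true;
    }

    public String read(String key) {
        return objects.get(key);
    }

    public void rename(String src, String dst) {
        String data = objects.get(src);
        if (data == null) {
            return;
        }
        objects.put(dst, data);         // copy ...
        objects.remove(src);            // ... then delete: both keys are briefly visible
    }
}

/** Wrapper that takes a lock around each operation so intermediate states stay hidden. */
final class LockingStore implements SimpleStore {
    private final SimpleStore delegate;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    LockingStore(SimpleStore delegate) {
        this.delegate = delegate;
    }

    public boolean create(String key, String data) {
        lock.writeLock().lock();
        try {
            return delegate.create(key, data);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public String read(String key) {
        lock.readLock().lock();
        try {
            return delegate.read(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public void rename(String src, String dst) {
        lock.writeLock().lock();
        try {
            delegate.rename(src, dst);
        } finally {
            lock.writeLock().unlock();
        }
    }
}

final class LockingStoreDemo {
    public static void main(String[] args) {
        SimpleStore fs = new LockingStore(new FlatStore());
        fs.create("data/tmp/region1", "hfile-bytes");
        fs.rename("data/tmp/region1", "data/region1");
        System.out.println(fs.read("data/tmp/region1")); // prints null
        System.out.println(fs.read("data/region1"));     // prints hfile-bytes
    }
}
```

One global lock is the simplest possible choice for a sketch. Finer-grained, per-path locking would allow more concurrency at the cost of more bookkeeping, and coordinating several processes would need a shared lock service rather than an in-memory lock.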

Note that my prototype is built on Hadoop's source tree right now. That's purely for my convenience in putting it together quickly, as that's where I mostly work. I actually think long-term, if this is accepted as a good solution, it makes sense for it to live in HBase (or its own repository). It only depends on stable, public APIs in Hadoop and is targeted entirely at HBase's needs, so it should be able to iterate on the HBase community's terms alone.

Another idea stevel@apache.org proposed to me is that of an inode-based FileSystem that keeps hierarchical metadata in a more appropriate store that would allow the required transactions (maybe a special table in HBase could provide that store itself for other tables), and stores the underlying files with unique identifiers on S3. This allows renames to actually become fast instead of just large atomic operations. It does however place a strong dependency on the metadata store. I have not explored this idea much. My current proof-of-concept has been pleasantly simple, so I think it's the right solution unless it proves unable to provide the required performance characteristics.
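
The inode-based alternative can be sketched in a few lines, purely as an illustration of why renames become cheap. Everything here is hypothetical: two in-memory maps stand in for the transactional metadata store and for S3, and only single-file renames are covered (directory renames would need tree-structured metadata).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical inode-style layout: paths live in metadata, bytes live under opaque ids. */
final class InodeStoreSketch {
    // Stand-in for a metadata store that supports transactions.
    private final Map<String, String> pathToId = new HashMap<>();
    // Stand-in for the object store, keyed by unique identifier rather than by path.
    private final Map<String, String> blobs = new HashMap<>();

    synchronized void write(String path, String data) {
        String id = UUID.randomUUID().toString();
        blobs.put(id, data);
        String previousId = pathToId.put(path, id);
        if (previousId != null) {
            blobs.remove(previousId); // drop the blob that the path used to point at
        }
    }

    synchronized String read(String path) {
        String id = pathToId.get(path);
        return id == null ? null : blobs.get(id);
    }

    /** Rename only rewrites metadata; the stored blob is never copied or moved. */
    synchronized boolean rename(String src, String dst) {
        if (!pathToId.containsKey(src) || pathToId.containsKey(dst)) {
            return false;
        }
        pathToId.put(dst, pathToId.remove(src));
        return true;
    }

    /** Exposes the blob id so callers can see that rename leaves it unchanged. */
    synchronized String idOf(String path) {
        return pathToId.get(path);
    }
}
```

The sketch also shows the dependency mentioned above: if the path-to-id mapping is lost, the blobs are still present but can no longer be found by name.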

Attachments


- HBASE-22149-hbase-filesystem-1.patch (29/Apr/19 22:00, 200 kB, Sean Mackrory)
- HBASE-22149-hbase-filesystem-1.patch (30/Apr/19 13:20, 197 kB, Sean Mackrory)
- HBASE-22149-hbase-5.patch (25/Apr/19 22:59, 166 kB, Sean Mackrory)
- HBASE-22149-hbase-4.patch (22/Apr/19 16:54, 153 kB, Sean Mackrory)
- HBASE-22149-hbase-3.patch (10/Apr/19 01:31, 136 kB, Sean Mackrory)
- HBASE-22149-hbase-2.patch (09/Apr/19 13:41, 130 kB, Sean Mackrory)
- HBASE-22149-hbase.patch (03/Apr/19 23:34, 89 kB, Sean Mackrory)
- HBASE-22149-hadoop.patch (02/Apr/19 20:15, 93 kB, Sean Mackrory)

Issue Links

is related to

- HBASE-22393 HBOSS: Shaded external dependencies to avoid conflicts with Hadoop and HBase (Resolved)
- HBASE-22386 HBOSS: Limit depth that listing locks check for other locks (Resolved)
- HBASE-22427 HBOSS: TestTreeLockManager fails on non-ZK implementations (Resolved)
- HBASE-22416 HBOSS: unit tests fail with ConnectionLoss when IPv6 enabled and not set up locally (Resolved)
- HBASE-22437 HBOSS: Add Hadoop 2 / 3 profiles (Resolved)
- HBASE-22415 HBOSS: Reduce log verbosity in ZKTreeLockManager when waiting on a parent/child node lock (Resolved)
- HBASE-22493 HBOSS: Document supported hadoop versions. (Resolved)

links to

GitHub Pull Request #1
