[HBASE-18784] Use of filesystem that requires hflush / hsync / append / etc should query outputstream capabilities - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.4.0, 2.0.0-alpha-2
Fix Version/s: 2.0.0
Component/s: Filesystem Integration
Labels:
- s3

Release Note:

When HBase runs on top of Apache Hadoop libraries that support the needed APIs, it verifies that the underlying FileSystem implementations provide the durability mechanisms it needs to operate safely. The needed APIs *should* be present in Hadoop 3 releases and in Hadoop 2 releases starting with the Hadoop 2.9 series. If the APIs are not available, HBase behaves as it did in previous releases (that is, it proceeds as if the check had passed).

Where this check fails, it is unsafe to rely on HBase in a production setting. In the event of process or node failure, the HBase RegionServer process may be unable to access all the data it previously wrote to its write-ahead log, resulting in data loss. Likewise, the HBase master process may lose all or part of the write-ahead log that it relies on for cluster management operations, leaving the cluster in an inconsistent state that we aren't sure it can recover from.

Notably, the LocalFileSystem implementation provided by Hadoop reports (accurately) via these new APIs that it cannot provide the durability HBase needs to operate. As such, the current instructions for single-node HBase operation have been updated with both a) how to bypass this safety check and b) a strong warning about the dire consequences of doing so outside of a dev/test environment.


Description

In places where we rely on the underlying filesystem upholding the promises of hflush/hsync (most importantly the WAL), we should use the new interfaces provided by HDFS-11644 to fail loudly when those guarantees are not available (e.g. on S3, on EC mounts, etc.).
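
As an illustration only (not the code from the attached patches), here is a minimal Java sketch of the behavior the description and release note call for: ask the output stream whether it supports `hflush` and `hsync`, fail loudly if it does not, and assume the check passes when the capability API is missing from the classpath. Hadoop's real interface is `org.apache.hadoop.fs.StreamCapabilities` with a `hasCapability(String)` method. To keep the sketch runnable without Hadoop, the class `StreamCapabilityProbe`, the stand-in interface `CapabilityAware`, the exception `MissingCapabilityException`, and the `interfaceName` parameter are all hypothetical names introduced here.

```java
import java.lang.reflect.Method;

final class StreamCapabilityProbe {

    /** Hypothetical stand-in for Hadoop's StreamCapabilities, so the sketch runs without Hadoop. */
    public interface CapabilityAware {
        boolean hasCapability(String capability);
    }

    /** Hypothetical exception type used for the "fail loudly" path. */
    static final class MissingCapabilityException extends RuntimeException {
        MissingCapabilityException(String message) {
            super(message);
        }
    }

    /**
     * Asks the stream, via reflection, whether it supports a capability.
     * Returns true when the capability interface cannot be loaded at all,
     * mirroring the release note: with older libraries the check is assumed to pass.
     */
    static boolean hasCapability(Object stream, String interfaceName, String capability) {
        final Class<?> iface;
        try {
            iface = Class.forName(interfaceName);
        } catch (ClassNotFoundException e) {
            return true; // API not on the classpath: behave as in previous releases
        }
        if (!iface.isInstance(stream)) {
            return false; // API exists, but this stream cannot answer the query
        }
        try {
            Method query = iface.getMethod("hasCapability", String.class);
            return Boolean.TRUE.equals(query.invoke(stream, capability));
        } catch (ReflectiveOperationException e) {
            return false; // treat an unanswerable query as "not supported"
        }
    }

    /** Throws unless the stream reports both capabilities a WAL writer relies on. */
    static void requireDurable(Object stream, String interfaceName) {
        for (String capability : new String[] {"hflush", "hsync"}) {
            if (!hasCapability(stream, interfaceName, capability)) {
                throw new MissingCapabilityException(
                    "output stream lacks required capability: " + capability);
            }
        }
    }
}
```

Against a real Hadoop classpath, one would presumably pass `"org.apache.hadoop.fs.StreamCapabilities"` as `interfaceName` and the stream returned by `FileSystem.create(...)` as `stream`. The reflective lookup is a design choice that lets the same code run on Hadoop versions that predate the API.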

Attachments

- HBASE-18784-branch-1.v2.patch, 05/Nov/17 03:27, 132 kB, Sean Busbey
- HBASE-18784.2.patch, 02/Nov/17 20:16, 148 kB, Sean Busbey
- HBASE-18784.1.patch, 02/Nov/17 18:26, 148 kB, Sean Busbey
- HBASE-18784.0.patch, 01/Nov/17 18:16, 146 kB, Sean Busbey

Issue Links

relates to

HBASE-21735 Port HBASE-18784 (Use of filesystem that requires hflush / hsync / append / etc should query outputstream capabilities) to branch-1 (Closed)

links to

reviewboard

People

Assignee: Sean Busbey

Reporter: Sean Busbey

Votes: 0

Watchers: 10

Dates

Created: 08/Sep/17 21:35

Updated: 23/Jun/22 19:30

Resolved: 11/Apr/18 22:00