HBASE-18784

Use of filesystem that requires hflush / hsync / append / etc should query outputstream capabilities

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 2.0.0-alpha-2
    • Fix Version/s: 2.0.0
    • Component/s: Filesystem Integration
    • Release Note:

      If HBase runs on top of Apache Hadoop libraries that support the needed APIs, it will verify that the underlying Filesystem implementations provide the durability mechanisms it needs to operate safely. The needed APIs *should* be present in Hadoop 3 releases and in Hadoop 2 releases starting with the Hadoop 2.9 series. If the APIs are not available, HBase behaves as it did in previous releases (that is, it moves forward assuming such a check would pass).

      Where this check fails, it is unsafe to rely on HBase in a production setting. In the event of process or node failure, the HBase RegionServer process may be unable to access all of the data it previously wrote to its write-ahead log, resulting in data loss. Likewise, the HBase master process may lose all or part of the write-ahead log that it relies on for cluster management operations, leaving the cluster in an inconsistent state that we aren't sure it could recover from.

      Notably, the LocalFileSystem implementation provided by Hadoop reports (accurately) via these new APIs that it cannot provide the durability HBase needs to operate. As such, the current instructions for single-node HBase operation have been updated with both a) how to bypass this safety check and b) a strong warning about the dire consequences of doing so outside of a dev/test environment.
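
The behavior described in this note can be summarized as a small decision table. The sketch below is illustrative only and is not taken from the patch: `DurabilityCheckPolicy`, `Outcome`, and the `enforce` flag are hypothetical names, and an empty `Optional` stands for Hadoop libraries that lack the capability APIs. To the best of our knowledge, the HBase reference guide's standalone instructions expose the bypass as the `hbase.unsafe.stream.capability.enforce` property set to `false`; treat that name as an assumption and confirm it against the guide for your version.

```java
import java.util.Optional;

// Hypothetical sketch of the policy in the release note; not HBase code.
final class DurabilityCheckPolicy {

    enum Outcome {
        PROCEED,         // check passed, or could not be performed
        PROCEED_UNSAFE,  // check failed but enforcement was bypassed (dev/test only)
        REFUSE           // check failed and enforcement is on
    }

    /**
     * @param providesDurability empty when the Hadoop libraries lack the
     *        capability APIs; otherwise what the filesystem reported
     * @param enforce false models the documented bypass of the safety check
     */
    static Outcome decide(Optional<Boolean> providesDurability, boolean enforce) {
        if (!providesDurability.isPresent()) {
            // Older Hadoop: behave as in previous releases and assume a pass.
            return Outcome.PROCEED;
        }
        if (providesDurability.get()) {
            return Outcome.PROCEED;
        }
        // The filesystem reported that it cannot provide the needed durability.
        return enforce ? Outcome.REFUSE : Outcome.PROCEED_UNSAFE;
    }

    public static void main(String[] args) {
        System.out.println("API missing:        " + decide(Optional.<Boolean>empty(), true));
        System.out.println("durable filesystem: " + decide(Optional.of(true), true));
        System.out.println("local fs, enforced: " + decide(Optional.of(false), true));
        System.out.println("local fs, bypassed: " + decide(Optional.of(false), false));
    }
}
```

Only the last case corresponds to the single-node bypass, which is why the warning about using it outside dev/test applies there.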

    Description

      In places where we rely on the underlying filesystem upholding the promises of hflush/hsync (most importantly the WAL), we should use the new interfaces provided by HDFS-11644 to fail loudly when those capabilities are not present (e.g. on S3, on erasure-coded (EC) mounts, etc).
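
One way to picture the fail-loudly behavior requested here, as a self-contained sketch rather than the patch's actual code. `CapabilityProbe` is a local stand-in modeled on the `hasCapability(String)` method of Hadoop's `StreamCapabilities` interface from HDFS-11644; `FakeStream` and `requireCapabilities` are hypothetical helpers, and the capability names `"hflush"` and `"hsync"` are assumed to match the strings Hadoop uses. Treating a stream with no probe at all as a failure is a choice made for this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these types are HBase or Hadoop classes.
final class WalStreamCheck {

    /** Local stand-in for the capability query added by HDFS-11644. */
    interface CapabilityProbe {
        boolean hasCapability(String capability);
    }

    /** In-memory stream that advertises a fixed set of capabilities. */
    static final class FakeStream extends ByteArrayOutputStream implements CapabilityProbe {
        private final Set<String> supported;

        FakeStream(String... supported) {
            this.supported = new HashSet<>(Arrays.asList(supported));
        }

        @Override
        public boolean hasCapability(String capability) {
            return supported.contains(capability);
        }
    }

    /** Throws instead of silently writing a log that may not survive a crash. */
    static void requireCapabilities(OutputStream out, String... needed) throws IOException {
        if (!(out instanceof CapabilityProbe)) {
            throw new IOException("Cannot verify durability: "
                + out.getClass().getName() + " does not expose a capability probe");
        }
        CapabilityProbe probe = (CapabilityProbe) out;
        for (String capability : needed) {
            if (!probe.hasCapability(capability)) {
                throw new IOException("Output stream lacks required capability: " + capability);
            }
        }
    }

    public static void main(String[] args) {
        try {
            requireCapabilities(new FakeStream("hflush", "hsync"), "hflush", "hsync");
            System.out.println("durable stream accepted");
            // Models a store that can flush but not sync; this call throws.
            requireCapabilities(new FakeStream("hflush"), "hflush", "hsync");
        } catch (IOException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

Running the check once, when the WAL writer is created, surfaces the problem at startup rather than after a failure has already lost edits.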

      Attachments

        1. HBASE-18784-branch-1.v2.patch
          132 kB
          Sean Busbey
        2. HBASE-18784.2.patch
          148 kB
          Sean Busbey
        3. HBASE-18784.1.patch
          148 kB
          Sean Busbey
        4. HBASE-18784.0.patch
          146 kB
          Sean Busbey

            People

              Assignee: Sean Busbey
              Reporter: Sean Busbey