HBase / HBASE-14790

Implement a new DFSOutputStream for logging WAL only

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-beta-1, 2.0.0
    • Component/s: wal
    • Labels:
      None
    • Release Note:
      Implements FanOutOneBlockAsyncDFSOutput, an output stream for writing the WAL only. The WAL provider that uses this class is AsyncFSWALProvider.

      It is based on netty and writes to 3 DNs concurrently (fan-out), which generally leads to lower latency. It is also fail-fast: the stream becomes unwritable immediately after any read/write error, and there is no pipeline recovery. In that case you need to call recoverLease to force-close the output. It also only supports writing a file with a single block. For the WAL this is acceptable behavior, since we can always open a new file when the old one is broken. The performance analysis in HBASE-16890 shows that it performs better.
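
The fan-out and fail-fast behavior described above can be illustrated with a toy model. This is a minimal sketch using only the JDK, not the netty-based FanOutOneBlockAsyncDFSOutput itself; the names `FanOutSketch`, `Replica`, `MemoryReplica`, and `FanOutOutput` are hypothetical and stand in for the DN connections and the output stream.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Toy model of a fan-out, fail-fast output. Illustration only, not HBase code. */
class FanOutSketch {

    /** Hypothetical stand-in for a connection to one DN. */
    interface Replica {
        CompletableFuture<Void> send(byte[] packet);
    }

    /** In-memory replica that can be told to fail every send. */
    static final class MemoryReplica implements Replica {
        final List<byte[]> received = new ArrayList<>();
        volatile boolean failNext = false;

        @Override
        public synchronized CompletableFuture<Void> send(byte[] packet) {
            CompletableFuture<Void> ack = new CompletableFuture<>();
            if (failNext) {
                ack.completeExceptionally(new IOException("replica down"));
            } else {
                received.add(packet.clone());
                ack.complete(null);
            }
            return ack;
        }
    }

    /** Sends each packet to every replica at once; any error breaks the output for good. */
    static final class FanOutOutput {
        private final List<? extends Replica> replicas;
        private volatile boolean broken = false;

        FanOutOutput(List<? extends Replica> replicas) {
            this.replicas = replicas;
        }

        CompletableFuture<Void> write(byte[] packet) {
            if (broken) {
                // Fail-fast: no recovery is attempted on a broken output.
                CompletableFuture<Void> rejected = new CompletableFuture<>();
                rejected.completeExceptionally(new IOException("output is broken"));
                return rejected;
            }
            // Fan-out: hand the packet to all replicas, then wait for every ack.
            CompletableFuture<?>[] acks = new CompletableFuture<?>[replicas.size()];
            for (int i = 0; i < acks.length; i++) {
                acks[i] = replicas.get(i).send(packet);
            }
            return CompletableFuture.allOf(acks).whenComplete((ok, err) -> {
                if (err != null) {
                    broken = true;
                }
            });
        }

        boolean isBroken() {
            return broken;
        }
    }

    public static void main(String[] args) {
        MemoryReplica dn1 = new MemoryReplica();
        MemoryReplica dn2 = new MemoryReplica();
        MemoryReplica dn3 = new MemoryReplica();
        FanOutOutput out = new FanOutOutput(Arrays.asList(dn1, dn2, dn3));

        out.write(new byte[] {1}).join();   // acked by all three replicas
        dn2.failNext = true;
        boolean failed = out.write(new byte[] {2}).isCompletedExceptionally();
        System.out.println("second write failed: " + failed);
        System.out.println("output broken: " + out.isBroken());
    }
}
```

In this sketch the failed write still reaches the healthy replicas, so the replicas can diverge. That is one way to see why the real output has to be force-closed (via recoverLease) and replaced with a new file rather than repaired in place.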

      Behavior changes:
      1. Because we now write to 3 DNs concurrently, every DN is treated as the last one in the pipeline, so under the HDFS visibility guarantee the data becomes visible as soon as it arrives at a DN. This means replication may read uncommitted data and replicate it to the remote cluster, causing data inconsistency. HBASE-14004 addresses this problem.
      2. There is no sync failure. When the output is broken, we open a new file and write all the unacked WAL entries to the new file. This means that WAL files may contain duplicated entries. HBASE-14949 addresses this problem.
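
To make the second behavior change concrete, here is a small sketch of how replaying unacked entries into a new file can leave duplicates across WAL files. It is a toy model under stated assumptions, not the AsyncFSWAL implementation: `UnackedReplaySketch`, `ToyWal`, and the `ackLost` flag are hypothetical, and each "file" is just an in-memory list of sequence ids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model: a broken output is replaced and unacked entries are rewritten. */
class UnackedReplaySketch {

    static final class ToyWal {
        /** Every file ever opened, each holding the sequence ids written to it. */
        final List<List<Long>> files = new ArrayList<>();
        private final Deque<Long> unacked = new ArrayDeque<>();
        private List<Long> current;

        ToyWal() {
            current = roll();
        }

        /** Opens a new, empty file and makes it the current one. */
        private List<Long> roll() {
            List<Long> file = new ArrayList<>();
            files.add(file);
            return file;
        }

        /** Buffers an entry; it stays unacked until a sync succeeds. */
        void append(long seqId) {
            unacked.addLast(seqId);
        }

        /**
         * Writes all unacked entries to the current file. If the ack is lost
         * (assumption: the bytes may already have reached the file), the file
         * is abandoned and the same entries are written again to a new file.
         * The caller never sees a sync failure.
         */
        void sync(boolean ackLost) {
            current.addAll(unacked);
            if (ackLost) {
                current = roll();
                current.addAll(unacked);
            }
            unacked.clear();
        }

        /** All entries in file order, duplicates included. */
        List<Long> allEntries() {
            List<Long> all = new ArrayList<>();
            for (List<Long> file : files) {
                all.addAll(file);
            }
            return all;
        }
    }

    public static void main(String[] args) {
        ToyWal wal = new ToyWal();
        wal.append(1);
        wal.append(2);
        wal.sync(false);          // entries 1 and 2 are acked
        wal.append(3);
        wal.append(4);
        wal.sync(true);           // ack lost: 3 and 4 are rewritten to a new file
        System.out.println(wal.files);
    }
}
```

Anything that later reads these files, such as WAL splitting, has to tolerate the same entry appearing in two files, which is the situation HBASE-14949 was opened to handle.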

      Description

      The original DFSOutputStream is very powerful and aims to serve all purposes. In fact, we do not need most of its features if we only want to write the WAL. For example, we do not need pipeline recovery, since we can just close the old logger and open a new one. Nor do we need to write multiple blocks, since we can also open a new logger when the old file becomes too large.

      Most importantly, the complicated logic of the original DFSOutputStream makes it hard to handle all the corner cases that can cause data loss or data inconsistency (such as HBASE-14004). The same complexity also forces us to use some magical tricks to improve performance. For example, we need multiple threads calling hflush when logging, and we currently use 5 threads. But why 5 and not 10 or 100?

      So I propose that we implement our own DFSOutputStream for logging the WAL, both for correctness and for performance.

          Issue Links

          1. Resolve name conflict when splitting if there are duplicated WAL entries (Sub-task, Resolved, Duo Zhang)
          2. Implement a fan out HDFS OutputStream (Sub-task, Resolved, Duo Zhang)
          3. Implement an asynchronous FSHLog (Sub-task, Resolved, Duo Zhang)
          4. Add SASL support for fan out OutputStream (Sub-task, Resolved, Duo Zhang)
          5. Connection leak in FanOutOneBlockAsyncDFSOutputHelper (Sub-task, Resolved, Duo Zhang)
          6. Make AsyncFSWAL as our default WAL (Sub-task, Resolved, Duo Zhang)
          7. Make multi WAL work with WALs other than FSHLog (Sub-task, Resolved, Duo Zhang)
          8. Implement secure async protobuf wal writer (Sub-task, Resolved, Duo Zhang)
          9. Implement an AsyncOutputStream which can work with any FileSystem implementation (Sub-task, Resolved, Duo Zhang)
          10. Handle large edits for asynchronous WAL (Sub-task, Resolved, Duo Zhang)
          11. Add Transparent Data Encryption support for FanOutOneBlockAsyncDFSOutput (Sub-task, Resolved, Duo Zhang)
          12. Add testcase for AES encryption (Sub-task, Resolved, Duo Zhang)
          13. Rename DefaultWALProvider to a more specific name and clean up unnecessary reference to it (Sub-task, Resolved, Duo Zhang)
          14. Analyze the performance of AsyncWAL and fix the same (Sub-task, Resolved, ramkrishna.s.vasudevan)
          15. Try copying to the Netty ByteBuf directly from the WALEdit (Sub-task, Resolved, Duo Zhang)
          16. Refactor FanOutOneBlockAsyncDFSOutput (Sub-task, Resolved, Duo Zhang)
          17. Use RingBuffer to reduce the contention in AsyncFSWAL (Sub-task, Resolved, Duo Zhang)
          18. Check why we roll a wal writer at 10MB when the configured roll size is 120M+ with AsyncFSWAL (Sub-task, Resolved, Duo Zhang)
          19. Calcuate suitable ByteBuf size when allocating send buffer in FanOutOneBlockAsyncDFSOutput (Sub-task, Resolved, ramkrishna.s.vasudevan)
          20. Do not issue sync request when there are still entries in ringbuffer (Sub-task, Resolved, Duo Zhang)
          21. Remove LogRollerExitedChecker (Sub-task, Resolved, Duo Zhang)
          22. AsyncFSWAL may issue unnecessary AsyncDFSOutput.sync (Sub-task, Resolved, Duo Zhang)
          23. improve asyncWAL by using Independent thread for netty #IO in FanOutOneBlockAsyncDFSOutput (Sub-task, Resolved, Duo Zhang)
          24. Use EventLoopGroup to create AsyncFSOutput (Sub-task, Resolved, Duo Zhang)
          25. Add note for operators to refguide on AsyncFSWAL (Sub-task, Resolved, Michael Stack)

              People

              • Assignee: Duo Zhang (zhangduo)
              • Reporter: Duo Zhang (zhangduo)
              • Votes: 0
              • Watchers: 39
