HADOOP-11708 (Hadoop Common)

CryptoOutputStream synchronization differences from DFSOutputStream break HBase

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels: None
    • Target Version/s:

      Description

      For the write-ahead log, HBase writes to DFS from a single thread and sends sync/flush/hflush from a configurable number of other threads (default 5).

      FSDataOutputStream does not document anything about being thread safe, and it is not thread safe for concurrent writes.

      However, DFSOutputStream is thread safe for concurrent writes and syncs. When it is the stream FSDataOutputStream wraps, the combination is thread safe for 1 writer and multiple syncs (the exact behavior HBase relies on).
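
      The access pattern described above can be sketched with plain java.io and java.util.concurrent types. This is only an illustration, not HBase or Hadoop code: the class name WalAccessPattern, the record format, and the flush count per thread are assumptions, and flush() stands in for the sync/hflush calls HBase actually issues.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of "1 writer, N syncers" sharing one stream. */
public class WalAccessPattern {

    /**
     * Runs one writer task that appends `records` lines to `out` while
     * `syncThreads` other tasks call flush() on the same stream.
     * Correctness depends entirely on `out` tolerating that concurrency.
     */
    public static void run(OutputStream out, int records, int syncThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(syncThreads + 1);
        try {
            List<Future<?>> futures = new ArrayList<>();

            // The single writer: only this task ever calls write().
            futures.add(pool.submit(() -> {
                for (int i = 0; i < records; i++) {
                    byte[] line = ("record-" + i + "\n")
                            .getBytes(StandardCharsets.UTF_8);
                    out.write(line);
                }
                return null;
            }));

            // The syncers: they never write, they only flush.
            for (int t = 0; t < syncThreads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < 100; i++) {
                        out.flush();
                    }
                    return null;
                }));
            }

            // Propagate any IOException thrown inside the tasks.
            for (Future<?> f : futures) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

      Calling WalAccessPattern.run(stream, 1000, 5) mirrors the default of 5 sync threads mentioned above. Whether the output is intact depends on the stream passed in, which is the point of this issue.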

      When HDFS Transparent Encryption is turned on, CryptoOutputStream is inserted between FSDataOutputStream and DFSOutputStream. It is proactively labeled as not thread safe, and this composition is not thread safe for any operations.
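
      One way a caller could restore the single-writer, multiple-syncer guarantee over a layer that does no locking of its own is to serialize every call through a wrapper. The sketch below is an assumption-laden illustration, not the fix adopted for this issue and not Hadoop code: SerializedOutputStream and UnsyncBuffer are hypothetical names, and UnsyncBuffer merely stands in for an intermediate stream that buffers without synchronization.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical wrapper that serializes write/flush/close on one lock. */
public class SerializedOutputStream extends FilterOutputStream {

    public SerializedOutputStream(OutputStream inner) {
        super(inner);
    }

    @Override
    public synchronized void write(int b) throws IOException {
        out.write(b);
    }

    @Override
    public synchronized void write(byte[] b, int off, int len)
            throws IOException {
        out.write(b, off, len);
    }

    @Override
    public synchronized void flush() throws IOException {
        out.flush();
    }

    @Override
    public synchronized void close() throws IOException {
        out.close();
    }

    /**
     * Hypothetical stand-in for a buffering layer with no internal locking.
     * Concurrent write() and flush() calls would race on `count`.
     */
    public static final class UnsyncBuffer extends OutputStream {
        private final OutputStream sink;
        private final byte[] buf = new byte[64];
        private int count;

        public UnsyncBuffer(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public void write(int b) throws IOException {
            if (count == buf.length) {
                flush();
            }
            buf[count++] = (byte) b;
        }

        @Override
        public void flush() throws IOException {
            sink.write(buf, 0, count);
            count = 0;
            sink.flush();
        }
    }
}
```

      The trade-off of this approach is that syncs are fully serialized with writes, so it may give up concurrency that DFSOutputStream itself permits.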

              People

              • Assignee: Sean Busbey (busbey)
              • Reporter: Sean Busbey (busbey)
              • Votes: 0
              • Watchers: 22

                Dates

                • Created:
                  Updated: