Hadoop HDFS / HDFS-12943

Consistent Reads from Standby Node

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.10.0, 3.3.0, 3.1.4, 3.2.2
    • Component/s: hdfs
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Observer is a new type of NameNode, in addition to the Active and Standby NameNodes in HA settings. Like a Standby Node, an Observer Node maintains a replica of the namespace. In addition, it serves client read requests.

      To ensure read-after-write consistency within a single client, a state ID is introduced in RPC headers. The Observer responds to a client request only after its own state has caught up with the client’s state ID, which the client previously received from the Active NameNode.

      Clients can explicitly invoke a new client protocol call, msync(), which ensures that subsequent reads by this client from an Observer are consistent.
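
The state ID and msync() behaviour described above can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: the classes `Node` and `Client`, the helper `tailEdits`, and the use of an empty `Optional` to stand for "the Observer holds the call until it catches up" are all hypothetical and are not the HDFS implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy model only: every name below is hypothetical, not an HDFS class.
public class StateIdSketch {

    /** NameNode stand-in; its state ID is the number of edits it has applied. */
    static class Node {
        final List<String> namespace = new ArrayList<>();
        long stateId() { return namespace.size(); }
    }

    /** Client stand-in that remembers the last state ID it has seen. */
    static class Client {
        long lastSeenStateId = 0;

        /** Writes go to the Active; the response carries the Active's new state ID. */
        void create(Node active, String path) {
            active.namespace.add(path);
            lastSeenStateId = Math.max(lastSeenStateId, active.stateId());
        }

        /** Explicit sync: learn the Active's current state ID without writing. */
        void msync(Node active) {
            lastSeenStateId = Math.max(lastSeenStateId, active.stateId());
        }

        /**
         * Read from the Observer. An empty result models the Observer holding
         * the call because it has not yet reached the client's state ID.
         */
        Optional<Boolean> exists(Node observer, String path) {
            if (observer.stateId() < lastSeenStateId) {
                return Optional.empty();
            }
            return Optional.of(observer.namespace.contains(path));
        }
    }

    /** Models the Observer applying the edits it has not yet seen. */
    static void tailEdits(Node active, Node observer) {
        for (int i = observer.namespace.size(); i < active.namespace.size(); i++) {
            observer.namespace.add(active.namespace.get(i));
        }
    }

    public static void main(String[] args) {
        Node active = new Node();
        Node observer = new Node();
        Client writer = new Client();
        Client reader = new Client();

        writer.create(active, "/a");
        System.out.println(writer.exists(observer, "/a")); // Optional.empty
        System.out.println(reader.exists(observer, "/a")); // Optional[false]

        reader.msync(active);
        System.out.println(reader.exists(observer, "/a")); // Optional.empty

        tailEdits(active, observer);
        System.out.println(reader.exists(observer, "/a")); // Optional[true]
    }
}
```

In this model the writer is never served a stale answer, while a second client that has seen nothing can still read stale state until it calls msync(), which is the case the explicit call is meant to cover.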

      A new client-side ObserverReadProxyProvider is introduced to switch automatically between the Active and Observer NameNodes, submitting write requests to the Active and read requests to the Observer.
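
The read/write switching can be sketched with a plain Java dynamic proxy. This is a minimal illustration, not the ObserverReadProxyProvider code: the `ToyProtocol` interface, the `ToyNameNode` class, the local `@ReadOnly` annotation, and the `routingProxy` helper are hypothetical stand-ins, and details such as falling back to the Active when no Observer is available are not modeled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name below is hypothetical, not an HDFS class.
public class ReadRoutingSketch {

    /** Marks protocol methods that do not modify the namespace. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ReadOnly {}

    /** A two-method stand-in for a client protocol. */
    interface ToyProtocol {
        void create(String path);

        @ReadOnly
        boolean exists(String path);
    }

    /** Records the calls it receives so the routing can be inspected. */
    static class ToyNameNode implements ToyProtocol {
        final List<String> calls = new ArrayList<>();

        @Override
        public void create(String path) { calls.add("create " + path); }

        @Override
        public boolean exists(String path) { calls.add("exists " + path); return true; }
    }

    /** Returns a proxy that sends @ReadOnly calls to the observer, the rest to the active. */
    static ToyProtocol routingProxy(ToyProtocol active, ToyProtocol observer) {
        return (ToyProtocol) Proxy.newProxyInstance(
            ToyProtocol.class.getClassLoader(),
            new Class<?>[] {ToyProtocol.class},
            (proxy, method, args) -> {
                ToyProtocol target =
                    method.isAnnotationPresent(ReadOnly.class) ? observer : active;
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the target's own exception
                }
            });
    }

    public static void main(String[] args) {
        ToyNameNode active = new ToyNameNode();
        ToyNameNode observer = new ToyNameNode();
        ToyProtocol client = routingProxy(active, observer);

        client.create("/a");
        client.exists("/a");

        System.out.println("active saw:   " + active.calls);   // [create /a]
        System.out.println("observer saw: " + observer.calls); // [exists /a]
    }
}
```

Routing on a per-method annotation keeps the decision out of the calling code, so the application uses one client object and never chooses a NameNode itself.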

      Description

      StandbyNode in HDFS is a replica of the active NameNode. The states of the NameNodes are coordinated via the journal. It is natural to consider StandbyNode as a read-only replica. As with any replicated distributed system, the problem of stale reads must be resolved. Our main goal is to provide consistent reads from the standby in order to support the wide range of existing applications running on top of HDFS.

        Attachments

        1. ConsistentReadsFromStandbyNode.pdf
          396 kB
          Konstantin Shvachko
        2. ConsistentReadsFromStandbyNode.pdf
          394 kB
          Konstantin Shvachko
        3. HDFS-12943-001.patch
          328 kB
          Konstantin Shvachko
        4. HDFS-12943-002.patch
          354 kB
          Konstantin Shvachko
        5. HDFS-12943-003.patch
          353 kB
          Konstantin Shvachko
        6. HDFS-12943-004.patch
          353 kB
          Konstantin Shvachko
        7. TestPlan-ConsistentReadsFromStandbyNode.pdf
          79 kB
          Konstantin Shvachko

          Issue Links

          1. Tailing edits should not update quota counts on ObserverNode | Sub-task | Resolved | Erik Krogen
          2. Changes to the NameNode to support reads from standby | Sub-task | Resolved | Chao Sun
          3. Introduce ObserverReadProxyProvider | Sub-task | Resolved | Chao Sun
          4. [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC | Sub-task | Resolved | Erik Krogen
          5. Make Client field AlignmentContext non-static. | Sub-task | Resolved | Plamen Jeliazkov
          6. Add stateId to RPC headers. | Sub-task | Resolved | Plamen Jeliazkov
          7. Fine-grained locking while consuming journal stream. | Sub-task | Resolved | Konstantin Shvachko
          8. StandbyNode should upload FsImage to ObserverNode after checkpointing. | Sub-task | Resolved | Chen Liang
          9. Add haadmin commands to transition between standby and observer | Sub-task | Resolved | Chao Sun
          10. Support observer reads for WebHDFS | Sub-task | Open | Chao Sun
          11. Allow Observer to participate in NameNode failover | Sub-task | Open | Unassigned
          12. Standby NameNode should roll active edit log when checkpointing | Sub-task | Resolved | Unassigned
          13. Add lastSeenStateId to RpcRequestHeader. | Sub-task | Resolved | Plamen Jeliazkov
          14. Support observer node from Router-Based Federation | Sub-task | Open | Chao Sun
          15. Support observer nodes in MiniDFSCluster | Sub-task | Resolved | Konstantin Shvachko
          16. Add ReadOnly annotation to methods in ClientProtocol | Sub-task | Resolved | Chao Sun
          17. [Edit Tail Fast Path Pt 1] Enhance JournalNode with an in-memory cache of recent edit transactions | Sub-task | Resolved | Erik Krogen
          18. [Edit Tail Fast Path Pt 2] Add ability for JournalNode to serve edits via RPC | Sub-task | Resolved | Erik Krogen
          19. [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC | Sub-task | Resolved | Erik Krogen
          20. [Edit Tail Fast Path Pt 4] Cleanup: integration test, documentation, remove unnecessary dummy sync | Sub-task | Resolved | Erik Krogen
          21. Move RPC response serialization into Server.doResponse | Sub-task | Resolved | Plamen Jeliazkov
          22. Introduce msync API call | Sub-task | Resolved | Chen Liang
          23. NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode | Sub-task | Open | Unassigned
          24. ClientGCIContext should be correctly named ClientGSIContext | Sub-task | Resolved | Konstantin Shvachko
          25. Use getServiceStatus to discover observer namenodes | Sub-task | Resolved | Chao Sun
          26. Add msync server implementation. | Sub-task | Resolved | Chen Liang
          27. TestStateAlignmentContextWithHA should use real ObserverReadProxyProvider instead of AlignmentContextProxyProvider. | Sub-task | Resolved | Plamen Jeliazkov
          28. Implement performFailover logic for ObserverReadProxyProvider. | Sub-task | Resolved | Erik Krogen
          29. Postpone NameNode state discovery in ObserverReadProxyProvider until the first real RPC call. | Sub-task | Resolved | Chen Liang
          30. Unit tests for standby reads. | Sub-task | Resolved | Unassigned
          31. ObserverReadProxyProvider should work with IPFailoverProxyProvider | Sub-task | Resolved | Konstantin Shvachko
          32. Reduce logging frequency of QuorumJournalManager#selectInputStreams | Sub-task | Resolved | Erik Krogen
          33. Limit logging frequency of edit tail related statements | Sub-task | Resolved | Erik Krogen
          34. Refactor NameNode failover proxy providers | Sub-task | Resolved | Konstantin Shvachko
          35. Remove AlignmentContext from AbstractNNFailoverProxyProvider | Sub-task | Resolved | Konstantin Shvachko
          36. Only some protocol methods should perform msync wait | Sub-task | Resolved | Erik Krogen
          37. ObserverNode should reject read requests when it is too far behind. | Sub-task | Resolved | Konstantin Shvachko
          38. Add mechanism to allow certain RPC calls to bypass sync | Sub-task | Resolved | Chen Liang
          39. Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode | Sub-task | Resolved | Chao Sun
          40. Add a configuration to turn on/off observer reads | Sub-task | Open | Shweta
          41. Handle BlockMissingException when reading from observer | Sub-task | Resolved | Chao Sun
          42. Unit Test for transitioning between different states | Sub-task | Resolved | Sherwood Zheng
          43. Fix crlf line endings in HDFS-12943 branch | Sub-task | Resolved | Konstantin Shvachko
          44. Test reads from standby on a secure cluster with IP failover | Sub-task | Resolved | Chen Liang
          45. TestObserverNode refactoring | Sub-task | Resolved | Konstantin Shvachko
          46. Introduce the single Observer failure | Sub-task | Resolved | Sherwood Zheng
          47. ObserverReadProxyProvider should enable observer read by default | Sub-task | Resolved | Chen Liang
          48. ObserverReadProxyProviderWithIPFailover should work with HA configuration | Sub-task | Resolved | Chen Liang
          49. Emulate Observer node falling far behind the Active | Sub-task | Resolved | Sherwood Zheng
          50. NN status discovery does not leverage delegation token | Sub-task | Resolved | Chen Liang
          51. Test reads from standby on a secure cluster with Configured failover | Sub-task | Resolved | Plamen Jeliazkov
          52. Allow manual failover between standby and observer | Sub-task | Resolved | Chao Sun
          53. Allow manual transition from Standby to Observer | Sub-task | Resolved | Unassigned
          54. Fix the order of logging arguments in ObserverReadProxyProvider. | Sub-task | Resolved | Ayush Saxena
          55. Fix class cast error in NNThroughputBenchmark with ObserverReadProxyProvider. | Sub-task | Resolved | Chao Sun
          56. ORFPP should also clone DT for the virtual IP | Sub-task | Resolved | Chen Liang
          57. Make ZKFC ObserverNode aware | Sub-task | Resolved | xiangheng
          58. Create user guide for "Consistent reads from Observer" feature. | Sub-task | Resolved | Chao Sun
          59. Move ipfailover config key out of HdfsClientConfigKeys | Sub-task | Resolved | Chen Liang
          60. Handle exception from internalQueueCall | Sub-task | Resolved | Chao Sun
          61. Adjust annotations on new interfaces/classes for SBN reads. | Sub-task | Resolved | Chao Sun
          62. Description errors in the comparison logic of transaction ID | Sub-task | Resolved | xiangheng
          63. Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency | Sub-task | Resolved | Erik Krogen
          64. Document dfs.ha.tail-edits.period in user guide. | Sub-task | Resolved | Chao Sun
          65. ObserverReadInvocationHandler should implement RpcInvocationHandler | Sub-task | Resolved | Konstantin Shvachko
          66. Balancer should work with ObserverNode | Sub-task | Resolved | Erik Krogen
          67. Fix white spaces related to SBN reads. | Sub-task | Resolved | Konstantin Shvachko
          68. [SBN read] Unclear Log.WARN message in GlobalStateIdContext | Sub-task | Resolved | Shweta
          69. [SBN Read] StateId and TrasactionId not present in Trace level logging | Sub-task | Resolved | Shweta
          70. Throwing RemoteException in the time of Read Operation | Sub-task | Resolved | Unassigned
          71. [SBN Read] Add the document link to the top page | Sub-task | Resolved | Takanobu Asanuma
          72. [SBN read] Got an unexpected txid when tail editlog | Sub-task | Resolved | wangzhaohui
          73. Fix logging error in TestEditLog#testMultiStreamsLoadEditWithConfMaxTxns | Sub-task | Resolved | Jonathan Hung
          74. [SBN read] Change client logging to be less aggressive | Sub-task | Resolved | Chen Liang
          75. [SBN read] StanbyNode does not come out of safemode while adding new blocks. | Sub-task | Resolved | Unassigned
          76. [SBN read] reportBadBlock is rejected by Observer. | Sub-task | Open | Unassigned
          77. [SBN read] Revisit GlobalStateIdContext locking when getting server state id | Sub-task | Resolved | Chen Liang
          78. [SBN read] Allow configurably enable/disable AlignmentContext on NameNode | Sub-task | Resolved | Chen Liang

              People

              • Assignee:
                shv Konstantin Shvachko
                Reporter:
                shv Konstantin Shvachko
              • Votes:
                4
                Watchers:
                76
