Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.10.0, 3.3.0, 3.1.4, 3.2.2
    • Component/s: hdfs
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Observer is a new type of NameNode in addition to the Active and Standby NameNodes in HA settings. An Observer Node maintains a replica of the namespace, just like a Standby Node, and additionally allows execution of clients' read requests.

      To ensure read-after-write consistency within a single client, a state ID is introduced in RPC headers. The Observer responds to the client request only after its own state has caught up with the client’s state ID, which it previously received from the Active NameNode.

      Clients can explicitly invoke a new client protocol call msync(), which ensures that subsequent reads by this client from an Observer are consistent.

      A new client-side ObserverReadProxyProvider is introduced to automatically switch between the Active and Observer NameNodes, submitting write and read requests to them respectively.
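The release note above can be illustrated with a toy, self-contained sketch of the state-ID handshake (plain Java, not Hadoop code; the class and method names here are invented for illustration): the client remembers the last state ID it saw from the Active and piggybacks it on read RPCs, the Observer answers only once its applied transaction ID has caught up, and msync() simply refreshes the client's state ID from the Active.

```java
import java.util.concurrent.atomic.AtomicLong;

public class StateIdSketch {

    /** Stand-in for the Active NameNode: every write bumps the journal transaction ID. */
    static class Active {
        final AtomicLong txId = new AtomicLong(0);
        long write() { return txId.incrementAndGet(); }   // RPC response carries the new state ID
        long latestStateId() { return txId.get(); }       // what an msync() would fetch
    }

    /** Stand-in for an Observer NameNode: tails edits and answers reads. */
    static class Observer {
        volatile long appliedTxId = 0;
        void tailEditsFrom(Active active) { appliedTxId = active.txId.get(); }
        /** Read-after-write rule: serve only once caught up to the client's state ID. */
        boolean canServe(long clientStateId) { return appliedTxId >= clientStateId; }
    }

    /** Stand-in for a DFS client that carries lastSeenStateId in its RPC headers. */
    static class Client {
        long lastSeenStateId = 0;
        void onActiveResponse(long stateId) { lastSeenStateId = stateId; }
        void msync(Active active) { lastSeenStateId = active.latestStateId(); }
    }

    public static void main(String[] args) {
        Active active = new Active();
        Observer observer = new Observer();
        Client client = new Client();

        client.onActiveResponse(active.write());   // client writes through the Active
        System.out.println(observer.canServe(client.lastSeenStateId)); // false: observer is stale
        observer.tailEditsFrom(active);            // observer catches up via the journal
        System.out.println(observer.canServe(client.lastSeenStateId)); // true: consistent read

        active.write();                            // another client writes
        client.msync(active);                      // msync() refreshes this client's state ID
        System.out.println(observer.canServe(client.lastSeenStateId)); // false until observer tails again
    }
}
```

In the real implementation the "wait until caught up" happens server-side on the Observer, and the state ID travels in the RPC request/response headers; this sketch only models the ordering rule.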

    Description

      StandbyNode in HDFS is a replica of the active NameNode. The states of the NameNodes are coordinated via the journal. It is natural to consider the StandbyNode as a read-only replica, but, as with any replicated distributed system, the problem of stale reads must be resolved. Our main goal is to provide reads from the standby in a consistent way in order to enable a wide range of existing applications running on top of HDFS.

      Attachments

        1. ConsistentReadsFromStandbyNode.pdf
          394 kB
          Konstantin Shvachko
        2. ConsistentReadsFromStandbyNode.pdf
          396 kB
          Konstantin Shvachko
        3. TestPlan-ConsistentReadsFromStandbyNode.pdf
          79 kB
          Konstantin Shvachko
        4. HDFS-12943-001.patch
          328 kB
          Konstantin Shvachko
        5. HDFS-12943-002.patch
          354 kB
          Konstantin Shvachko
        6. HDFS-12943-003.patch
          353 kB
          Konstantin Shvachko
        7. HDFS-12943-004.patch
          353 kB
          Konstantin Shvachko

        Issue Links

          1.
          Tailing edits should not update quota counts on ObserverNode Sub-task Resolved Erik Krogen  
          2.
          Changes to the NameNode to support reads from standby Sub-task Resolved Chao Sun  
          3.
          Introduce ObserverReadProxyProvider Sub-task Resolved Chao Sun  
          4.
          [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC Sub-task Resolved Erik Krogen  
          5.
          Make Client field AlignmentContext non-static. Sub-task Resolved Plamen Jeliazkov  
          6.
          Add stateId to RPC headers. Sub-task Resolved Plamen Jeliazkov  
          7.
          Fine-grained locking while consuming journal stream. Sub-task Resolved Konstantin Shvachko  
          8.
          StandbyNode should upload FsImage to ObserverNode after checkpointing. Sub-task Resolved Chen Liang  
          9.
          Add haadmin commands to transition between standby and observer Sub-task Resolved Chao Sun  
          10.
          Support observer reads for WebHDFS Sub-task Open Chao Sun  
          11.
          Allow Observer to participate in NameNode failover Sub-task Open Unassigned  
          12.
          Standby NameNode should roll active edit log when checkpointing Sub-task Resolved Unassigned  
          13.
          Add lastSeenStateId to RpcRequestHeader. Sub-task Resolved Plamen Jeliazkov  
          14.
          HDFS-13522: Add federated nameservices states to client protocol and propagate it between routers and clients. Sub-task Resolved Simbarashe Dzinamarira

          15.
          Support observer nodes in MiniDFSCluster Sub-task Resolved Konstantin Shvachko  
          16.
          Add ReadOnly annotation to methods in ClientProtocol Sub-task Resolved Chao Sun  
          17.
          [Edit Tail Fast Path Pt 1] Enhance JournalNode with an in-memory cache of recent edit transactions Sub-task Resolved Erik Krogen  
          18.
          [Edit Tail Fast Path Pt 2] Add ability for JournalNode to serve edits via RPC Sub-task Resolved Erik Krogen  
          19.
          [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC Sub-task Resolved Erik Krogen  
          20.
          [Edit Tail Fast Path Pt 4] Cleanup: integration test, documentation, remove unnecessary dummy sync Sub-task Resolved Erik Krogen  
          21.
          Move RPC response serialization into Server.doResponse Sub-task Resolved Plamen Jeliazkov  
          22.
          Introduce msync API call Sub-task Resolved Chen Liang  
          23.
          NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode Sub-task Open Unassigned  
          24.
          ClientGCIContext should be correctly named ClientGSIContext Sub-task Resolved Konstantin Shvachko  
          25.
          Use getServiceStatus to discover observer namenodes Sub-task Resolved Chao Sun  
          26.
          Add msync server implementation. Sub-task Resolved Chen Liang  
          27.
          TestStateAlignmentContextWithHA should use real ObserverReadProxyProvider instead of AlignmentContextProxyProvider. Sub-task Resolved Plamen Jeliazkov  
          28.
          Implement performFailover logic for ObserverReadProxyProvider. Sub-task Resolved Erik Krogen  
          29.
          Postpone NameNode state discovery in ObserverReadProxyProvider until the first real RPC call. Sub-task Resolved Chen Liang  
          30.
          Unit tests for standby reads. Sub-task Resolved Unassigned  
          31.
          ObserverReadProxyProvider should work with IPFailoverProxyProvider Sub-task Resolved Konstantin Shvachko  
          32.
          Reduce logging frequency of QuorumJournalManager#selectInputStreams Sub-task Resolved Erik Krogen  
          33.
          Limit logging frequency of edit tail related statements Sub-task Resolved Erik Krogen  
          34.
          Refactor NameNode failover proxy providers Sub-task Resolved Konstantin Shvachko  
          35.
          Remove AlignmentContext from AbstractNNFailoverProxyProvider Sub-task Resolved Konstantin Shvachko  
          36.
          Only some protocol methods should perform msync wait Sub-task Resolved Erik Krogen  
          37.
          ObserverNode should reject read requests when it is too far behind. Sub-task Resolved Konstantin Shvachko  
          38.
          Add mechanism to allow certain RPC calls to bypass sync Sub-task Resolved Chen Liang  
          39.
          Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode Sub-task Resolved Chao Sun  
          40.
          Add a configuration to turn on/off observer reads Sub-task Open Shweta

          41.
          Handle BlockMissingException when reading from observer Sub-task Resolved Chao Sun  
          42.
          Unit Test for transitioning between different states Sub-task Resolved Sherwood Zheng  
          43.
          Fix crlf line endings in HDFS-12943 branch Sub-task Resolved Konstantin Shvachko  
          44.
          Test reads from standby on a secure cluster with IP failover Sub-task Resolved Chen Liang  
          45.
          TestObserverNode refactoring Sub-task Resolved Konstantin Shvachko  
          46.
          Introduce the single Observer failure Sub-task Resolved Sherwood Zheng  
          47.
          ObserverReadProxyProvider should enable observer read by default Sub-task Resolved Chen Liang  
          48.
          ObserverReadProxyProviderWithIPFailover should work with HA configuration Sub-task Resolved Chen Liang  
          49.
          Emulate Observer node falling far behind the Active Sub-task Resolved Sherwood Zheng  
          50.
          NN status discovery does not leverage delegation token Sub-task Resolved Chen Liang  
          51.
          Test reads from standby on a secure cluster with Configured failover Sub-task Resolved Plamen Jeliazkov  
          52.
          Allow manual failover between standby and observer Sub-task Resolved Chao Sun  
          53.
          Allow manual transition from Standby to Observer Sub-task Resolved Unassigned  
          54.
          Fix the order of logging arguments in ObserverReadProxyProvider. Sub-task Resolved Ayush Saxena  
          55.
          Fix class cast error in NNThroughputBenchmark with ObserverReadProxyProvider. Sub-task Resolved Chao Sun  
          56.
          ORFPP should also clone DT for the virtual IP Sub-task Resolved Chen Liang  
          57.
          Make ZKFC ObserverNode aware Sub-task Resolved xiangheng  
          58.
          Create user guide for "Consistent reads from Observer" feature. Sub-task Resolved Chao Sun  
          59.
          Move ipfailover config key out of HdfsClientConfigKeys Sub-task Resolved Chen Liang  
          60.
          Handle exception from internalQueueCall Sub-task Resolved Chao Sun  
          61.
          Adjust annotations on new interfaces/classes for SBN reads. Sub-task Resolved Chao Sun  
          62.
          Description errors in the comparison logic of transaction ID Sub-task Resolved xiangheng  
          63.
          Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency Sub-task Resolved Erik Krogen  
          64.
          Document dfs.ha.tail-edits.period in user guide. Sub-task Resolved Chao Sun  
          65.
          ObserverReadInvocationHandler should implement RpcInvocationHandler Sub-task Resolved Konstantin Shvachko  
          66.
          Balancer should work with ObserverNode Sub-task Resolved Erik Krogen  
          67.
          Fix white spaces related to SBN reads. Sub-task Resolved Konstantin Shvachko  
          68.
          [SBN read] Unclear Log.WARN message in GlobalStateIdContext Sub-task Resolved Shweta  
          69.
          [SBN Read] StateId and TrasactionId not present in Trace level logging Sub-task Resolved Shweta  
          70.
          Throwing RemoteException in the time of Read Operation Sub-task Resolved Unassigned  
          71.
          [SBN Read] Add the document link to the top page Sub-task Resolved Takanobu Asanuma  
          72.
          [SBN read] Got an unexpected txid when tail editlog Sub-task Resolved Zhaohui Wang  
          73.
          Fix logging error in TestEditLog#testMultiStreamsLoadEditWithConfMaxTxns Sub-task Resolved Jonathan Hung  
          74.
          [SBN read] Change client logging to be less aggressive Sub-task Resolved Chen Liang  
          75.
          [SBN read] StanbyNode does not come out of safemode while adding new blocks. Sub-task Resolved Unassigned  
          76.
          [SBN read] reportBadBlock is rejected by Observer. Sub-task Open Unassigned  
          77.
          [SBN read] Revisit GlobalStateIdContext locking when getting server state id Sub-task Resolved Chen Liang  
          78.
          [SBN read] Allow configurably enable/disable AlignmentContext on NameNode Sub-task Resolved Chen Liang  
          79.
          Prevent Observer NameNode from becoming StandBy NameNode Sub-task Resolved Aihua Xu  
          80.
          RBF: Support observer node from Router-Based Federation Sub-task Resolved Simbarashe Dzinamarira  

          Activity

            shv Konstantin Shvachko added a comment -

            The design document covers motivation, main requirements, and potential solutions. It describes the consistency model, gives examples and use cases, introduces the new API, and discusses implementation details. The roadmap lists four major stages and sets HDFS-10702 as the initial stage.
            xkrogen Erik Krogen added a comment -

            We have been running some performance experiments (using Dynamometer) to try to determine just how large the potential benefits of this feature are. Using the tool, we replayed a few hours of traces from a production cluster against a simulated NameNode, filtering out different percentages of read requests to mimic the ANN's point of view of requests going to the standby. We tried filtering out 0%, 20%, 50%, and 100% of read requests, and also tried replaying our write workload at 2x and 4x speed to estimate throughput under the ideal (all reads offloaded) conditions.

            Metric                            0% Skip   20% Skip   50% Skip   100% Skip   100% Skip (2x)   100% Skip (4x)
            Average Write Latency (ms)           52.8       28.5       18.0        14.0             27.0             73.2
            Average Read Latency (ms)            34.3       20.0       11.5         N/A              N/A              N/A
            RPC Queue AvgTime (ms)               23.0       11.9        7.4         1.7              4.3             20.7
            RPC Queue 50th Percentile (ms)       2.81       0.52       0.47        0.05             0.05             0.04
            RPC Queue 90th Percentile (ms)      24.42      12.51       9.98        0.12             1.49            12.96
            RPC Queue NumOps (k)                 31.0       25.2       16.3         1.5              3.0              6.0
            LockQueueLength Average              45.3       24.9       18.9         7.0             12.5             30.6
            GC Time (ms)                         9.62       7.94       6.13        1.94             3.03             5.49

            The results above indicate that, if we were able to offload all read requests, we should expect up to 4x throughput improvement for the write workload.


            cdouglas Christopher Douglas added a comment -

            Thanks for the document and benchmarking. This is really cool.

            Right now, writes are effectively throttled by blocking reads (e.g., conditional checks before doing a rename). So if the NN is under heavy load, most applications will appear to back off, because all these operations are blocking. If StandbyNodes serve many of these reads, then the write rate to the primary NameNode will increase. Have you tried running workloads against the PoC to get a sense for the "natural" increase in write traffic? In some deployments, would it make sense to disallow reads from the primary, to prevent clients from harming overall cluster throughput?

            shv Konstantin Shvachko added a comment -

            Chris, I do not have PoC numbers. I believe csun can elaborate on this.
            I agree reads are blocking writes on the NN.
            Disallowing reads on the active NN is an interesting twist. The design proposes a new client-side config variable to enable reads from the SBN. I think we can have another one to disable reads from the ANN:

            • dfs.client.standby.reads.enabled = true - enables reads from the standby
            • dfs.client.active.reads.enabled = false - disables reads on the active and directs them exclusively to the standby
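            A client-side hdfs-site.xml fragment with these keys might look like the following. Note the key names come from this comment's proposal and are not necessarily what the feature ultimately shipped with:

            ```xml
            <!-- Hypothetical client-side settings as proposed in this comment;
                 the final implementation may use different key names. -->
            <property>
              <name>dfs.client.standby.reads.enabled</name>
              <value>true</value>    <!-- allow this client to read from the SBN -->
            </property>
            <property>
              <name>dfs.client.active.reads.enabled</name>
              <value>false</value>   <!-- direct all reads away from the Active NameNode -->
            </property>
            ```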
            csun Chao Sun added a comment -

            chris.douglas I did some experiments with the PoC patch, on 2.8.3. They use 5000 containers to issue read/write requests that mimic production workloads (~95% reads, ~5% writes).
            With stale reads enabled, I observed around 60-80K read throughput on the SBN, and around 20K write throughput on the ANN. Without stale reads, the total throughput on the ANN was around 35-40K.
            Also, with stale reads, the write throughput on the ANN was 2-2.5x higher, while the GC time dropped from around 6 s/min to 2 s/min.

            Hope this helps, and let me know if you need more data.

            zhz Zhe Zhang added a comment - - edited

            Thanks csun, interesting results! You used only 1 SBN to serve reads, right? In both configurations (with and without stale reads), I assume you were saturating the system? It's interesting to see that with two NNs serving RPCs (1 ANN + 1 SBN), the throughput actually more than doubled the throughput with 1 ANN. Did you use Namesystem unfair locking?

            If I understand correctly, both your test and the Dynamometer test are more like trace-driven micro-benchmarks, where a container issues a certain type of RPC at a given timestamp. Chris was probably referring to a test job with "real code", like if !file_exists(path) then create_file(path), where the blocking relationships between calls are mimicked.

            chris.douglas: the "natural" increase of write traffic is an interesting question. I don't think the feature will increase the total number of write RPCs (a given job will still issue that many writes overall). Writes within a job could become more bursty, but the job itself will finish sooner. Statistically, the thousands of jobs on the cluster would probably smooth out this increased burstiness.

            csun Chao Sun added a comment -

            Thanks Chao Sun, interesting results! You used only 1 SBN to serve reads, right?

            Yes, I used 1 ANN + 1 SBN + 1 ONN (Observer NN).

            In both configurations (with and without stale reads), I assume you were saturating the system?

            In the stale-read case, the RPC queue time on the ANN was less than 5 ms, while on the ONN it was between 0 and 30 ms. In the non-stale-read case, the RPC queue time on the ANN was around 130-140 ms. So I guess the ANN was not saturated when stale reads were enabled?

            Did you use Namesystem unfair locking?

            The ANN didn't use unfair locking. The ONN used unfair locking + async audit logging (we have an internal patch to use log4j 2.x) + async edit logging. Do you think it will make a difference if unfair locking is used on ANN?

            If I understand correctly, both your test and the Dynamometer test are more like trace-driven micro-benchmarks, where a container issues a certain type of RPC at a given timestamp. Chris was probably referring to a test job with "real code", like if !file_exists(path) then create_file(path), where the blocking relationships between calls are mimicked.

            Yes the test was pretty simple. It is basically:

            loop {
              x = randInt(0, 100)
              if (x < 6) {              // ~6%: write path (create + rename + delete)
                fs.createNewFile(..)
                fs.rename(..)
                fs.delete(..)
              } else if (x < 10) {      // ~4%: directory listing
                fs.listStatus(..)
              } else if (x < 40) {      // ~30%: block locations
                fs.getFileBlockLocations(..)
              } else {                  // ~60%: file status
                fs.getFileStatus(..)
              }
            }
            

            The file listing was done on a directory with 2K files.
            Let me know if you have any suggestion on improving this. It's pretty easy to change the code and re-run the benchmark.


            virajith Virajith Jalaparti added a comment -

            Hi shv, thanks for posting the design document. One thing that wasn't clear to me from the design doc itself was the function of the Observer Nodes. Are these what the clients actually use to read, instead of the real SBN?
            Further, what is the goal of having them? Is it to reduce the load on the SBN further, or graceful degradation during failures of the NN/SBN?

            shv Konstantin Shvachko added a comment -

            what's the function of the Observer Nodes

            Good question. The design doc says that an Observer Node is an SBN that does not do checkpoints. Checkpointing degrades the performance of the SBN; we won't be able to read from it when it's busy. So it's more of a term to distinguish the node that is dedicated to reading - the read-only SBN. A regular SBN is still needed if we want checkpointing and HA on the cluster, which I do. The "Note on HA" discusses some failover scenarios: reading from an ObserverNode elevates its role on the cluster, so you may need to run multiple Observers to sustain the response rate in case of failure.

            virajith Virajith Jalaparti added a comment -

            Thanks for the clarification shv

            shv Konstantin Shvachko added a comment -

            Cut the branch origin/HDFS-12943. When committing please do not forget:

            1. To prepend the jira description with [SBN read]. This should help to distinguish the branch commits from the trunk commits.
            2. To merge trunk into the branch before committing.

            shv Konstantin Shvachko added a comment -

            Updated the design doc. Included a section in Implementation details describing the startup sequence, configuration for NameNodes, and state transitions. Also added references to the fast path for tailing edits.
            xiaochen Xiao Chen added a comment -

            Thanks all for the work! (And sorry for the late response here.) Just read through the design doc and the comments; looks great!

            I have 2 questions:

            About 'Optimization 1':

            Currently atime is created to be the same as mtime, and only gets updated once "dfs.namenode.accesstime.precision" has passed. Does this mean we require a really small atime precision? (Anecdotally, a snapshot will capture a diff on the inode if atime is different. So if someone takes daily snapshots for a week, an atime precision of a week results in only 1 object being created, while an atime precision of < 1 day results in 7.)
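            For reference, the precision discussed here is a NameNode-side setting in hdfs-site.xml; the documented default is 3600000 ms (1 hour), and 0 disables atime updates entirely:

            ```xml
            <!-- atime is persisted at most once per precision window (default: 1 hour). -->
            <property>
              <name>dfs.namenode.accesstime.precision</name>
              <value>3600000</value>
            </property>
            ```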

            About Observer nodes:

            How is the failover handled? Currently the ANN <> SBN failover is done by the failover controllers racing to write to ZooKeeper. For the Observer node <> SBN transition, how is it done?

            csun Chao Sun added a comment -

            xiaochen, on the second question: currently the transition from SBN to Observer is done via a haadmin command, haadmin -transitionToObserver, and vice versa you can transition Observer to SBN via haadmin -transitionToStandby. There is no automatic transition between the two, and no transition is allowed between Observer and ANN.

            In terms of failover, the details are yet to be discussed. Ideally we'd like to allow the Observer to participate in the failover too, but that is still unresolved. I did some preliminary work on that, which you can find in the comments of HDFS-12975. The failover handling is tracked by HDFS-13182.


            shv Konstantin Shvachko added a comment -

            Attached Test Plan document.
            xiangheng xiangheng added a comment -

            Thanks csun. I configured hdfs-site.xml according to the test plan document and used the haadmin command haadmin -transitionToObserver, but the transition from SBN to Observer failed with the message: transitionToObserver: incorrect arguments. Can you describe the Observer NameNode configuration in detail? Thank you very much.

            vagarychen Chen Liang added a comment -

            xiangheng thanks for trying Observer read! What was the full command you ran? It should be something like hdfs haadmin -transitionToObserver <nnID> where nnID is the ID of the name node that you want to transition to Observer. You can run hdfs haadmin -getAllServiceState to list all the valid nnIDs in the cluster.

            xiangheng xiangheng added a comment -

            Thanks vagarychen (and sorry for the late response). I have successfully transitioned the NameNode from Standby to Observer state, but I needed to set ha.automatic-failover=false and stop the ZKFC process. Should we consider supporting ha.automatic-failover while implementing the NameNode state transition? Thank you very much.

            csun Chao Sun added a comment -

            xiangheng: we are still working on the support for state transition between standby/observer in the auto failover environment. You can watch HDFS-14067, HDFS-13182 and HDFS-14059 for more detailed information.

            At the moment, one workaround is to not launch ZK failover controller on the host where the observer is at. Let me know if this works for you.

            xiangheng xiangheng added a comment -

            Thanks csun, I have tried this approach but it failed:

            • one workaround is to not launch ZK failover controller on the host where the observer is at. Let me know if this works for you.

            I have three NameNodes (nn1, nn2, nn3). If I launch the ZK failover controller for nn1 and nn2, and then try to transition nn3 from Standby to Observer, it fails:

            Refusing to manually manage HA state, since it may cause
            a split-brain scenario or other incorrect state.
            If you are very sure you know what you are doing, please
            specify the --forcemanual flag.
            journal# hdfs haadmin -transitionToObserver --forcemanual nn3
            transitionToObserver: incorrect arguments
            I will focus on HDFS-14067, HDFS-13182 and HDFS-14059. Thanks for your suggestions.

            csun Chao Sun added a comment -

            xiangheng you are right - one more patch is required to make this work - you can check HDFS-14067 for the fix. Thanks.

            xiangheng xiangheng added a comment - - edited

            Hi csun, I am very glad to discuss this question with you. I have checked HDFS-14067 and ran a test; it seems the problem is still unsolved. If you agree, I will file a new issue and try my best to solve this problem. Please let me know if you have any suggestions. Thank you very much.


            shv Konstantin Shvachko added a comment -

            Submitting a unified patch for HDFS-12943 branch for review and for a Jenkins run.

            elgoiri Íñigo Goiri added a comment -

            Is there a JIRA tracking the documentation/user guide?
            I think we should be able to push that fairly fast.


            shv Konstantin Shvachko added a comment -

            Hey goiri, see HDFS-14131 - the documentation jira.

            brahmareddy Brahma Reddy Battula added a comment - - edited

            Thanks all for great work here.

            I think write requests can be degraded, no? They also contain some read requests like getFileInfo(), getServerDefaults(), ... (getHAServiceState() is newly added).

            I checked mkdir performance; it looks like below.

            • i) getHAServiceState() took 2+ sec (3 getHAServiceState() + 2 getFileInfo() + 1 mkdirs = 6 calls)
            • ii) Every second request times out [1] and the RPC call is skipped on the observer (7 getHAServiceState() + 4 getFileInfo() + 1 mkdirs = 12 calls). Here the two getFileInfo() calls skipped on the observer succeeded against the Active.
            time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.hacluster=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF1
            real 0m4.314s
            user 0m3.668s
            sys 0m0.272s
            time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.hacluster=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF2
            real 0m22.238s
            user 0m3.800s
            sys 0m0.248s
            

             

            without ObserverReadProxyProvider ( 2 getFileInfo()  + 1 mkdirs() = 3 Calls) 

            time ./hdfs --loglevel debug dfs  -mkdir /TestsCFP
            real 0m2.105s
            user 0m3.768s
            sys 0m0.592s
            

            Please correct me if I am missing anything.

            timedout [1]: every second write request produces the following, and these calls are skipped on the observer. Did I miss something here?

            2018-12-14 11:21:45,312 DEBUG ipc.Client: closing ipc connection to vm1/10.*.*.*:65110: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.*.*.*:58409 remote=vm1/10.*.*.*:65110]
            java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.*.*.*:58409 remote=vm1/10.*.*.*:65110]
             at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
             at java.io.FilterInputStream.read(FilterInputStream.java:133)
             at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
             at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
             at java.io.FilterInputStream.read(FilterInputStream.java:83)
             at java.io.FilterInputStream.read(FilterInputStream.java:83)
             at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:567)
             at java.io.DataInputStream.readInt(DataInputStream.java:387)
             at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1849)
             at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1183)
             at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1079)
            2018-12-14 11:21:45,313 DEBUG ipc.Client: IPC Client (1006094903) connection to vm1/10.*.*.*:65110 from brahma: closed
            xkrogen Erik Krogen added a comment - - edited

            Hey brahmareddy, thanks for trying it out and for the detailed feedback!

            I think when we discuss a "request", we need to differentiate an RPC request originating from a Java application (MapReduce task, etc.) vs. a CLI request. The former will be the vast majority of operations on a typical cluster, so I would argue that optimizing for the performance and efficiency of that usage is much more important. The ObserverReadProxyProvider does have higher startup overheads as it directly polls for the state rather than just blindly trying its request; however, in an application which performs more than a few RPCs, this cost will be easily amortized away. I don't think it's fair to say that "write" performance is degraded simply because hdfs dfs -mkdirs takes longer; a benchmark running 100+ mkdirs would be a better measure IMO. If CLI performance is important, such clients can continue to use ConfiguredFailoverProxyProvider and communicate with the active directly.
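As a rough illustration of the amortization argument above (the class name, method, and all numbers below are invented for illustration, not measurements of HDFS):

```java
// Toy cost model: a one-time proxy-provider startup overhead amortized
// across N RPCs. All names and numbers are illustrative assumptions only.
class OrppCostSketch {
    // average per-RPC cost = (one-time startup cost + N * per-RPC cost) / N
    static double avgPerRpcMs(int nRpcs, double startupMs, double perRpcMs) {
        return (startupMs + nRpcs * perRpcMs) / nRpcs;
    }
}
```

With an assumed 2000 ms startup and 2 ms per RPC, a single-RPC CLI invocation averages 2002 ms per call, while an application issuing 10,000 RPCs averages 2.2 ms: the startup cost vanishes at scale.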

            The timeout you have shared is interesting. I suspect that it may be caused by the Observer trying to wait for its state to catch up to the stateID requested by your getFileInfo. I have a few questions:

            1. Are you running with HDFS-13873? With this patch (only committed yesterday so I doubt you have it) the exception thrown should be more meaningful.
            2. Did you remember to enable in-progress edit log tailing?
            3. Was this run on an almost completely stagnant cluster (no other writes)? This can make the ANN flush its edits to the JNs less frequently, increasing the lag time between ANN and Observer.
            brahmareddy Brahma Reddy Battula added a comment - - edited

            I think when we discuss a "request", we need to differentiate an RPC request originating from a Java application (MapReduce task, etc.) vs. a CLI request. The former will be the vast majority of operations on a typical cluster, so I would argue that optimizing for the performance and efficiency of that usage is much more important.

            Agree, I should have mentioned CLI. But the getHAServiceState() call from ORP took 2s+, as I mentioned above. By the way, my intent was: when reads and writes are combined in a single application, how much impact will there be, since it needs to switch?

            Just out of curiosity, do we have write benchmarks with and without ORP? I didn't find any in HDFS-14058 and HDFS-14059.

            1.Are you running with HDFS-13873? With this patch (only committed yesterday so I doubt you have it) the exception thrown should be more meaningful.

            Yes, with the latest HDFS-12943 branch.

            2.Did you remember to enable in-progress edit log tailing?

            Yes, enabled for all three NNs.

            3.Was this run on an almost completely stagnant cluster (no other writes)? This can make the ANN flush its edits to the JNs less frequently, increasing the lag time between ANN and Observer.

            Yes, no other writes.

             
            Tried the following test with and without ORF, and found that the performance impact depends on the edit-tailing period ("dfs.ha.tail-edits.period"), which defaults to 1m (in tests, it's 100ms):

            @Test
            public void testSimpleRead() throws Exception {
              long avg = 0;
              long avgL = 0;
              long avgC = 0;
              int num = 100;
              for (int i = 0; i < num; i++) {
                Path testPath1 = new Path(testPath, "test1" + i);
                long startTime = System.currentTimeMillis();
                assertTrue(dfs.mkdirs(testPath1, FsPermission.getDefault()));
                long mkdirMs = System.currentTimeMillis() - startTime;
                System.out.println("time taken mkdirs: " + i + " : " + mkdirMs);
                avg += mkdirMs;
                assertSentTo(0);
                long startTime2 = System.currentTimeMillis();
                dfs.getContentSummary(testPath1);
                long contentMs = System.currentTimeMillis() - startTime2;
                System.out.println("time taken getContentSummary: " + i + " : " + contentMs);
                avgC += contentMs;
                assertSentTo(2);
                long startTime1 = System.currentTimeMillis();
                dfs.getFileStatus(testPath1);
                long statusMs = System.currentTimeMillis() - startTime1;
                System.out.println("time taken getFileStatus: " + i + " : " + statusMs);
                avgL += statusMs;
                assertSentTo(2);
              }
              System.out.println("AVG: mkdirs: " + avg / num + " getFileStatus: " + avgL / num + " getContentSummary: " + avgC / num);
            }

            IMO, configuring a small value (like 100ms) for reading in-progress edits puts load on the JournalNode until a log roll happens (2 mins by default), since it opens a stream to read the edits.

            Apart from the perf, I have the following queries:
            i) Did we try with the C/C++ client?
            ii) Are we planning separate metrics for observer reads (client side)? Applications like MapReduce might find them helpful for job counters.

             

            vagarychen Chen Liang added a comment - - edited

            Hi brahmareddy,

            Thanks for testing! The timeout issue seems interesting. To start with, some performance degradation from the CLI is expected, because the CLI initiates a new DFSClient for each command, and a fresh DFSClient has to get the status of the NameNodes every time; if the same DFSClient is reused, this would not be an issue. I have never seen the second-call issue. Here is output from our cluster (log output omitted), and I think you are right about lowering dfs.ha.tail-edits.period; we had similar numbers here:

            $time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF1
            real	0m2.254s
            user	0m3.608s
            sys	0m0.331s
            $time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF2
            real	0m2.159s
            user	0m3.855s
            sys	0m0.330s

            Curious: how many NNs did you have in the testing? And were there any errors in the NN logs?

            csun Chao Sun added a comment -

            I think we should document dfs.ha.tail-edits.period in the user guide - the default value is just too large for observer reads. Filed HDFS-14154.

            xkrogen Erik Krogen added a comment - - edited

            By the way, my intent was: when reads and writes are combined in a single application, how much impact will there be, since it needs to switch?

            There will only be potential performance impact when switching from writes (sent to Active) to reads (sent to Observer) since the client may need to wait some time for the state on the Observer to catch up. Experience when designing HDFS-13150 indicated that this delay time could be reduced to a few ms when properly tuned, which would make the delay of switching from Active to Observer negligible. See the design doc, especially Appendix A, for more details.
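The catch-up behavior described above can be sketched very roughly (this is purely illustrative, not the actual HDFS implementation; the class and method names here are invented):

```java
// Illustrative model of observer read consistency: a read carrying the
// client's last-seen state ID from the Active is served only once the
// observer's applied transaction ID has caught up to it.
// Names are hypothetical, not HDFS internals.
class ObserverStateSketch {
    private long appliedTxId; // last edit transaction applied on the observer

    ObserverStateSketch(long appliedTxId) {
        this.appliedTxId = appliedTxId;
    }

    // Simulates the observer tailing edits from the JournalNodes.
    void tailEditsUpTo(long txId) {
        if (txId > appliedTxId) {
            appliedTxId = txId;
        }
    }

    // A read must wait (modeled here as a boolean check) until the
    // observer has applied at least the client's last-seen transaction.
    boolean canServeRead(long clientSeenTxId) {
        return appliedTxId >= clientSeenTxId;
    }
}
```

The faster the observer tails edits, the shorter the window in which canServeRead would be false, which is why the tailing period dominates the switch-over delay.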

            Just out of curiosity, do we have write benchmarks with and without ORP? I didn't find any in HDFS-14058 and HDFS-14059.

            There are some preliminary performance numbers shared in my earlier comment in this thread. I'm not aware of any good benchmark numbers produced after finishing the feature, maybe csun can provide them?

            Tried the following test with and without ORF, and found that the performance impact depends on the edit-tailing period ("dfs.ha.tail-edits.period"), which defaults to 1m (in tests, it's 100ms):
            ...
            IMO, configuring a small value (like 100ms) for reading in-progress edits puts load on the JournalNode until a log roll happens (2 mins by default), since it opens a stream to read the edits.

            I think I now understand the issue that you were facing. To use this feature correctly, in addition to setting dfs.ha.tail-edits.in-progress to true, you should also set dfs.ha.tail-edits.period to a small value; in our case I think we use 0 or 1 ms. Your concern about heavier load in the JournalNode would have previously been valid, but with the completion of HDFS-13150 and dfs.ha.tail-edits.in-progress enabled, the Standby/Observer no longer creates a new stream to tail edits, instead polling for edits via RPC (and thus making use of connection keepalive). This greatly reduces the overheads involved with each iteration of edit tailing, enabling it to be done much more frequently. I created HDFS-14155 to track updating the documentation with this information.
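The tuning described above would look roughly like this in hdfs-site.xml (the property names come from the discussion above; the period value of 0ms follows the "we use 0 or 1 ms" note and is illustrative, not a universal recommendation):

```xml
<!-- Illustrative hdfs-site.xml fragment for fast edit tailing on the
     Standby/Observer; values are examples, not recommendations. -->
<property>
  <name>dfs.ha.tail-edits.in-progress</name>
  <value>true</value>
</property>
<property>
  <name>dfs.ha.tail-edits.period</name>
  <value>0ms</value>
</property>
```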

            i) Did we try with the C/C++ client?

            We haven't developed any support for these clients, no. They should continue to work on clusters with the Observer enabled but will not be able to take advantage of the new functionality.

            ii) Are we planning separate metrics for observer reads (client side)? Applications like MapReduce might find them helpful for job counters.

            There's no metrics like this on the client side at this time, we are relying on server-side metrics, but I agree that this could be a useful addition.

            xkrogen Erik Krogen added a comment -

            Whoops, took too long writing my comment. Thanks for also addressing the tail-edits period issue in the documentation, Chao. Will close mine as duplicate.


            brahmareddy Brahma Reddy Battula added a comment -

            Hi vagarychen

            I have never seen the second-call issue. Here is output from our cluster (log output omitted), and I think you are right about lowering dfs.ha.tail-edits.period; we had similar numbers here:

            You can see this issue if "dfs.ha.tail-edits.period" is left at the default value.

            Curious: how many NNs did you have in the testing? And were there any errors in the NN logs?

            1 ANN, 1 SBN, 1 Observer. No error logs from the NNs.

            Hi csun

            I think we should document dfs.ha.tail-edits.period in the user guide - the default value is just too large for observer reads. Filed HDFS-14154.

            Yes, thanks for reporting the same.

            Hi xkrogen

            Your concern about heavier load in the JournalNode would have previously been valid, but with the completion of HDFS-13150 and dfs.ha.tail-edits.in-progress enabled, the Standby/Observer no longer creates a new stream to tail edits, instead polling for edits via RPC (and thus making use of connection keepalive). This greatly reduces the overheads involved with each iteration of edit tailing, enabling it to be done much more frequently.

Yes, this is one of my concerns. I went through the fast path (HDFS-13150); thanks, it can improve things.

            I'm not aware of any good benchmark numbers produced after finishing the feature, maybe csun can provide them?

csun, can you provide them? I am sure this feature is going to be a great advantage for the RPC workload on the ANN; I just want to know the write benchmarks as well (as getHAServiceState() and fast edit tailing are introduced). Sorry for pitching in very late.

            csun Chao Sun added a comment -

brahmareddy xkrogen: unfortunately I can't provide enough data points on this. In our production we deployed a slightly different version than upstream: the observer hosts are fixed in the config, so no getHAServiceState is issued (on the downside, the observer cannot participate in failover). I do intend to run some benchmarks with the latest upstream code, though. Perhaps I will update later.

            vagarychen Chen Liang added a comment -

            Hi brahmareddy

            you can see this issue if "dfs.ha.tail-edits.period" is default value.

Yes, with the default period of 1 min, any read can take up to 1 min to finish; this is not specific to the "second" call as you were mentioning, but applies to any read. I agree that we need to lower this value. In our environment we have already set it to 100ms, and with this setting I have never seen the issue of the second call always timing out as you mentioned, nor getServiceState taking 2 seconds. I was under the impression that you still had the timeout even with it set to 100ms?
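For reference, the edit-tailing tuning discussed above can be sketched as an hdfs-site.xml fragment on the Standby/Observer NameNodes. This is a sketch only: both property names appear in this thread, and the 100ms period mirrors the value mentioned here, but the right value depends on your JournalNode load.

```xml
<!-- Sketch only: enable fast edit tailing for Observer reads.
     dfs.ha.tail-edits.in-progress lets the Standby/Observer poll
     in-progress edit segments from the JournalNodes over RPC
     (see HDFS-13150) instead of opening a new stream each time.
     dfs.ha.tail-edits.period lowers the tailing interval from the
     default (1 min), which is too large for Observer reads. -->
<configuration>
  <property>
    <name>dfs.ha.tail-edits.in-progress</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.ha.tail-edits.period</name>
    <value>100ms</value>
  </property>
</configuration>
```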

            vagarychen Chen Liang added a comment -

            Hi brahmareddy,

            Some more notes to add:
1. getHAServiceState() is only called when client proxies are initialized (and, of course, when existing proxies fail and the client reinitializes them). In regular operation this call will not happen, so it should not be a concern in benchmarks.
2. I tried the unit test you shared locally with Observer reads enabled/disabled. I did not see a difference in mkdir time; it was about 2ms throughout regardless. I did see some degradation on getContentSummary, though. But this is because the unit test is doing mkdir -> getContentSummary -> getFileStatus -> repeat, so the client is constantly switching between writes and reads, and thus constantly switching between proxies (NNs). This is not the I/O pattern Observer reads mainly target, and it is probably the worst case for Observer reads, because every single getContentSummary call here could potentially trigger an Observer catch-up wait.
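The proxy switching described above is driven by the client-side ObserverReadProxyProvider mentioned in the release notes. A hedged sketch of the client configuration, where "mycluster" is a hypothetical nameservice name to substitute with your own:

```xml
<!-- Sketch only: route client reads to Observers.
     "mycluster" is a hypothetical nameservice name.
     ObserverReadProxyProvider sends read requests to an Observer and
     write requests to the Active, which is why a workload alternating
     reads and writes keeps switching proxies as described above. -->
<configuration>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider</value>
  </property>
</configuration>
```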

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 22s Docker mode activated.
                  Prechecks
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 26 new or modified test files.
                  trunk Compile Tests
            0 mvndep 1m 0s Maven dependency ordering for branch
            +1 mvninstall 18m 55s trunk passed
            +1 compile 14m 45s trunk passed
            +1 checkstyle 3m 19s trunk passed
            +1 mvnsite 4m 38s trunk passed
            +1 shadedclient 19m 3s branch has no errors when building and testing our client artifacts.
            0 findbugs 0m 0s Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
            +1 findbugs 7m 53s trunk passed
            +1 javadoc 3m 51s trunk passed
                  Patch Compile Tests
            0 mvndep 0m 23s Maven dependency ordering for patch
            +1 mvninstall 3m 56s the patch passed
            +1 compile 14m 52s the patch passed
            +1 cc 14m 52s the patch passed
            -1 javac 14m 52s root generated 196 new + 1294 unchanged - 196 fixed = 1490 total (was 1490)
            -0 checkstyle 3m 50s root: The patch generated 29 new + 2555 unchanged - 10 fixed = 2584 total (was 2565)
            +1 mvnsite 4m 53s the patch passed
            -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
            +1 xml 0m 2s The patch has no ill-formed XML file.
            +1 shadedclient 10m 50s patch has no errors when building and testing our client artifacts.
            0 findbugs 0m 0s Skipped patched modules with no Java source: hadoop-hdfs-project/hadoop-hdfs-native-client
            +1 findbugs 8m 24s the patch passed
            +1 javadoc 3m 48s the patch passed
                  Other Tests
            +1 unit 8m 26s hadoop-common in the patch passed.
            +1 unit 1m 51s hadoop-hdfs-client in the patch passed.
            -1 unit 75m 2s hadoop-hdfs in the patch failed.
            +1 unit 6m 14s hadoop-hdfs-native-client in the patch passed.
            +1 unit 17m 35s hadoop-hdfs-rbf in the patch passed.
            -1 unit 87m 42s hadoop-yarn-server-resourcemanager in the patch failed.
            +1 asflicense 0m 42s The patch does not generate ASF License warnings.
            317m 31s



            Reason Tests
            Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
              hadoop.hdfs.server.datanode.TestDirectoryScanner
              hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap
              hadoop.hdfs.server.namenode.TestNestedEncryptionZones



            Subsystem Report/Notes
            Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f
            JIRA Issue HDFS-12943
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12952748/HDFS-12943-003.patch
            Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc xml
            uname Linux 2f96ecadf91b 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/patchprocess/precommit/personality/provided.sh
            git revision trunk / f82922d
            maven version: Apache Maven 3.3.9
            Default Java 1.8.0_181
            findbugs v3.1.0-RC1
            javac https://builds.apache.org/job/PreCommit-HDFS-Build/25846/artifact/out/diff-compile-javac-root.txt
            checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/25846/artifact/out/diff-checkstyle-root.txt
            whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/25846/artifact/out/whitespace-eol.txt
            unit https://builds.apache.org/job/PreCommit-HDFS-Build/25846/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
            unit https://builds.apache.org/job/PreCommit-HDFS-Build/25846/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
            Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/25846/testReport/
            Max. process+thread count 3626 (vs. ulimit of 10000)
            modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-native-client hadoop-hdfs-project/hadoop-hdfs-rbf hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: .
            Console output https://builds.apache.org/job/PreCommit-HDFS-Build/25846/console
            Powered by Apache Yetus 0.8.0 http://yetus.apache.org

            This message was automatically generated.


shv Konstantin Shvachko added a comment -

I just merged the HDFS-12943 branch to trunk. Thank you everybody for contributing.
Will keep this open for the last few outstanding sub-tasks.
            xiangheng xiangheng added a comment -

There are still some issues that have not been solved, which may affect the consistency of standby reads. Can we test the performance of standby reads in a real cluster environment now? And what should we focus on?

            zhz Zhe Zhang added a comment - - edited

vagarychen has tested the current version of the feature on a real cluster and can verify the aspects that have already been verified. I think weichiu has also done some tests.


shv Konstantin Shvachko added a comment -

Closing this as Fixed. The feature has been tested, backported down to 2.10, and released. A few remaining subtasks are being addressed as regular issues.
Added release notes. Please review if I missed anything.

Thank you everybody for contributing to this effort.
            hudson Hudson added a comment -

            SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17592 (See https://builds.apache.org/job/Hadoop-trunk-Commit/17592/)
            Add 2.10.0 release notes for HDFS-12943 (jhung: rev ef9d12df24c0db76fd37a95551db7920d27d740c)

            • (edit) hadoop-common-project/hadoop-common/src/site/markdown/release/2.10.0/RELEASENOTES.2.10.0.md
            lindy_hopper zhangkai added a comment -

When a client uses getBlockLocations to access the observer node, the observer node fails to update the file's access time.

So we have disabled getBlockLocations for now.

Is there any other solution to deal with this?

            vagarychen Chen Liang added a comment -

lindy_hopper access time update is a write call, so it cannot be processed by the Observer. Access time should be turned off on the Observer, as mentioned in HDFS-14959.
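Turning off access time updates, as recommended here and in HDFS-14959, is commonly done by setting the NameNode's access-time precision to zero; a hedged sketch:

```xml
<!-- Sketch only: disable access time updates so getBlockLocations
     served by an Observer does not need to trigger an atime update
     (which is a write operation). A precision of 0 disables
     access times entirely. -->
<configuration>
  <property>
    <name>dfs.namenode.accesstime.precision</name>
    <value>0</value>
  </property>
</configuration>
```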


shv Konstantin Shvachko added a comment -

Hey lindy_hopper, yes, we currently recommend turning off access time updates on Observers, as vagarychen said.
We plan to bring aTime updates back with HDFS-15118. The Observer will bounce such getBlockLocations() calls to the Active, so that it can actually update the time.

            People

              shv Konstantin Shvachko
              shv Konstantin Shvachko
              Votes: 4
              Watchers: 87
              Time Spent: 21.5h