HBase / HBASE-22120

Replace HTrace with OpenTelemetry

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha-1
    • Fix Version/s: 3.0.0-alpha-1
    • Component/s: tracing
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      In this issue we change our tracing system from HTrace to OpenTelemetry.
      The HTrace dependencies are banned (transitive dependencies are still allowed, as Hadoop still depends on them), and imports of HTrace-related classes are also banned.
      We add OpenTelemetry support to our RPC system, which means all RPC methods are traced on both the client side and the server side.
      Most methods in the Table interface are also traced, except scan and coprocessor-related methods. Because the scan implementation is now always 'async prefetch', we have not yet found a suitable way to represent the relationship between the foreground and background spans.
      On the server side, for the same reason, we only use a span to record the time of the WAL sync operation, without tracing into the background sync thread.
      We also do not trace the next method of RegionScanner, as a single scan RPC call may lead to thousands of RegionScanner.next calls, which could slow down the RPC call even when tracing is disabled.
      For how to enable tracing, please read the Tracing section in our refguide.
      https://hbase.apache.org/book.html#tracing
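
      The WAL sync case in the note above can be pictured with a small sketch: the foreground caller opens one span, waits for the background thread to finish, and closes the span, so the span measures the wait without any tracing inside the background thread. This is only an illustration in plain Java. TimedSpan, WalSyncSketch, the "WAL.sync" span name, and the sleep that stands in for the I/O are all assumptions made for the example, not HBase or OpenTelemetry code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a tracing span: it records only a name and a duration.
final class TimedSpan {
  final String name;
  private final long startNanos = System.nanoTime();
  private long durationNanos = -1L; // stays -1 until end() is called

  TimedSpan(String name) {
    this.name = name;
  }

  void end() {
    durationNanos = System.nanoTime() - startNanos;
  }

  long durationNanos() {
    return durationNanos;
  }
}

final class WalSyncSketch {
  // Hypothetical background sync thread; nothing that runs on it is traced.
  private static final ExecutorService SYNC_THREAD =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sync-sketch");
        t.setDaemon(true); // do not keep the JVM alive
        return t;
      });

  // Hands the sync to the background thread; the sleep simulates the I/O.
  static CompletableFuture<Long> requestSync(long txid, long workMillis) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        Thread.sleep(workMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return txid;
    }, SYNC_THREAD);
  }

  // Foreground caller: a single span covers the whole wait for the sync result.
  static TimedSpan syncAndWait(long txid, long workMillis) {
    TimedSpan span = new TimedSpan("WAL.sync"); // span name is an assumption
    try {
      requestSync(txid, workMillis).join();
    } finally {
      span.end();
    }
    return span;
  }
}
```

      Calling WalSyncSketch.syncAndWait(42L, 20L) returns a span whose duration covers the time the caller spent blocked, which is the quantity the release note says is recorded.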

      Description

      Deprecate HTrace usage in HBase

      • HBase 1.x (branch-1)
        • Declare HTrace (htrace 3.x) deprecated in the user doc.
      • HBase 2.x (branch-2)
        • Declare HTrace deprecated in the user doc. Furthermore, state that it is known not to work.
        • Either fix the trace context propagation bug in HBase 2.x, or backport OpenTracing support from the master branch. I am inclined to the latter.
      • HBase 3.x (master branch)
        • Remove HTrace entirely.
        • Add OpenTracing APIs. Potentially backport to HBase 2.4.
        • Replace the OpenTracing API with OpenTelemetry when the latter stabilizes.
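
      Trace context propagation, mentioned in the list above, means the client's trace id and span id must travel with the request so that the server-side span joins the same trace. One widely used encoding is the W3C Trace Context traceparent string, which OpenTelemetry's default propagator emits. The sketch below encodes and decodes that string in plain Java. TraceParent is a hypothetical class written for this example, and how HBase actually carries the value inside an RPC request is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative value class; not an HBase or OpenTelemetry type.
final class TraceParent {
  // Version "00": 32 hex chars of trace id, 16 hex chars of span id, 2 hex chars of flags.
  private static final Pattern FORMAT =
      Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

  final String traceId;
  final String spanId;
  final boolean sampled;

  TraceParent(String traceId, String spanId, boolean sampled) {
    this.traceId = traceId;
    this.spanId = spanId;
    this.sampled = sampled;
  }

  // Client side: serialize the current context so it can be sent with the request.
  String encode() {
    return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
  }

  // Server side: returns null for a missing or malformed value, in which case
  // the server would start a new trace instead of joining the client's.
  static TraceParent decode(String header) {
    if (header == null) {
      return null;
    }
    Matcher m = FORMAT.matcher(header);
    if (!m.matches()) {
      return null;
    }
    String traceId = m.group(1);
    String spanId = m.group(2);
    // All-zero ids are defined as invalid by the W3C format.
    if (traceId.matches("0+") || spanId.matches("0+")) {
      return null;
    }
    boolean sampled = (Integer.parseInt(m.group(3), 16) & 1) != 0;
    return new TraceParent(traceId, spanId, sampled);
  }
}
```

      If the decoded value is dropped or replaced anywhere between client and server, the server spans end up in a separate trace, which is the kind of symptom a propagation bug produces.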

      Milestones

      1. Doc – deprecation notice
      2. Replace existing HTrace code with OpenTracing code in the master branch (3.x)
         1. Java (a PoC is currently under way)
         2. HBase shell and scripts (Ruby, shell script)
         3. Doc
      3. Add new trace instrumentation code for new features not instrumented by the existing HTrace code.
      4. Propagate the traces to other systems such as HDFS and MapReduce.
      5. Support other OpenTracing tracers.

      ======== Update ========
      As OpenTracing has now been replaced by OpenTelemetry, the goal has become replacing HTrace with OpenTelemetry.

          Sub-Tasks

          1. [OpenTracing] Declare HTrace is unusable in the user doc (Sub-task, Resolved, Wei-Chiu Chuang)
          2. [OpenTracing] Add OpenTracing dependency and helper methods (Sub-task, Resolved, Wei-Chiu Chuang)
          3. [OpenTracing] Migrate from HTrace to OpenTracing (Java code) (Sub-task, Resolved, Wei-Chiu Chuang)
          4. [OpenTracing] Propagate trace context to server correctly (Sub-task, Resolved, Wei-Chiu Chuang)
          5. [OpenTracing] Migrate hbase shell and scripts from HTrace to OpenTracing (Sub-task, Resolved, Wei-Chiu Chuang)
          6. Add documentation on how to enable and view tracing with OpenTelemetry (Sub-task, Resolved, Duo Zhang)
          7. [OpenTracing] Add shaded JaegerTracing tracer to hbase-thirdparty (Sub-task, Resolved, Wei-Chiu Chuang)
          8. [OpenTracing] Add traces in Procedure V2 (Sub-task, Resolved, Wei-Chiu Chuang)
          9. Add trace support for simple apis in async client (Sub-task, Resolved, Duo Zhang)
          10. Remove HTrace completely in code base and try to make use of OpenTelemetry (Sub-task, Resolved, Duo Zhang)
          11. Add trace support for async call in rpc client (Sub-task, Resolved, Duo Zhang)
          12. Find a way to config OpenTelemetry tracing without directly depending on opentelemetry-sdk (Sub-task, Resolved, Duo Zhang)
          13. Add trace support for connection registry (Sub-task, Resolved, Duo Zhang)
          14. Add trace support for HRegion read/write operation (Sub-task, Resolved, Duo Zhang)
          15. Add host and port attribute when tracing rpc call at client side (Sub-task, Resolved, Duo Zhang)
          16. Add trace support for WAL sync (Sub-task, Resolved, Duo Zhang)
          17. Set span kind to CLIENT in AbstractRpcClient (Sub-task, Resolved, Duo Zhang)
          18. Upgrade opentelemetry to 0.17.1 (Sub-task, Resolved, Duo Zhang)
          19. Upgrade opentelemetry to 1.0.0 (Sub-task, Resolved, Duo Zhang)
          20. Revisit the span names (Sub-task, Resolved, Duo Zhang)
          21. Basic benchmark to show the impact on performance for tracing (Sub-task, Resolved, Duo Zhang)
          22. Temporarily remove the trace support for RegionScanner.next (Sub-task, Resolved, Duo Zhang)
          23. Change the command line argument for tracing after upgrading opentelemetry to 1.0.0 (Sub-task, Resolved, Duo Zhang)
          24. Upgrade opentelemetry to 1.0.1 (Sub-task, Resolved, Duo Zhang)
          25. The tracing implementation for AsyncConnectionImpl.getHbck is incorrect (Sub-task, Resolved, Duo Zhang)

              People

              • Assignee: Duo Zhang (zhangduo)
              • Reporter: Sergey Shelukhin (sershe)
              • Votes: 1
              • Watchers: 35
