Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 3.1.0
    • 3.1.1
    • documentation, metrics
    • None
    • Reviewed

    Description

      The Hadoop Metrics page is confusing in a few places. For instance, there are duplicated entries such as FsImageLoadTime; some quantile metrics have no corresponding entries; and the descriptions of some quantile metrics do not specify what the num variable in the metric name means.

      This JIRA aims to improve the page.

      Attachments

        1. HDFS-13674.000.patch
          42 kB
          Chao Sun
        2. HDFS-13674.001.patch
          17 kB
          Chao Sun
        3. HDFS-13674.002.patch
          19 kB
          Chao Sun

        Activity

          linyiqun Yiqun Lin added a comment -

          Thanks csun for catching this. I'd be happy to review once you attach the patch.

          csun Chao Sun added a comment -

          Thanks linyiqun! I'd appreciate your review.

          csun Chao Sun added a comment -

          Sorry for the delay, linyiqun. I have submitted patch v0; it would be great if you could take a look.

          genericqa genericqa added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 27m 11s trunk passed
          +1 mvnsite 1m 2s trunk passed
          +1 shadedclient 39m 39s branch has no errors when building and testing our client artifacts.
                Patch Compile Tests
          +1 mvnsite 0m 55s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedclient 12m 0s patch has no errors when building and testing our client artifacts.
                Other Tests
          +1 asflicense 0m 23s The patch does not generate ASF License warnings.
          53m 39s



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd
          JIRA Issue HDFS-13674
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12930147/HDFS-13674.000.patch
          Optional Tests asflicense mvnsite
          uname Linux 623bd633ae36 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/patchprocess/precommit/personality/provided.sh
          git revision trunk / 51654a3
          maven version: Apache Maven 3.3.9
          Max. process+thread count 336 (vs. ulimit of 10000)
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/24549/console
          Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          linyiqun Yiqun Lin added a comment -

          Hi csun, I don't think splitting each quantile metric entry from one line into five is an improvement. Since the five entries are almost identical, we combine them in one line. I think that should be okay.

          -| `EditLogFetchTime`*num*`s(50/75/90/95/99)thPercentileLatency` | The 50/75/90/95/99th percentile of time spent in fetching edit streams from journal nodes by standby NameNode, in milliseconds. Percentile measurement is off by default, by watching no intervals. The intervals are specified by `dfs.metrics.percentiles.intervals`. |
          +| `EditLogFetchTime`*num*`s50thPercentileLatency` | The 50th percentile of time spent in fetching edit streams from journal nodes by standby NameNode in milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          +| `EditLogFetchTime`*num*`s75thPercentileLatency` | The 75th percentile of time spent in fetching edit streams from journal nodes by standby NameNode in milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          +| `EditLogFetchTime`*num*`s90thPercentileLatency` | The 90th percentile of time spent in fetching edit streams from journal nodes by standby NameNode in milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          +| `EditLogFetchTime`*num*`s95thPercentileLatency` | The 95th percentile of time spent in fetching edit streams from journal nodes by standby NameNode in milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          +| `EditLogFetchTime`*num*`s99thPercentileLatency` | The 99th percentile of time spent in fetching edit streams from journal nodes by standby NameNode in milliseconds (*num* seconds granularity) if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          

          The other changes look good to me.
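
          For reference, the combined one-line entry `EditLogFetchTime`*num*`s(50/75/90/95/99)thPercentileLatency` stands for five metric names per interval. The following is only a minimal sketch in plain Java of how that shorthand expands; it has no Hadoop dependencies, the class and method names are hypothetical, and the interval value 60 is an assumed example rather than anything taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class PercentileMetricNames {
    // Percentiles listed in the combined documentation entry.
    private static final int[] PERCENTILES = {50, 75, 90, 95, 99};

    /** Builds prefix + num + "s" + P + "thPercentileLatency" for each percentile P. */
    static List<String> expand(String prefix, int numSeconds) {
        List<String> names = new ArrayList<>();
        for (int p : PERCENTILES) {
            names.add(prefix + numSeconds + "s" + p + "thPercentileLatency");
        }
        return names;
    }

    public static void main(String[] args) {
        // 60 is an assumed interval (in seconds) used purely for illustration.
        for (String name : expand("EditLogFetchTime", 60)) {
            System.out.println(name);
        }
    }
}
```

          With an interval of 60 the first generated name is `EditLogFetchTime60s50thPercentileLatency`, which is why a single documentation row can cover all five percentiles.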

          csun Chao Sun added a comment -

          linyiqun Thanks for the review. My intention was to make the doc more uniform, as I saw both one-line and five-line formats for the percentile metrics. I will submit another patch.

          genericqa genericqa added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 28m 7s trunk passed
          +1 mvnsite 1m 4s trunk passed
          +1 shadedclient 40m 12s branch has no errors when building and testing our client artifacts.
                Patch Compile Tests
          +1 mvnsite 1m 0s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedclient 12m 14s patch has no errors when building and testing our client artifacts.
                Other Tests
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          54m 35s



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd
          JIRA Issue HDFS-13674
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12930459/HDFS-13674.001.patch
          Optional Tests asflicense mvnsite
          uname Linux 41a4c907f7d0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/patchprocess/precommit/personality/provided.sh
          git revision trunk / 39ad989
          maven version: Apache Maven 3.3.9
          Max. process+thread count 301 (vs. ulimit of 10000)
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/24562/console
          Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          linyiqun Yiqun Lin added a comment -

          csun, thanks for updating the patch.

          ....if `rpc.metrics.quantile.enable` is set to true. *num* is specified by `rpc.metrics.percentiles.intervals`. |
          

          The configs rpc.metrics.quantile.enable and rpc.metrics.percentiles.intervals make sense for the aggregate RPC metrics in COMMON. For the NameNode metrics you updated in the patch, the following description is the correct one.

          Percentile measurement is off by default, by watching no intervals. The intervals are specified by `dfs.metrics.percentiles.intervals`. |
          

          Related code:

            public static NameNodeMetrics create(Configuration conf, NamenodeRole r) {
              String sessionId = conf.get(DFSConfigKeys.DFS_METRICS_SESSION_ID_KEY);
              String processName = r.toString();
              MetricsSystem ms = DefaultMetricsSystem.instance();
              JvmMetrics jm = JvmMetrics.create(processName, sessionId, ms);
              
              // Percentile measurement is off by default, by watching no intervals
              int[] intervals = 
                  conf.getInts(DFSConfigKeys.DFS_METRICS_PERCENTILES_INTERVALS_KEY);
              return ms.register(new NameNodeMetrics(processName, sessionId,
                  intervals, jm));
            }
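
          To illustrate why "watching no intervals" means percentile measurement is off by default, here is a minimal sketch in plain Java. It is a simplified stand-in for reading a comma-separated interval list, not Hadoop's `Configuration.getInts` implementation; the class name, the method name, and the sample value "60,300" are assumptions made for illustration only.

```java
import java.util.Arrays;

public class PercentileIntervals {
    /**
     * Simplified stand-in for reading a comma-separated interval list.
     * A null or blank value yields an empty array, i.e. no intervals are watched.
     */
    static int[] parseIntervals(String value) {
        if (value == null || value.trim().isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        // Property not set: no intervals, so no percentile metrics would be created.
        System.out.println(parseIntervals(null).length);
        // Hypothetical setting "60,300": two intervals, in seconds.
        System.out.println(Arrays.toString(parseIntervals("60,300")));
    }
}
```

          Each interval in the resulting array corresponds to one value of *num* in the metric names, so an empty array produces no percentile entries at all.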
          
          csun Chao Sun added a comment -

          Oops good catch linyiqun! I'll fix this and upload a new patch.

          genericqa genericqa added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 40s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
                trunk Compile Tests
          +1 mvninstall 34m 5s trunk passed
          +1 mvnsite 1m 26s trunk passed
          +1 shadedclient 48m 52s branch has no errors when building and testing our client artifacts.
                Patch Compile Tests
          +1 mvnsite 1m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedclient 14m 59s patch has no errors when building and testing our client artifacts.
                Other Tests
          +1 asflicense 0m 30s The patch does not generate ASF License warnings.
          67m 2s



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd
          JIRA Issue HDFS-13674
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12930559/HDFS-13674.002.patch
          Optional Tests asflicense mvnsite
          uname Linux 141e99d786a5 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/patchprocess/precommit/personality/provided.sh
          git revision trunk / 39ad989
          maven version: Apache Maven 3.3.9
          Max. process+thread count 301 (vs. ulimit of 10000)
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/24568/console
          Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          linyiqun Yiqun Lin added a comment -

          LGTM, +1.

          linyiqun Yiqun Lin added a comment -

          Committed this to trunk. Thanks csun for the contribution.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14537 (See https://builds.apache.org/job/Hadoop-trunk-Commit/14537/)
          HDFS-13674. Improve documentation on Metrics. Contributed by Chao Sun. (yqlin: rev 7a68ac607c52c8a28dcd75a367ae77331787a3b4)

          • (edit) hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md
          csun Chao Sun added a comment -

          Thanks linyiqun for the review


          People

            csun Chao Sun
            csun Chao Sun
            Votes:
            0
            Watchers:
            4

            Dates

              Created:
              Updated:
              Resolved: