HBase / HBASE-10007

PerformanceEvaluation: Add sampling and latency collection to randomRead test

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.96.1, 0.94.15
    • Component/s: Performance, test
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      As I mentioned over on HBASE-9940, I'd like randomRead to operate on only a sample of the total dataset. It would also be useful to collect latency measurements from individual responses. Throughput is aggregated according to the amount of user data processed, and that result is reported as well. This is a patch I've been using for some performance tests I've run; maybe it'll be useful to someone else.
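
      For readers who want a feel for the idea without opening the patches, here is a minimal sketch of sampled reads with per-request latency collection. It is not the code from the attached patches: the class name SampledLatency, the sampleRate parameter, and the readRow callback (a stand-in for an HBase Get) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.LongConsumer;

/** Hypothetical sketch; not the code from the attached patches. */
public class SampledLatency {

  /**
   * Issues a read for roughly sampleRate * totalRows randomly chosen rows
   * and returns the latency of each read in microseconds.
   */
  static double[] run(long totalRows, double sampleRate, long seed, LongConsumer readRow) {
    int reads = (int) Math.max(1, Math.round(totalRows * sampleRate));
    double[] latenciesUs = new double[reads];
    Random rng = new Random(seed);
    for (int i = 0; i < reads; i++) {
      long row = (long) (rng.nextDouble() * totalRows); // uniform in [0, totalRows)
      long start = System.nanoTime();
      readRow.accept(row); // assumption: stands in for the actual read
      latenciesUs[i] = (System.nanoTime() - start) / 1000.0;
    }
    return latenciesUs;
  }

  /** Nearest-rank percentile over a sorted array; p is in (0, 100]. */
  static double percentile(double[] sorted, double p) {
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }

  public static void main(String[] args) {
    // Read about 1% of a one-million-row table; the callback does nothing here.
    double[] lat = run(1_000_000L, 0.01, 42L, row -> { });
    Arrays.sort(lat);
    System.out.printf("reads=%d p50=%.1fus p99=%.1fus max=%.1fus%n",
        lat.length, percentile(lat, 50), percentile(lat, 99), lat[lat.length - 1]);
  }
}
```

      Keeping every latency in an array makes percentiles trivial to compute after the run, at the cost of memory that grows with the number of reads.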

      1. HBASE-10007-0.96.02.patch (27 kB, Nick Dimiduk)
      2. HBASE-10007-0.96.01.patch (26 kB, Nick Dimiduk)
      3. HBASE-10007-0.96.00.patch (26 kB, Nick Dimiduk)
      4. HBASE-10007-0.94.00.patch (23 kB, Nick Dimiduk)
      5. HBASE-10007.01.patch (29 kB, Nick Dimiduk)
      6. HBASE-10007.00.patch (28 kB, Nick Dimiduk)

        Activity

        Nick Dimiduk added a comment -

        Here's my patch for 0.96. Will clean it up for trunk as well.

        Lars Hofhansl, Jean-Marc Spaggiari are these changes you'd like in 0.94 as well?

        Nick Dimiduk added a comment -

        Updating the 0.96 patch with additional changes made while preparing the trunk patch, primarily around access modifiers of static constants. These appear to have become public API on trunk so this maintains consistency.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12614760/HBASE-10007-0.96.01.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7939//console

        This message is automatically generated.

        Jean-Marc Spaggiari added a comment -

        I think it might be nice to have it in 0.94 too. That would allow us to run some comparisons and validate the impact of optimizations on the two branches...

        Andrew Purtell added a comment -

        Might be overkill but have you thought about reservoir sampling? Could hold down the total size of that array of floats for really large/long tests.
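
        For context, reservoir sampling keeps a fixed-size uniform sample of a stream whose length is not known in advance. The sketch below uses Algorithm R; the class LatencyReservoir and its methods are hypothetical and are not part of the attached patches. Because a uniform sample can easily miss the rarest, slowest responses, this sketch also tracks the maximum exactly, which is an addition made here for illustration only.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of Algorithm R; not part of the attached patches. */
public class LatencyReservoir {
  private final float[] samples;
  private final Random rng;
  private long seen = 0;
  private float max = Float.NEGATIVE_INFINITY;

  LatencyReservoir(int capacity, long seed) {
    this.samples = new float[capacity];
    this.rng = new Random(seed);
  }

  /** Offers one latency; every value offered so far stays with probability capacity/seen. */
  void add(float latency) {
    seen++;
    max = Math.max(max, latency); // extremes are tracked outside the sample
    if (seen <= samples.length) {
      samples[(int) (seen - 1)] = latency; // fill phase
    } else {
      long j = (long) (rng.nextDouble() * seen); // uniform in [0, seen)
      if (j < samples.length) {
        samples[(int) j] = latency; // replace a random slot
      }
    }
  }

  int size() { return (int) Math.min(seen, samples.length); }

  long seen() { return seen; }

  float max() { return max; }

  /** Copy of the values currently held in the reservoir. */
  float[] snapshot() { return Arrays.copyOf(samples, size()); }

  public static void main(String[] args) {
    LatencyReservoir r = new LatencyReservoir(1024, 42L);
    for (int i = 0; i < 1_000_000; i++) {
      r.add(i % 1000); // synthetic latencies
    }
    System.out.printf("seen=%d kept=%d max=%.1f%n", r.seen(), r.size(), r.max());
  }
}
```

        Memory stays bounded by the reservoir capacity no matter how long the test runs; percentiles computed from the snapshot are estimates rather than exact values.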

        Nick Dimiduk added a comment -

        Okay, Jean-Marc Spaggiari, I'll post a patch for 0.94 as well. We can address bringing 0.94's version up to feature parity with 0.96/trunk in a separate ticket.

        Yes, Andrew Purtell, I considered that. I'm not a statistician, so I don't know if that approach would be sufficient to keep tabs on the extremes. I'm also interested in pursuing the technique employed by Gil's LatencyUtils project. Unless someone has immediate advice on the sampling implementation, I'd prefer to implement this as a separate ticket.

        Nick Dimiduk added a comment -

        New patch for 0.94, updated patches for 0.96 and trunk. The latter two are just whitespace cleanup. I've taken all three for a spin on a pseudo-distributed hadoop + local hbase in both mapreduce and threaded variations.

        Andrew Purtell added a comment -

        Skimmed the trunk patch, +1

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12615026/HBASE-10007-0.94.00.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7956//console

        This message is automatically generated.

        Nick Dimiduk added a comment -

        Thanks for having a look Jean-Marc Spaggiari, Andrew Purtell.

        I'll commit this afternoon unless anyone speaks up.

        Nicolas Liochon added a comment -

        Committed on behalf of Nick.

        Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94-security #345 (See https://builds.apache.org/job/HBase-0.94-security/345/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544685)

        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-0.94 #1211 (See https://builds.apache.org/job/HBase-0.94/1211/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544685)

        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96 #200 (See https://builds.apache.org/job/hbase-0.96/200/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544683)

        • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK #4693 (See https://builds.apache.org/job/HBase-TRUNK/4693/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544681)

        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96-hadoop2 #128 (See https://builds.apache.org/job/hbase-0.96-hadoop2/128/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544683)

        • /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #848 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/848/)
        HBASE-10007 PerformanceEvaluation: Add sampling and latency collection to randomRead test (Nick Dimiduk) (nkeywal: rev 1544681)

        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
        stack added a comment -

        Released in 0.96.1. Issue closed.


          People

          • Assignee: Nick Dimiduk
          • Reporter: Nick Dimiduk
          • Votes: 0
          • Watchers: 7
