Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      I have a YCSB client using the REST API. My testing shows that scan performance with the REST API is much worse than with the Java client API. We need to look into it and find the root cause, whether it is a test issue or a REST API issue.

        Activity

        stack added a comment -

        Marking closed.

        Hudson added a comment -

        Integrated in HBase-TRUNK #3976 (See https://builds.apache.org/job/HBase-TRUNK/3976/)
        HBASE-7803 [REST] Support caching on scan (Revision 1458576)

        Result = FAILURE
        jxiang :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResource.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResultGenerator.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/model/ScannerModel.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/ScannerMessage.java
        • /hbase/trunk/hbase-server/src/main/resources/org/apache/hadoop/hbase/rest/protobuf/ScannerMessage.proto
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/rest/model/TestScannerModel.java
        Hudson added a comment -

        Integrated in hbase-0.95 #89 (See https://builds.apache.org/job/hbase-0.95/89/)
        HBASE-7803 [REST] Support caching on scan (Revision 1458577)

        Result = SUCCESS
        jxiang :
        Files :

        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResource.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResultGenerator.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/model/ScannerModel.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/ScannerMessage.java
        • /hbase/branches/0.95/hbase-server/src/main/resources/org/apache/hadoop/hbase/rest/protobuf/ScannerMessage.proto
        • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/rest/model/TestScannerModel.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #455 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/455/)
        HBASE-7803 [REST] Support caching on scan (Revision 1458576)

        Result = FAILURE
        jxiang :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResource.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResultGenerator.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/model/ScannerModel.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/ScannerMessage.java
        • /hbase/trunk/hbase-server/src/main/resources/org/apache/hadoop/hbase/rest/protobuf/ScannerMessage.proto
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/rest/model/TestScannerModel.java
        Hudson added a comment -

        Integrated in hbase-0.95-on-hadoop2 #34 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/34/)
        HBASE-7803 [REST] Support caching on scan (Revision 1458577)

        Result = FAILURE
        jxiang :
        Files :

        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResource.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/ScannerResultGenerator.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/model/ScannerModel.java
        • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/ScannerMessage.java
        • /hbase/branches/0.95/hbase-server/src/main/resources/org/apache/hadoop/hbase/rest/protobuf/ScannerMessage.proto
        • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/rest/model/TestScannerModel.java
        Jimmy Xiang added a comment -

        Changed the title to match the patch better.

        Integrated into 0.95 and trunk. Thanks Stack and Andy for the review.

        Andrew Purtell added a comment -

        +1

        stack added a comment -

        +1 on patch. Andrew Purtell, what do you think?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12574197/trunk-7803.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        -1 core tests. The patch failed these unit tests:

        -1 core zombie tests. There are 1 zombie test(s):

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4870//console

        This message is automatically generated.

        Jimmy Xiang added a comment -

        Batching means fewer REST HTTP round trips. Caching means fewer trips to region servers. Based on the results, it seems that missing either one is a performance killer, and HTTP overhead has more impact.
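
The trip counts behind this observation can be sketched with a simplified model. It is not HBase code: the class and method names are hypothetical, and it assumes that batch and caching each cap how many results one trip can carry, which glosses over how the REST server and the Scan object actually count results.

```java
// Simplified cost model, not HBase code. All names here are hypothetical.
final class ScanTripModel {
    private ScanTripModel() {}

    /** Round trips needed to move `items` results when one trip carries at most `perTrip`. */
    static long trips(long items, long perTrip) {
        if (items < 0 || perTrip <= 0) {
            throw new IllegalArgumentException("items must be >= 0 and perTrip must be > 0");
        }
        return (items + perTrip - 1) / perTrip; // ceiling division
    }

    public static void main(String[] args) {
        long results = 1000;
        // No batch, no caching: one HTTP request and one region server call per result.
        System.out.println("http=" + trips(results, 1) + " rpc=" + trips(results, 1));
        // Assumed batch of 100 per HTTP response and caching of 100 per region server call.
        System.out.println("http=" + trips(results, 100) + " rpc=" + trips(results, 100));
    }
}
```

Under these assumptions both trip counts shrink by the same factor, so the model says nothing about which kind of trip costs more; that comparison comes only from the measurements.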

        Jimmy Xiang added a comment -

        I did some testing on my 4-node cluster with YCSB, and here is the scan throughput I got with the REST API:

        With caching, and using batch: 8.83
        With caching, but no batch: 0.99
        No caching, but using batch: 1.85
        No caching, no batch: 0.68

        On the same cluster, using the HBase Java client API, the throughput I got is: 29.04
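
For reference, the relative differences between these figures can be worked out with plain arithmetic. The sketch below only divides the numbers quoted above; the class and constant names are hypothetical, and the comment does not state the throughput units.

```java
// Arithmetic on the figures quoted above. Class and constant names are hypothetical.
final class ScanThroughputRatios {
    private ScanThroughputRatios() {}

    // Throughput figures copied from the comment (units are not stated there).
    static final double REST_CACHING_BATCH = 8.83;
    static final double REST_CACHING_ONLY = 0.99;
    static final double REST_BATCH_ONLY = 1.85;
    static final double REST_NEITHER = 0.68;
    static final double JAVA_CLIENT = 29.04;

    /** How many times larger `a` is than `b`. */
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.printf("REST caching+batch vs REST neither: %.1fx%n",
                ratio(REST_CACHING_BATCH, REST_NEITHER));
        System.out.printf("REST batch only vs REST neither: %.1fx%n",
                ratio(REST_BATCH_ONLY, REST_NEITHER));
        System.out.printf("REST caching only vs REST neither: %.1fx%n",
                ratio(REST_CACHING_ONLY, REST_NEITHER));
        System.out.printf("Java client vs REST caching+batch: %.1fx%n",
                ratio(JAVA_CLIENT, REST_CACHING_BATCH));
        System.out.printf("Java client vs REST neither: %.1fx%n",
                ratio(JAVA_CLIENT, REST_NEITHER));
    }
}
```

By this arithmetic, enabling both caching and batch gives roughly a 13x gain over using neither, which still leaves the REST figure about 3.3x below the Java client figure.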

        Jimmy Xiang added a comment -

        Attached a patch to make REST support caching.

        Jimmy Xiang added a comment -

        I will find out.

        Andrew Purtell added a comment -

        "do we know why we setCacheBlocks to false all the time in ScannerResultGenerator"

        REST doesn't know if it should add cache pressure to the RS block cache or not.

        "we never setCaching to the Scan object in ScannerResultGenerator"

        Maybe a bug now, something that hasn't been examined, I think, since ~0.90. If you change this, does it improve the results in your tests?

        Jimmy Xiang added a comment -

        Two things:
        (1) do we know why we setCacheBlocks to false all the time in ScannerResultGenerator?
        (2) we never setCaching to the Scan object in ScannerResultGenerator?

        Andrew Purtell added a comment -

        Yes, 70x indicates something is amiss.

        Jimmy Xiang added a comment -

        Yes, each read is a separate HTTP request. I haven't done much profiling so far. I compared the throughput of the REST API and the Java client API. Scan throughput with the Java client API is around 70x that of the REST API. I am aware of the HTTP overheads, but the difference is so big that I was wondering if anything was wrong with my testing or setup.

        Andrew Purtell added a comment -

        I have also profiled the REST interface in the past and as you might expect the bulk of the runnable CPU time (as opposed to waiting for IO) is in converting from HBase client API results to the selected representation (XML, JSON, protobuf). I would be curious if you have profiled under your test load and where you observe the time going.

        Andrew Purtell added a comment -

        In the above, when considering wide vs tall, I mean as a characterization of client request behavior. Wide is many clients making bulk requests or a few requests per second each. Tall is perhaps a smaller number of clients making hundreds or thousands of requests per second each.

        Andrew Purtell added a comment - edited

        Have you considered HTTP overheads? YCSB throws a lot of ops at the target and I'd guess in your testing each read is a separate HTTP request. This was a topic that came up around the time of REST's introduction. REST is appropriate for bulk or infrequent ops ("wide" load) as opposed to hundreds or thousands of transactions per second ("tall" load). For tall load, use Thrift or the native API. REST does batching for scanners and supports bundling ops into a single HTTP transaction to amortize the HTTP overheads.

        Edit: Fix big thumb spelling errors.
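
As a rough illustration of scanner batching, a REST client asks for it when it creates the scanner, by sending a scanner specification to the table's scanner resource. The sketch below only builds such a specification as XML. The `batch` attribute follows the REST scanner documentation. The `caching` attribute is an assumption about what the ScannerModel change in this issue exposes. The class and method names are hypothetical.

```java
// Builds a scanner specification body for a REST client. Names are hypothetical.
final class ScannerSpec {
    private ScannerSpec() {}

    /**
     * Returns an XML scanner specification.
     * `batch` is the documented REST scanner attribute; `caching` is assumed
     * to be the attribute added to ScannerModel by this issue's patch.
     */
    static String toXml(int batch, int caching) {
        if (batch <= 0 || caching <= 0) {
            throw new IllegalArgumentException("batch and caching must be positive");
        }
        return "<Scanner batch=\"" + batch + "\" caching=\"" + caching + "\"/>";
    }

    public static void main(String[] args) {
        // A client would send this body when creating the scanner, then fetch
        // results from the scanner location returned by the server.
        System.out.println(toXml(100, 1000));
    }
}
```

The values 100 and 1000 are placeholders, not tuning advice; suitable sizes depend on row width and on the memory available to the REST server.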


          People

          • Assignee:
            Jimmy Xiang
            Reporter:
            Jimmy Xiang
          • Votes:
            0
            Watchers:
            6
