HBASE-1177

Delay when client is located on the same node as the regionserver

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.19.0
    • Fix Version/s: None
    • Component/s: Performance
    • Labels:
    • Environment:

      Linux 2.6.25 x86_64

    • Tags:
      noob

      Description

      During testing of HBASE-80, we uncovered a strange 40ms delay for random reads. We ran a series of tests and found that it only happens when the client is on the same node as the RS and for a certain range of payloads (not specifically related to number of columns or size of them, only total payload). It appears to be precisely 40ms every time.

      Unsure if this is particular to our architecture, but it does happen on all nodes we've tried. The issue completely goes away with very large payloads or when the client is moved.

      Will post a test program tomorrow if anyone can test on a different architecture.

      Making this a blocker for 0.20. Since this happens when you have an MR task running local to the RS, and that is what we try to do, we might also consider making it a blocker for 0.19.1.

      1. zoom of columns vs round-trip blowup.jpg
        94 kB
        Jim Kellerman
      2. screenshot-4.jpg
        132 kB
        Jim Kellerman
      3. screenshot-3.jpg
        82 kB
        Jonathan Gray
      4. screenshot-2.jpg
        119 kB
        Jim Kellerman
      5. screenshot-1.jpg
        141 kB
        Jim Kellerman
      6. RowResults.readFields zoomed.jpg
        101 kB
        Jim Kellerman
      7. ReadDelayTest.java
        4 kB
        Jonathan Gray
      8. getRow times.jpg
        130 kB
        Jim Kellerman
      9. getRow + round-trip vs # columns.jpg
        111 kB
        Jim Kellerman
      10. Elapsed time of RowResults.readFields.jpg
        105 kB
        Jim Kellerman
      11. Contribution of seekTo to getClosest time.jpg
        139 kB
        Jim Kellerman
      12. Contribution of next to getRow time.jpg
        128 kB
        Jim Kellerman
      13. Contribution of getClosest to getRow time.jpg
        166 kB
        Jim Kellerman

        Issue Links

          Activity

          Jonathan Gray added a comment -

          This is not meant for contribution; I have something much more generic if we want to add something about this to the tests.

          The test creates a table, imports rows, and performs random reads for two differently sized rows. The 7-column rows are fetched quickly; the 8-column rows show the 40ms delay behavior.

          > java RandomReadTest
          09/02/02 20:54:53 INFO client.HBaseAdmin: Enabled table test_table
          Read 1 row with 7 columns 100 times in 66ms
          Read 1 row with 8 columns 100 times in 4001ms
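
          Roughly, the timing loop amounts to the following sketch. This is not the attached ReadDelayTest.java; it assumes the 0.19-era client API (HTable.getRow returning a RowResult, BatchUpdate for writes), assumes the table already exists with a family named "cf:", and the table name, column names, and row keys are illustrative only.

          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.client.HTable;
          import org.apache.hadoop.hbase.io.BatchUpdate;

          public class RandomReadSketch {
            public static void main(String[] args) throws Exception {
              // Assumes "test_table" already exists with a column family "cf:".
              HTable table = new HTable(new HBaseConfiguration(), "test_table");

              for (int columns : new int[] { 7, 8, 1000 }) {
                // Write one row with the given number of columns.
                String row = "row_" + columns;
                BatchUpdate update = new BatchUpdate(row);
                for (int c = 0; c < columns; c++) {
                  update.put("cf:col" + c, ("value" + c).getBytes());
                }
                table.commit(update);

                // Time 100 full-row random reads of that row.
                long start = System.currentTimeMillis();
                for (int i = 0; i < 100; i++) {
                  table.getRow(row);   // result discarded; we only care about latency
                }
                long elapsed = System.currentTimeMillis() - start;
                System.out.println("Read 1 row with " + columns
                    + " columns 100 times in " + elapsed + "ms");
              }
            }
          }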
          
          Jonathan Gray added a comment -

          Removed dependencies.

          Jonathan Gray added a comment -

          New version shows larger payload going faster.

          09/02/03 08:26:07 INFO client.HBaseAdmin: Enabled table test_table2
          Read 1 row with 7 columns 100 times in 59ms
          Read 1 row with 8 columns 100 times in 4001ms
          Read 1 row with 1000 columns 100 times in 1161ms

          Jean-Daniel Cryans added a comment -

          On my rig:

          jdcryans@jdcryans-lasi:~/svn/hbase/trunk$ ./bin/hbase ReadDelayTest
          09/02/03 11:22:36 INFO zookeeper.ZooKeeperWrapper: Quorum servers: localhost:2181
          09/02/03 11:22:42 INFO client.HBaseAdmin: Enabled table test_table
          Read 1 row with 7 columns 100 times in 78ms
          Read 1 row with 8 columns 100 times in 4000ms
          
          Jim Kellerman added a comment -

          On our cluster:

          Same machine as server hosting table
          
          Read 1 row with 7 columns 100 times in 206ms
          Read 1 row with 8 columns 100 times in 4018ms
          Read 1 row with 1000 columns 100 times in 13392ms
          
          Different machine as server hosting table
          
          Read 1 row with 7 columns 100 times in 152ms
          Read 1 row with 8 columns 100 times in 168ms
          Read 1 row with 1000 columns 100 times in 9386ms
          
          Jim Kellerman added a comment -

          When I did not deploy a region server on the same machine as the namenode and master, I got much better times:

          When run on node hosting region that is not the same as namenode or master:
          
          Read 1 row with 7 columns 100 times in 113ms
          Read 1 row with 8 columns 100 times in 4057ms
          Read 1 row with 1000 columns 100 times in 10570ms
          
          When run on node not hosting region and is not namenode or master:
          
          Read 1 row with 7 columns 100 times in 109ms
          Read 1 row with 8 columns 100 times in 121ms
          Read 1 row with 1000 columns 100 times in 8838ms
          

          So running an application on the same server as the namenode, master, and region server definitely slows things down.

          There is basically no difference in the test where we read 1 row with 7 columns.

          Why the 8-column and 1000-column reads are still slower when the application is run on a different machine than the one hosting the region is still a mystery.

          Will continue investigating...

          Jim Kellerman added a comment -

          When the application runs on the same host as the one serving the table, there is a lot more context switching going on which accounts for the differences in times. See below:

          Application running on a host different from the one serving the table:
          
          Read 1 row with 7 columns 100 times in 108ms
          Read 1 row with 8 columns 100 times in 111ms
          Read 1 row with 1000 columns 100 times in 8995ms
          
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           0  1    112 182440  76356 6429056    0    0   282  1955    7    6  4  3 86  6  0
           0  0    112 182532  76368 6429112    0    0    16    40   25  335  0  0 100  0  0
           0  0    112 182532  76368 6429112    0    0     0     0   15  312  0  0 100  0  0
           0  0    112 182656  76368 6429112    0    0     0     0   16  330  0  0 100  0  0
           0  0    112 182896  76368 6429112    0    0     0     0   16  305  0  0 100  0  0
           0  0    112 182896  76376 6429112    0    0     0    40   28  348  0  0 100  0  0
           0  0    112 182896  76380 6429112    0    0     0    36   28  341  0  0 100  0  0
           1  0    112 177516  76380 6429144    0    0     0     0   17  511  1  1 98  0  0
           0  0    112 153648  76380 6429144    0    0     0     0  128 1164 33  2 65  0  0
           0  0    112 153396  76380 6429152    0    0     0     5   45  694  1  0 99  0  0
           0  0    112 153396  76380 6429152    0    0     0    44   18  342  0  0 100  0  0
           2  0    112 153388  76388 6429152    0    0     0    16   29  480  0  0 100  0  0
           0  0    112 153388  76392 6429152    0    0     0    64   27  380  0  0 100  0  0
           0  0    112 153132  76392 6429152    0    0     0     1   29  487  1  0 99  0  0
           0  0    112 153008  76392 6429152    0    0     0     0   10  492  0  0 100  0  0
           0  0    112 141904  76392 6429992    0    0     0    25 1901 4647 21  2 77  0  0
           0  0    112 135344  76392 6429992    0    0     0     0 1693  528  4  2 94  0  0
           0  0    112 131084  76404 6429992    0    0     0    48 1886  411  1  3 96  0  0
           0  0    112 119920  76404 6429992    0    0     0     0 1769 2398  4  1 95  0  0
           0  0    112 117428  76408 6429992    0    0     0    24 1972  461  2  1 97  0  0
           0  0    112 117300  76408 6429992    0    0     0    12 1810  386  1  1 97  0  0
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           2  0    112 108104  76408 6429992    0    0     0     0 1944  445  6  1 92  0  0
           0  0    112  83452  76408 6429992    0    0     0     0 1746  456 16  2 82  0  0
           0  0    112  83796  76408 6429992    0    0     0     0 1492  448  0  2 98  0  0
           0  0    112  83608  76420 6429992    0    0     0    57  897  411  0  1 99  0  0
           0  0    112 180416  76420 6429960    0    0     0     0   30  414  0  0 100  0  0
           0  0    112 180340  76420 6429960    0    0     0     0   20  326  0  0 100  0  0
           0  0    112 180340  76420 6429960    0    0     0     0   15  311  0  0 100  0  0
           0  0    112 180676  76420 6429960    0    0     0     0   15  328  0  0 100  0  0
           0  0    112 180680  76420 6429960    0    0     0     0   14  321  0  0 100  0  0
           0  0    112 180680  76420 6429960    0    0     0    77   35  350  0  0 100  0  0
           0  0    112 180804  76420 6429960    0    0     0     0   23  323  0  0 100  0  0
           0  0    112 180804  76420 6429960    0    0     0     0   10  308  0  0 100  0  0
           0  0    112 180928  76428 6429960    0    0    24    16   33  380  4  0 95  1  0
           0  0    112 180556  76428 6430036    0    0    52     0   46  458  4  0 95  1  0
           0  0    112 178656  76436 6430036    0    0     0   112   40 1288  1  0 99  0  0
           0  0    112 178820  76436 6430036    0    0     0     0   20  325  0  0 100  0  0
           2  0    112 178820  76436 6430036    0    0     0     0   24  352  0  0 100  0  0
           1  0    112 178824  76436 6430036    0    0     0     0   13  337  0  0 100  0  0
           0  0    112 179136  76436 6430036    0    0     0     0   22  414  0  0 100  0  0
           0  0    112 179136  76436 6430036    0    0     0    40   18  354  0  0 100  0  0
           0  0    112 178840  76440 6430036    0    0     0    28   44  420  0  0 100  0  0
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           1  0    112 178880  76448 6430036    0    0     0    16   31  342  0  0 100  0  0
           0  0    112 178288  76448 6430036    0    0     0     0   15  525  1  1 98  0  0
           0  0    112 178384  76448 6430040    0    0     0     0   13  322  0  0 100  0  0
           0  0    112 178384  76448 6430040    0    0     0   919   36  335  0  0 100  0  0
           1  0    112 178384  76452 6430040    0    0     0    48   26  357  0  0 100  0  0
           0  0    112 178536  76452 6430040    0    0     0     0   41  327  0  0 100  0  0
           0  0    112 178544  76452 6430040    0    0     0     0   42  330  0  0 100  0  0
           1  0    112 178560  76452 6430040    0    0     0     0   47  328  0  0 100  0  0
           0  0    112 178684  76452 6430040    0    0     0    30   37  347  0  0 100  0  0
           0  0    112 178676  76460 6430040    0    0     0    60   41  399  0  0 100  0  0
           0  0    112 178684  76460 6430040    0    0     0     0   19  336  0  0 100  0  0
           0  0    112 178684  76460 6430040    0    0     0     0   23  337  0  0 100  0  0
           1  0    112 178808  76460 6430040    0    0     0     0   16  331  0  0 100  0  0
           1  0    112 178932  76460 6430040    0    0     0     8   14  338  0  0 100  0  0
           0  0    112 178932  76460 6430040    0    0     0     0   30  326  0  0 100  0  0
           0  0    112 178784  76464 6430040    0    0     0    16   12  403  0  0 100  0  0
           0  0    112 178808  76464 6430040    0    0     0     0   13  330  0  0 100  0  0
           0  0    112 178808  76464 6430040    0    0     0     0    6  338  0  0 100  0  0
           0  0    112 178932  76472 6430040    0    0     0    25   20  357  0  0 100  0  0
           1  0    112 179056  76472 6430040    0    0     0     0   17  323  0  0 100  0  0
           1  0    112 179056  76472 6430040    0    0     0     0    8  316  0  0 100  0  0
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           2  0    112 179172  76480 6430040    0    0     0    20   20  436  0  0 100  0  0
           0  0    112 179296  76480 6430040    0    0     0     0   24  319  0  0 100  0  0
           0  0    112 179296  76480 6430040    0    0     0    28   51  329  0  0 100  0  0
           0  0    112 179296  76480 6430040    0    0     0     0   33  335  0  0 100  0  0
           1  0    112 179384  76480 6430040    0    0     0     0   13  338  0  0 100  0  0
           0  0    112 179508  76480 6430040    0    0     0     0   10  310  0  0 100  0  0
           0  0    112 179508  76492 6430044    0    0     0    65   31  373  0  0 100  0  0
           0  0    112 179508  76492 6430044    0    0     0    16   39  334  0  0 100  0  0
          
          
          Application running on the same host as the one serving the table:
          
          Read 1 row with 7 columns 100 times in 97ms
          Read 1 row with 8 columns 100 times in 3997ms
          Read 1 row with 1000 columns 100 times in 10460ms
          
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           2  0    116  35120  89092 6380936    0    0   307  1993    5   13  5  4 85  6  0
           0  0    116  34988  89092 6380936    0    0     0     4   29  385  0  0 100  0  0
           1  0    116  34988  89092 6380936    0    0     0     0    3  320  0  0 100  0  0
           1  0    116  34492  89096 6380936    0    0     0    36   22  425  0  0 100  0  0
           1  0    116  34616  89096 6380936    0    0     0     0   27  354  0  0 100  0  0
           1  0    116  34616  89096 6380936    0    0     0     0   13  338  0  0 100  0  0
           1  0    116  34864  89104 6380936    0    0     0    20   21  361  0  0 100  0  0
           0  0    116  34912  89104 6380936    0    0     0     0   12  351  0  0 100  0  0
           1  0    116  34912  89104 6380936    0    0     0    28   16  337  0  0 100  0  0
           1  0    116  28964  89104 6371916    0    0     0     0   20  964 12  2 86  0  0
           0  0    116  28388  89104 6358320    0    0     0     0  183 1078 22  1 77  0  0
           2  0    116  28384  89104 6358320    0    0     0    69   58  358  0  0 100  0  0
           0  0    116  27880  89112 6358312    0    0     0     4  111  937  0  1 99  0  0
           0  0    116  27832  89112 6358316    0    0     0     0   26  396  0  0 100  0  0
           1  0    116  27840  89116 6358316    0    0     0    96   41  386  0  0 100  0  0
           0  0    116  27964  89116 6358316    0    0     0     0   23  404  0  0 100  0  0
           4  0    116  25128  89116 6353476    0    0     0    58  134  774 14  0 85  0  0
           1  0    116  27368  89124 6351088    0    0     0    32  258 1977  4  1 96  0  0
           0  0    116  27368  89124 6351096    0    0     0     8   71  674  0  0 100  0  0
           0  0    116  26684  89124 6351096    0    0     0     6   13  710  0  0 100  0  0
           3  0    116  26960  89132 6351096    0    0     0    72   26  645  0  0 100  0  0
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           3  0    116  28552  89132 6342908    0    0     0    26   37  688 24  1 74  0  0
           1  0    116  26540  89132 6326496    0    0     0     0   21 1201 22  2 76  0  0
           1  0    116  27480  89140 6318244    0    0     0    16   19 1450 22  1 76  0  0
           1  0    116  28620  89140 6314104    0    0     0     0   26 2092 21  1 78  0  0
           0  0    116  40952  89148 6314096    0    0     0    28   40 3456 21  1 77  0  0
           3  0    116  40080  89148 6314104    0    0     0    12   25 1406 25  0 75  0  0
           1  0    116  33888  89148 6314104    0    0     0     0   20 1480 27  0 73  0  0
           1  0    116  24888  89148 6309924    0    0     0     0   17 1564 17  1 82  0  0
           2  0    116  27952  89156 6305776    0    0     0    16   20 1594 19  0 81  0  0
           0  0    116  28156  89156 6305788    0    0     0     0   24 1499 18  0 82  0  0
           0  0    116  42532  89164 6305788    0    0     0    32   40  990  9  0 91  0  0
           0  0    116 122276  89164 6305788    0    0     0     0   17  438  0  0 100  0  0
           0  0    116 122256  89164 6305756    0    0     0     0   19  425  0  0 100  0  0
           0  0    116 122552  89164 6305756    0    0     0     0   21  362  0  0 100  0  0
           0  0    116 122676  89172 6305748    0    0     0    32   31  361  0  0 100  0  0
           0  0    116 122676  89172 6305756    0    0     0    40   35  365  0  0 100  0  0
           0  0    116 123048  89172 6305756    0    0     0     0   67  367  0  0 100  0  0
           0  0    116 123048  89172 6305756    0    0     0     0   43  336  0  0 100  0  0
           0  0    116 123048  89172 6305756    0    0     0     0   57  380  0  0 100  0  0
           0  0    116 123544  89172 6305756    0    0     0     0   27  354  0  0 100  0  0
           0  0    116 123544  89180 6305748    0    0     0    28   16  372  0  0 100  0  0
          procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
           r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
           0  0    116 123544  89184 6305756    0    0     0    24   18  361  0  0 100  0  0
           0  0    116 123820  89184 6305756    0    0     0     0   20  356  0  0 100  0  0
           0  0    116 123820  89184 6305760    0    0     0     0   17  355  0  0 100  0  0
           0  0    116 123944  89184 6305760    0    0     0     0   23  388  0  0 100  0  0
           0  0    116 124184  89184 6305760    0    0     0    56   23  353  0  0 100  0  0
           0  0    116 124184  89200 6305744    0    0     0    89   39  392  0  0 100  0  0
           0  0    116 124184  89200 6305760    0    0     0     0   16  376  0  0 100  0  0
           0  0    116 124308  89200 6305760    0    0     0     0   12  366  0  0 100  0  0
          

          I think this issue is a red herring, and there is no basic difference between running the test program on the same host as the one serving the table vs running the test program on a different server.

          The difference when running on the same server is that the OS needs to do context switches between the client application and the region server.

          Jim Kellerman added a comment -

          Closing as invalid because this does not appear to be an HBase problem. Everything seems to be working as it should; the difference is how much CPU load/context switching is required when the client runs on the same machine as the region server.

          Jonathan Gray added a comment -

          Jim, I don't believe this issue should be closed. It may not have anything to do with our code, but it affects us in a very significant way so we need to get to the bottom of it. This is an exploratory issue, hopefully with a solution, but we're not done with it yet.

          Your conclusion is not correct. You are writing off the delay as context switching that occurs when the client is on the same machine. First of all, those context switches are orders of magnitude below the timings of these queries. The queries in question run 40 times slower (seems to be something about 4000ms, dunno what) when running local to the hosting node. This amount of time is clearly not explainable by the additional context switching of having these two things running concurrently.

          But more importantly, this explanation does not address what we're seeing. Given what you say above, we should be seeing ALL queries running slower by some fixed factor when running on the same node. But we don't. There is a very specific and definable range of payload sizes for which this extra delay of ~4 seconds exists. The 7-column case and the 1000-column case both perform nearly identically in both situations, so the effect of the context switching is negligible.

          Have you done network-level debugging? We need to figure out where in the chain the delay is introduced and go from there.

          There could be an issue in Linux, RPC, who knows... but we should keep digging whether or not we figure this out for 0.20

          Jim Kellerman added a comment -

          @Jgray

          If you'll note, there are almost an order of magnitude more context switches when run on the same host.

          On a different host we are hovering at about 300-400 cs/sec. On the same host, while the app runs, we are seeing 1200-1400 cs/sec.

          However I will try to gather more information with strace, etc.

          Jonathan Gray added a comment -

          The most important point is, if it's context switches, there should be a constant overhead for ALL queries. The 4 second delay shows up for a very specific slice of payload sizes and then goes away. So whatever is wrong seems related to payload size or the nature of the payload in some way.

          Jonathan Gray added a comment -

          I did misunderstand part of your explanation. The jump in CS is just in this delayed case.

          But we should still not close the issue. It's not the context switching caused by additional concurrency that hurts us, then; it's context switching because something else is happening. The jump in context switches is a symptom of something happening differently along the way for this particular range of payloads.

          Jonathan Gray added a comment -

          Again, I think our best approach will be to try to isolate further and further until we know what the culprit is. First off, does this happen in the client or the server?

          Jim Kellerman added a comment -

          Ran ReadDelayTest again after adding flush, compact and major compaction and waiting until all were complete before running the read test. No change.

          I have inserted timers in HRegionServer.getRow and can see increased times throughout the test, so I believe at least part of the delay is on the server side.

          Next step is to insert lower level timers to isolate Memcache or reading from store files as a culprit.
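
          The kind of ad-hoc timing described above amounts to wrapping the suspect call in System.nanoTime(). A minimal, generic sketch of such a timer follows; it is not the actual HRegionServer patch, and the class and label names are illustrative.

          public final class CallTimer {
            /** A server-side operation to be timed (Java 5/6 style, no lambdas). */
            public interface Timed<T> {
              T call() throws Exception;
            }

            /** Runs the operation, prints its elapsed time in microseconds, and returns its result. */
            public static <T> T time(String label, Timed<T> op) throws Exception {
              long start = System.nanoTime();
              T result = op.call();
              long micros = (System.nanoTime() - start) / 1000L;
              System.out.println(label + " took " + micros + " us");
              return result;
            }
          }

          Per-call numbers like those plotted in the attached graphs could be collected by wrapping the body of getRow (and later getClosest, seekTo, next) in a timer of this sort.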

          Jim Kellerman added a comment -

          Contribution of methods to response time. Clearly getFullFromStoreFile dominates.

          Jim Kellerman added a comment -

          Store.getClosest

          stack added a comment -

          What's the diagram saying? And how does getClosest relate to getFullFromStoreFile (is the latter what's taking the time)?

          Jim Kellerman added a comment (edited) -

          The first graph shows the time consumed by methods called from Store.getFull.

          getFullFromStoreFile is where most of the time is being spent. Those peaks are what cause the 100 gets to have a high average time.

          getClosest is the biggest contributor to the time spent in getFullFromStoreFile.

          I have data for methods called by getClosest, but have not graphed them yet.

          Jonathan Gray added a comment -

          What is the X-axis on these graphs? Iteration or number of columns retrieved?

          Jonathan Gray added a comment -

          Here is a graph I made at the same time I originally filed this issue.

          What it shows is a very explicit range of payloads for which an enormous, fixed delay occurs and then goes away as the returned row gets larger.

          I'm trying to match up this behavior with what you're plotting, Jim. Can you explain further what the axes are on yours?

          Jim Kellerman added a comment -

          In the first graph, the x-axis is the number of calls to getRow and the y-axis is the time in nanoseconds per call.

          The second graph's x-axis is the time for each call to getClosest.

          I'll post some better ones shortly.

          Jim Kellerman added a comment (edited) -

          All I am seeing now (on trunk) is increasing times:

          Read 1 row with 7 columns 100 times in 921ms
          Read 1 row with 8 columns 100 times in 4,089ms
          Read 1 row with 1000 columns 100 times in 16,330ms

          Read 1 row with 7 columns 100 times in 875ms
          Read 1 row with 8 columns 100 times in 4,025ms
          Read 1 row with 1000 columns 100 times in 16,860ms

          Read 1 row with 7 columns 100 times in 993ms
          Read 1 row with 8 columns 100 times in 4,087ms
          Read 1 row with 1000 columns 100 times in 16,530ms

          Which is what I would expect.

          Sometimes fetching 8 columns is faster than 7, as in:

          Read 1 row with 7 columns 100 times in 907ms
          Read 1 row with 8 columns 100 times in 622ms
          Read 1 row with 1000 columns 100 times in 15,062ms

          Jim Kellerman added a comment (edited) -

          I just ran it again twice and got increasing times:

          Read 1 row with 7 columns 100 times in 996ms
          Read 1 row with 8 columns 100 times in 4044ms
          Read 1 row with 1000 columns 100 times in 16175ms

          Read 1 row with 7 columns 100 times in 976ms
          Read 1 row with 8 columns 100 times in 4034ms
          Read 1 row with 1000 columns 100 times in 16240ms

          I have not observed the decrease between 8 and 1000 columns today at all.

          Jim Kellerman added a comment -

          Obviously, from this graph, not all of the time is consumed in getRow.

          We need to look elsewhere.

          Jim Kellerman added a comment -

          In this run (trunk latest as of 2009/05/31), times increased with the number of columns.

          7 columns took: 840ms
          8 columns took: 4080ms
          1000 columns took 16170ms

          Jim Kellerman added a comment -

          As one might expect, almost no seeking is necessary for 7 columns. Why it jumps so much for 8 columns (because 7 is at the beginning of the file?) is odd.

          What is expected is that more seeking is done for 1000 columns than for 8.

          Jim Kellerman added a comment -

          This graph, at least, makes sense. Not much 'nexting' is needed for 7 or 8 columns, but it jumps up for 1000 columns.

          Jim Kellerman added a comment -

          This also makes sense to me. getClosest is going to occupy a higher percentage of time for 7 or 8 columns than for 1000.

          I expect that marshalling data is the largest contributor to getRow(1000 columns), but I do not have hard data to support that.

          Jim Kellerman added a comment -

          Here you can see the blowup in elapsed time vs number of columns.

          Jim Kellerman added a comment -

          Based on the graph https://issues.apache.org/jira/secure/attachment/12409623/zoom+of+columns+vs+round-trip+blowup.jpg

          the blow-up in round-trip time happens between 8 and 23 columns (approximately - different runs will have different end points), but the amount of time spent in getRow remains pretty constant through this range.

          For larger numbers of columns (as depicted in https://issues.apache.org/jira/secure/attachment/12409622/getRow+%2B+round-trip+vs+%23+columns.jpg ), getRow and round-trip time seem to scale pretty linearly.

          This is pretty strange, but at least it doesn't seem related to server-side I/O.

          stack added a comment -

          So the time is not being spent in getRow? This scales linearly with number of columns? Time is elsewhere?

          Jim Kellerman added a comment -

          It would appear that the time explosion happens during deserialization.
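
          One way to test the deserialization hypothesis in isolation is to serialize a fetched row into an in-memory buffer and time readFields() alone, with no RPC or region server in the loop. A rough sketch follows; it assumes only that RowResult is a Hadoop Writable (as in the 0.19/0.20-era code) and the table name and row key are illustrative.

          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.client.HTable;
          import org.apache.hadoop.hbase.io.RowResult;
          import org.apache.hadoop.io.DataInputBuffer;
          import org.apache.hadoop.io.DataOutputBuffer;

          public class DeserializationTimer {
            public static void main(String[] args) throws Exception {
              HTable table = new HTable(new HBaseConfiguration(), "test_table");
              RowResult fetched = table.getRow("row_8");   // illustrative row key

              // Serialize the fetched row once into a reusable in-memory buffer.
              DataOutputBuffer out = new DataOutputBuffer();
              fetched.write(out);

              // Time readFields() alone -- no network, no region server involved.
              long start = System.nanoTime();
              for (int i = 0; i < 100; i++) {
                DataInputBuffer in = new DataInputBuffer();
                in.reset(out.getData(), out.getLength());
                RowResult copy = new RowResult();
                copy.readFields(in);
              }
              long micros = (System.nanoTime() - start) / 1000L;
              System.out.println("100 deserializations took " + micros + " us");
            }
          }

          If the 4-second blow-up does not reproduce with the Writable code alone, that would point back at the RPC transport rather than deserialization itself.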

          Jonathan Gray added a comment -

          Jim, can you run on current trunk, post-1304 commit?

          I will try to run a test tomorrow, but it's a busy day. Let's not hold up 0.20.0 for this if we can't figure it out; drop it to 0.20.1/0.21.0.

          Jim Kellerman added a comment -

          Will do tomorrow (Mon 6/8)

          Jim Kellerman added a comment -

          This problem still exists in trunk after HBASE-1304 was committed. I'm going to instrument the RPC layer tomorrow.

          Read 1 row with 6 columns 100 times in 53 ms
          Read 1 row with 7 columns 100 times in 81 ms
          Read 1 row with 8 columns 100 times in 4133 ms
          Read 1 row with 9 columns 100 times in 4000 ms
          Read 1 row with 10 columns 100 times in 4000 ms
          Read 1 row with 11 columns 100 times in 4000 ms
          Read 1 row with 12 columns 100 times in 4000 ms
          Read 1 row with 13 columns 100 times in 4000 ms
          Read 1 row with 14 columns 100 times in 4000 ms
          Read 1 row with 15 columns 100 times in 4000 ms
          Read 1 row with 16 columns 100 times in 4010 ms
          Read 1 row with 17 columns 100 times in 4000 ms
          Read 1 row with 18 columns 100 times in 4000 ms
          Read 1 row with 19 columns 100 times in 4000 ms
          Read 1 row with 20 columns 100 times in 4000 ms
          Read 1 row with 21 columns 100 times in 4000 ms
          Read 1 row with 22 columns 100 times in 4000 ms
          Read 1 row with 23 columns 100 times in 4000 ms
          Read 1 row with 24 columns 100 times in 83 ms
          Read 1 row with 25 columns 100 times in 72 ms
          Read 1 row with 26 columns 100 times in 63 ms
          Read 1 row with 27 columns 100 times in 66 ms
          Read 1 row with 28 columns 100 times in 71 ms
          Read 1 row with 29 columns 100 times in 67 ms
          Read 1 row with 30 columns 100 times in 88 ms
          Read 1 row with 31 columns 100 times in 78 ms
          Read 1 row with 32 columns 100 times in 174 ms
          Read 1 row with 33 columns 100 times in 208 ms
          Read 1 row with 34 columns 100 times in 62 ms
          Read 1 row with 35 columns 100 times in 130 ms
          Read 1 row with 36 columns 100 times in 137 ms
          Read 1 row with 37 columns 100 times in 142 ms
          Read 1 row with 38 columns 100 times in 279 ms
          Read 1 row with 39 columns 100 times in 184 ms
          Read 1 row with 40 columns 100 times in 367 ms
          Read 1 row with 41 columns 100 times in 202 ms
          Read 1 row with 42 columns 100 times in 452 ms
          Read 1 row with 43 columns 100 times in 206 ms
          Read 1 row with 44 columns 100 times in 669 ms
          Read 1 row with 45 columns 100 times in 242 ms
          Read 1 row with 46 columns 100 times in 199 ms
          Read 1 row with 47 columns 100 times in 250 ms
          Read 1 row with 48 columns 100 times in 62 ms
          Read 1 row with 49 columns 100 times in 62 ms
          Read 1 row with 50 columns 100 times in 81 ms
          Read 1 row with 51 columns 100 times in 66 ms
          Read 1 row with 52 columns 100 times in 65 ms
          Read 1 row with 53 columns 100 times in 64 ms
          Read 1 row with 54 columns 100 times in 65 ms
          Read 1 row with 55 columns 100 times in 113 ms
          Read 1 row with 56 columns 100 times in 89 ms
          Read 1 row with 57 columns 100 times in 72 ms
          Read 1 row with 58 columns 100 times in 67 ms
          Read 1 row with 59 columns 100 times in 67 ms
          Read 1 row with 60 columns 100 times in 66 ms
          Read 1 row with 61 columns 100 times in 78 ms
          Read 1 row with 62 columns 100 times in 88 ms
          Read 1 row with 63 columns 100 times in 72 ms
          Read 1 row with 64 columns 100 times in 85 ms
          Read 1 row with 65 columns 100 times in 82 ms
          Read 1 row with 66 columns 100 times in 122 ms
          Read 1 row with 67 columns 100 times in 82 ms
          Read 1 row with 68 columns 100 times in 110 ms
          Read 1 row with 69 columns 100 times in 168 ms
          Read 1 row with 70 columns 100 times in 85 ms
          Read 1 row with 71 columns 100 times in 85 ms
          Read 1 row with 72 columns 100 times in 86 ms
          Read 1 row with 73 columns 100 times in 109 ms
          Read 1 row with 74 columns 100 times in 85 ms
          Read 1 row with 75 columns 100 times in 97 ms
          Read 1 row with 76 columns 100 times in 108 ms
          Read 1 row with 77 columns 100 times in 128 ms
          Read 1 row with 78 columns 100 times in 91 ms
          Read 1 row with 79 columns 100 times in 102 ms
          Read 1 row with 80 columns 100 times in 221 ms
          Read 1 row with 81 columns 100 times in 138 ms
          Read 1 row with 82 columns 100 times in 260 ms
          Read 1 row with 83 columns 100 times in 474 ms
          Read 1 row with 84 columns 100 times in 99 ms
          Read 1 row with 85 columns 100 times in 110 ms
          Read 1 row with 86 columns 100 times in 93 ms
          Read 1 row with 87 columns 100 times in 313 ms
          Read 1 row with 88 columns 100 times in 231 ms
          Read 1 row with 89 columns 100 times in 246 ms
          Read 1 row with 90 columns 100 times in 189 ms
          Read 1 row with 91 columns 100 times in 162 ms
          Read 1 row with 92 columns 100 times in 153 ms
          Read 1 row with 93 columns 100 times in 157 ms
          Read 1 row with 94 columns 100 times in 144 ms
          Read 1 row with 95 columns 100 times in 156 ms
          Read 1 row with 96 columns 100 times in 172 ms
          Read 1 row with 97 columns 100 times in 362 ms
          Read 1 row with 98 columns 100 times in 139 ms
          Read 1 row with 99 columns 100 times in 171 ms
          Read 1 row with 100 columns 100 times in 103 ms
          Read 1 row with 101 columns 100 times in 149 ms
          Read 1 row with 102 columns 100 times in 104 ms
          Read 1 row with 103 columns 100 times in 410 ms
          Read 1 row with 104 columns 100 times in 594 ms
          Read 1 row with 1000 columns 100 times in 2041 ms
          
          Jim Kellerman added a comment -

          This problem still exists in trunk after HBASE-1304 was committed. I'm going to instrument the RPC layer tomorrow.

          Read 1 row with 6 columns 100 times in 53 ms
          Read 1 row with 7 columns 100 times in 81 ms
          Read 1 row with 8 columns 100 times in 4133 ms
          Read 1 row with 9 columns 100 times in 4000 ms
          Read 1 row with 10 columns 100 times in 4000 ms
          Read 1 row with 11 columns 100 times in 4000 ms
          Read 1 row with 12 columns 100 times in 4000 ms
          Read 1 row with 13 columns 100 times in 4000 ms
          Read 1 row with 14 columns 100 times in 4000 ms
          Read 1 row with 15 columns 100 times in 4000 ms
          Read 1 row with 16 columns 100 times in 4010 ms
          Read 1 row with 17 columns 100 times in 4000 ms
          Read 1 row with 18 columns 100 times in 4000 ms
          Read 1 row with 19 columns 100 times in 4000 ms
          Read 1 row with 20 columns 100 times in 4000 ms
          Read 1 row with 21 columns 100 times in 4000 ms
          Read 1 row with 22 columns 100 times in 4000 ms
          Read 1 row with 23 columns 100 times in 4000 ms
          Read 1 row with 24 columns 100 times in 83 ms
          Read 1 row with 25 columns 100 times in 72 ms
          Read 1 row with 26 columns 100 times in 63 ms
          Read 1 row with 27 columns 100 times in 66 ms
          Read 1 row with 28 columns 100 times in 71 ms
          Read 1 row with 29 columns 100 times in 67 ms
          Read 1 row with 30 columns 100 times in 88 ms
          Read 1 row with 31 columns 100 times in 78 ms
          Read 1 row with 32 columns 100 times in 174 ms
          Read 1 row with 33 columns 100 times in 208 ms
          Read 1 row with 34 columns 100 times in 62 ms
          Read 1 row with 35 columns 100 times in 130 ms
          Read 1 row with 36 columns 100 times in 137 ms
          Read 1 row with 37 columns 100 times in 142 ms
          Read 1 row with 38 columns 100 times in 279 ms
          Read 1 row with 39 columns 100 times in 184 ms
          Read 1 row with 40 columns 100 times in 367 ms
          Read 1 row with 41 columns 100 times in 202 ms
          Read 1 row with 42 columns 100 times in 452 ms
          Read 1 row with 43 columns 100 times in 206 ms
          Read 1 row with 44 columns 100 times in 669 ms
          Read 1 row with 45 columns 100 times in 242 ms
          Read 1 row with 46 columns 100 times in 199 ms
          Read 1 row with 47 columns 100 times in 250 ms
          Read 1 row with 48 columns 100 times in 62 ms
          Read 1 row with 49 columns 100 times in 62 ms
          Read 1 row with 50 columns 100 times in 81 ms
          Read 1 row with 51 columns 100 times in 66 ms
          Read 1 row with 52 columns 100 times in 65 ms
          Read 1 row with 53 columns 100 times in 64 ms
          Read 1 row with 54 columns 100 times in 65 ms
          Read 1 row with 55 columns 100 times in 113 ms
          Read 1 row with 56 columns 100 times in 89 ms
          Read 1 row with 57 columns 100 times in 72 ms
          Read 1 row with 58 columns 100 times in 67 ms
          Read 1 row with 59 columns 100 times in 67 ms
          Read 1 row with 60 columns 100 times in 66 ms
          Read 1 row with 61 columns 100 times in 78 ms
          Read 1 row with 62 columns 100 times in 88 ms
          Read 1 row with 63 columns 100 times in 72 ms
          Read 1 row with 64 columns 100 times in 85 ms
          Read 1 row with 65 columns 100 times in 82 ms
          Read 1 row with 66 columns 100 times in 122 ms
          Read 1 row with 67 columns 100 times in 82 ms
          Read 1 row with 68 columns 100 times in 110 ms
          Read 1 row with 69 columns 100 times in 168 ms
          Read 1 row with 70 columns 100 times in 85 ms
          Read 1 row with 71 columns 100 times in 85 ms
          Read 1 row with 72 columns 100 times in 86 ms
          Read 1 row with 73 columns 100 times in 109 ms
          Read 1 row with 74 columns 100 times in 85 ms
          Read 1 row with 75 columns 100 times in 97 ms
          Read 1 row with 76 columns 100 times in 108 ms
          Read 1 row with 77 columns 100 times in 128 ms
          Read 1 row with 78 columns 100 times in 91 ms
          Read 1 row with 79 columns 100 times in 102 ms
          Read 1 row with 80 columns 100 times in 221 ms
          Read 1 row with 81 columns 100 times in 138 ms
          Read 1 row with 82 columns 100 times in 260 ms
          Read 1 row with 83 columns 100 times in 474 ms
          Read 1 row with 84 columns 100 times in 99 ms
          Read 1 row with 85 columns 100 times in 110 ms
          Read 1 row with 86 columns 100 times in 93 ms
          Read 1 row with 87 columns 100 times in 313 ms
          Read 1 row with 88 columns 100 times in 231 ms
          Read 1 row with 89 columns 100 times in 246 ms
          Read 1 row with 90 columns 100 times in 189 ms
          Read 1 row with 91 columns 100 times in 162 ms
          Read 1 row with 92 columns 100 times in 153 ms
          Read 1 row with 93 columns 100 times in 157 ms
          Read 1 row with 94 columns 100 times in 144 ms
          Read 1 row with 95 columns 100 times in 156 ms
          Read 1 row with 96 columns 100 times in 172 ms
          Read 1 row with 97 columns 100 times in 362 ms
          Read 1 row with 98 columns 100 times in 139 ms
          Read 1 row with 99 columns 100 times in 171 ms
          Read 1 row with 100 columns 100 times in 103 ms
          Read 1 row with 101 columns 100 times in 149 ms
          Read 1 row with 102 columns 100 times in 104 ms
          Read 1 row with 103 columns 100 times in 410 ms
          Read 1 row with 104 columns 100 times in 594 ms
          Read 1 row with 1000 columns 100 times in 2041 ms
          Jonathan Gray added a comment -

          Downgrading from critical to major. We need to fix this, but with no real leads yet we should not hold up any releases.

          Jim Kellerman added a comment -

          Tried running the read test backwards (reading from 104 columns down to 6 columns). The first few rows were slower, but the big blip still existed between 23 and 10 columns. Something is really goofy.

          Read 1 row with 104 columns 100 times in 1170 ms
          Read 1 row with 103 columns 100 times in 313 ms
          Read 1 row with 102 columns 100 times in 820 ms
          Read 1 row with 101 columns 100 times in 809 ms
          Read 1 row with 100 columns 100 times in 403 ms
          Read 1 row with 99 columns 100 times in 704 ms
          Read 1 row with 98 columns 100 times in 317 ms
          Read 1 row with 97 columns 100 times in 246 ms
          Read 1 row with 96 columns 100 times in 169 ms
          Read 1 row with 95 columns 100 times in 179 ms
          ...
          Read 1 row with 24 columns 100 times in 65 ms
          Read 1 row with 23 columns 100 times in 3999 ms
          Read 1 row with 22 columns 100 times in 4000 ms
          Read 1 row with 21 columns 100 times in 4000 ms
          Read 1 row with 20 columns 100 times in 4000 ms
          Read 1 row with 19 columns 100 times in 4000 ms
          Read 1 row with 18 columns 100 times in 4000 ms
          Read 1 row with 17 columns 100 times in 4000 ms
          Read 1 row with 16 columns 100 times in 4000 ms
          Read 1 row with 15 columns 100 times in 4000 ms
          Read 1 row with 14 columns 100 times in 4000 ms
          Read 1 row with 13 columns 100 times in 4000 ms
          Read 1 row with 12 columns 100 times in 4000 ms
          Read 1 row with 11 columns 100 times in 4000 ms
          Read 1 row with 10 columns 100 times in 4000 ms
          Read 1 row with 9 columns 100 times in 4000 ms
          Read 1 row with 8 columns 100 times in 4000 ms
          Read 1 row with 7 columns 100 times in 42 ms
          

          Go figure...
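          For reference, here is a minimal sketch of the kind of timing loop these tests run (the actual test is the attached ReadDelayTest.java; the table name is taken from the output above, while the column family, row naming, and the use of the newer Get/Result client API are assumptions for illustration):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.client.Get;
          import org.apache.hadoop.hbase.client.HTable;
          import org.apache.hadoop.hbase.client.Result;
          import org.apache.hadoop.hbase.util.Bytes;

          public class ReadTimingSketch {
            public static void main(String[] args) throws Exception {
              Configuration conf = HBaseConfiguration.create();
              HTable table = new HTable(conf, "test_table");      // table name taken from the test output
              byte[] family = Bytes.toBytes("cols");              // hypothetical column family
              for (int columns = 104; columns >= 6; columns--) {  // run backwards, as in the test above
                byte[] row = Bytes.toBytes("row-" + columns);     // hypothetical: one pre-loaded row per column count
                long start = System.currentTimeMillis();
                for (int i = 0; i < 100; i++) {
                  Result r = table.get(new Get(row).addFamily(family));
                  if (r.isEmpty()) {
                    throw new IllegalStateException("row with " + columns + " columns not found");
                  }
                }
                System.out.println("Read 1 row with " + columns + " columns 100 times in "
                    + (System.currentTimeMillis() - start) + " ms");
              }
              table.close();
            }
          }

          Pre-loading one row per column count and reusing it keeps the measurement focused on read latency rather than write or flush behaviour.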

          Jonathan Gray added a comment -

          Any progress on this?

          If not, let's punt to 0.20.1 so we can clear up the remaining 0.20.0 issues.

          Jim Kellerman added a comment -
          • It's not the RPC.
          • It's not the server.
          • It (sort of) appears to be in deserializing the result, but I don't understand why it happens only for those keys, or why local is different from remote (see the deserialization-timing sketch after this comment).
          • It balloons up over the same set of rows whether you run the reads forwards or backwards.

          I suppose it could have something to do with the row/column keys for those rows, but I don't know what (and again, why local and not remote?).

          Punt it to 0.21.
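          A rough sketch of one way to check the deserialization hypothesis in isolation, assuming the Writable-based Result of that era (Result.write/readFields and the no-arg constructor); this is illustrative only, not the instrumentation actually used:

          import java.io.ByteArrayInputStream;
          import java.io.ByteArrayOutputStream;
          import java.io.DataInputStream;
          import java.io.DataOutputStream;

          import org.apache.hadoop.hbase.client.Result;

          public class DeserializationTimingSketch {
            // Serialize a Result once, then time repeated readFields() calls with no network involved.
            static long timeDeserializationMs(Result result, int iterations) throws Exception {
              ByteArrayOutputStream bos = new ByteArrayOutputStream();
              result.write(new DataOutputStream(bos));            // old Writable serialization (assumed API)
              byte[] serialized = bos.toByteArray();

              long start = System.nanoTime();
              for (int i = 0; i < iterations; i++) {
                Result copy = new Result();                        // Writable no-arg constructor (assumed API)
                copy.readFields(new DataInputStream(new ByteArrayInputStream(serialized)));
              }
              return (System.nanoTime() - start) / 1000000L;       // elapsed milliseconds
            }
          }

          If the stall were really in readFields, it would reproduce here without any sockets; if these numbers stay flat while the end-to-end numbers balloon, the time is being lost somewhere in the transport.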

          Jonathan Gray added a comment -

          Anyone want to dig into this again?

          stack added a comment -

          Moved from 0.21 to 0.22 just after the merge of the old 0.20 branch into TRUNK.

          stack added a comment -

          Moving out of 0.92.0. Pull it back in if you think differently.

          James Fitch added a comment -

          After seeing the same 40ms delay during some testing of random reads against HBase, we were able to determine that it was caused by the interaction of Nagle's algorithm with TCP delayed ACKs (the 40ms "ACK delay"). Setting "ipc.server.tcpnodelay" to "true" resolved the issue for us. Further discussion can be found in the linked issues.
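          For anyone hitting the same symptom, that property normally goes into hbase-site.xml on each region server (a restart is required); a minimal excerpt is below, with the caveat that later releases also expose an hbase.ipc.server.tcpnodelay variant of the same setting:

          <!-- hbase-site.xml on each region server: disable Nagle's algorithm on the IPC server sockets -->
          <property>
            <name>ipc.server.tcpnodelay</name>
            <value>true</value>
          </property>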

          stack added a comment -

          Any new dev want to take on proving this phenomenon is caused by Nagle's algorithm? The 40ms delay does seem to correlate strongly.
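          One way to demonstrate the effect without HBase at all is a plain loopback echo test using the write-write-read pattern that triggers the Nagle/delayed-ACK stall. The sketch below is hypothetical (names and byte counts are arbitrary), but on Linux the TCP_NODELAY=false run should show roughly 40 ms per round trip:

          import java.io.DataInputStream;
          import java.io.OutputStream;
          import java.net.ServerSocket;
          import java.net.Socket;

          public class NagleDelayDemo {
            public static void main(String[] args) throws Exception {
              final ServerSocket server = new ServerSocket(0);

              // Echo server: for each of two client connections, read a 2-byte request and send a 1-byte reply.
              Thread serverThread = new Thread(() -> {
                try {
                  for (int c = 0; c < 2; c++) {
                    try (Socket s = server.accept()) {
                      DataInputStream in = new DataInputStream(s.getInputStream());
                      OutputStream out = s.getOutputStream();
                      byte[] request = new byte[2];
                      for (int i = 0; i < 25; i++) {
                        in.readFully(request);   // blocks until both bytes of the request arrive
                        out.write(0);            // tiny reply
                        out.flush();
                      }
                    }
                  }
                } catch (Exception e) {
                  e.printStackTrace();
                }
              });
              serverThread.start();

              for (boolean noDelay : new boolean[] { false, true }) {
                try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
                  client.setTcpNoDelay(noDelay);
                  OutputStream out = client.getOutputStream();
                  DataInputStream in = new DataInputStream(client.getInputStream());
                  long start = System.currentTimeMillis();
                  for (int i = 0; i < 25; i++) {
                    out.write(1);                // first small write goes out immediately
                    out.write(2);                // second small write is held by Nagle until the first is ACKed,
                                                 // and the server's delayed ACK can take ~40 ms to arrive
                    in.readFully(new byte[1]);   // wait for the reply
                  }
                  System.out.println("TCP_NODELAY=" + noDelay + ": "
                      + (System.currentTimeMillis() - start) + " ms for 25 tiny round trips");
                }
              }
              serverThread.join();
              server.close();
            }
          }

          If the run without TCP_NODELAY lands near 25 x 40 ms (about a second) while the TCP_NODELAY run is near zero, that is the same signature as the getRow numbers above.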

          Lars Hofhansl added a comment -

          If this is caused by Nagle's algorithm, it should happen regardless of whether the DN is local to the RS or not, right? (Or maybe we're seeing an issue different from the OP's.)

          stack added a comment -

          Resolving as no longer valid. Looks like Nagle's algorithm anyway.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Jonathan Gray
            • Votes:
              0
            • Watchers:
              4
