Hadoop Common
HADOOP-2434

MapFile.get on HDFS in TRUNK is WAY!!! slower than 0.15.x

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: io
    • Labels: None

      Description

      The stall happens down in SequenceFile, in the first call to getPos inside readRecordLength. I tried johano's patch from HADOOP-2172, which restores the positional cache, but that didn't seem to be the issue here.

      Here is data to support my assertion.

      I wrote a little program to make a MapFile of 1M records. I then did 1M random reads from the same file. Below are timings from 0.15.0 and from TRUNK as of this afternoon.

      0.15.x branch:

      [stack@aa0-000-12 branch-0.15]$ ./bin/hadoop org.apache.hadoop.io.TestMapFile
      .07/12/15 01:29:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      07/12/15 01:29:02 INFO io.TestMapFile: Writing 1000000 rows to testMapFileRandomAccess
      07/12/15 01:32:04 INFO io.TestMapFile: Writing 1000000 records took 182009ms
      07/12/15 01:32:04 INFO io.TestMapFile: Reading 1000000 random rows
      07/12/15 01:48:02 INFO io.TestMapFile: Reading 1000000 random records took 958243ms
      Time: 1,140.652
      OK (1 test)
      

      For the test below, using TRUNK r604352, I amended the test so it outputs a log message every 100k reads:

      [stack@aa0-000-12 hadoop-trunk]$ ./bin/hadoop org.apache.hadoop.io.TestMapFile
      .07/12/15 01:56:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      07/12/15 01:56:34 INFO io.TestMapFile: Writing 1000000 rows to testMapFileRandomAccess
      07/12/15 01:59:38 INFO io.TestMapFile: Writing 1000000 records took 183986ms
      07/12/15 01:59:38 INFO io.TestMapFile: Reading 1000000 random rows
      .........
      

      After 20 minutes it still hadn't printed the 'read 100k' message (I had to leave – will fill in final figures later).
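
      The test is essentially a write-then-random-read loop over a MapFile. As a point of reference, here is a minimal sketch of that pattern, assuming the standard MapFile API (class name, key format, and payload are hypothetical; the actual TestMapFile.java is attached below):

      import java.util.Random;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.io.MapFile;
      import org.apache.hadoop.io.Text;

      public class MapFileRandomAccessSketch {
        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          // Picks up whichever dfs cluster the configuration on the classpath names.
          FileSystem fs = FileSystem.get(conf);
          String dir = "testMapFileRandomAccess";
          int rows = 1000000;

          // MapFile requires keys appended in sorted order, hence the zero-padding.
          MapFile.Writer writer =
              new MapFile.Writer(conf, fs, dir, Text.class, Text.class);
          Text key = new Text();
          Text value = new Text("fixed-size-payload");
          long start = System.currentTimeMillis();
          for (int i = 0; i < rows; i++) {
            key.set(String.format("%010d", i));
            writer.append(key, value);
          }
          writer.close();
          System.out.println("Writing " + rows + " records took "
              + (System.currentTimeMillis() - start) + "ms");

          // Read the same rows back in random order; each get() is a seek in
          // the index followed by a positioned read of the data file.
          MapFile.Reader reader = new MapFile.Reader(fs, dir, conf);
          Random random = new Random();
          Text result = new Text();
          start = System.currentTimeMillis();
          for (int i = 0; i < rows; i++) {
            key.set(String.format("%010d", random.nextInt(rows)));
            reader.get(key, result);
          }
          reader.close();
          System.out.println("Reading " + rows + " random records took "
              + (System.currentTimeMillis() - start) + "ms");
        }
      }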

      Attachments

      1. 2434.patch
        0.6 kB
        stack
      2. TestMapFile.java
        3 kB
        stack
      3. TestMapFile.java
        3 kB
        stack

        Activity

        stack added a comment -

        Code to demonstrate slowness.

        Copy the file to src/test/org/apache/hadoop/io. Build. Then point at a dfs cluster. Run the test like this so you pick up the configuration that points to a dfs cluster:

        $ ./bin/hadoop org.apache.hadoop.io.TestMapFile
        
        Paul Saab added a comment -

        Tracing the DataXceiveServer shows that the reads are occurring 1 byte at a time.

        8787  recvfrom(42, "\0\7Q\\\360\342\324\252<\nN\0\0\0\0\0008\350\263\0\0\0\0\0\34\225V", 4096, 0, NULL, NULL) = 27
        8787  stat("/disk3/nutch/data/filesystem/data/current/subdir40/blk_6697102048916933198", {st_mode=S_IFREG|0664, st_size=5602825, ...}) = 0
        8787  stat("/disk3/nutch/data/filesystem/data/current/subdir40/blk_6697102048916933198.meta", {st_mode=S_IFREG|0664, st_size=43783, ...}) = 0
        8787  stat("/disk3/nutch/data/filesystem/data/current/subdir40/blk_6697102048916933198", {st_mode=S_IFREG|0664, st_size=5602825, ...}) = 0
        8787  open("/disk3/nutch/data/filesystem/data/current/subdir40/blk_6697102048916933198.meta", O_RDONLY) = 75
        8787  fstat(75, {st_mode=S_IFREG|0664, st_size=43783, ...}) = 0
        8787  stat("/disk3/nutch/data/filesystem/data/current/subdir40/blk_6697102048916933198.meta", {st_mode=S_IFREG|0664, st_size=43783, ...}) = 0
        8787  read(75, "\0", 1)                 = 1
        8787  read(75, "\1", 1)                 = 1
        8787  read(75, "\1", 1)                 = 1
        8787  read(75, "\0", 1)                 = 1
        8787  read(75, "\0", 1)                 = 1
        8787  read(75, "\2", 1)                 = 1
        8787  read(75, "\0", 1)                 = 1
        8787  read(75, "\337", 1)               = 1
        8787  read(75, "\34", 1)                = 1
        8787  read(75, "\352", 1)               = 1
        8787  read(75, "a", 1)                  = 1
        ....
        continues like this
        
        Paul Saab added a comment (edited) -

        Just more data points after tracing for longer:

        20132 open("/disk3/nutch/data/filesystem/data/current/subdir3/blk_6879048861456107903.meta", O_RDONLY) = 57
        20132 fstat(57, {st_mode=S_IFREG|0664, st_size=437691, ...}) = 0
        20132 stat("/disk3/nutch/data/filesystem/data/current/subdir3/blk_6879048861456107903.meta", {st_mode=S_IFREG|0664, st_size=437691, ...}) = 0
        20132 read(57, "\0", 1)                 = 1
        20132 read(57, "\1", 1)                 = 1
        20132 read(57, "\1", 1)                 = 1
        20132 read(57, "\0", 1)                 = 1
        20132 read(57, "\0", 1)                 = 1
        20132 read(57, "\2", 1)                 = 1
        20132 read(57, "\0", 1)                 = 1
        20132 read(57, "\320", 1)               = 1
        20132 read(57, "\314", 1)               = 1
        20132 read(57, "\6", 1)                 = 1
        20132 read(57, "\314", 1)               = 1
        20132 read(57, "Y", 1)                  = 1
        20132 read(57, "\34", 1)                = 1
        20132 read(57, "\233", 1)               = 1
        20132 read(57, "z", 1)                  = 1
        20132 read(57, "\210", 1)               = 1
        20132 read(57, "\221", 1)               = 1
        20132 read(57, "^", 1)                  = 1
        20132 read(57, "k", 1)                  = 1
        20132 read(57, "\255", 1)               = 1
        20132 read(57, "\257", 1)               = 1
        20132 read(57, "\314", 1)               = 1
        20132 read(57, "o", 1)                  = 1
        ....
        20132 read(57, "\262", 1)               = 1
        20132 stat("/disk3/nutch/data/filesystem/data/current/subdir3/blk_6879048861456107903", {st_mode=S_IFREG|0664, st_size=56023428, ...}) = 0
        20132 open("/disk3/nutch/data/filesystem/data/current/subdir3/blk_6879048861456107903", O_RDONLY) = 58
        20132 fstat(58, {st_mode=S_IFREG|0664, st_size=56023428, ...}) = 0
        20132 lseek(58, 30691328, SEEK_SET)     = 30691328
        20132 read(58, "\206&eu\30a\3454|!J\333\351\310\34\205\2\5@\265\254\\I\376\212>1\3174\230\10\203"..., 4096) = 4096
        20132 read(57, "\22", 1)                = 1
        20132 read(57, "Q", 1)                  = 1
        20132 read(57, "\26", 1)                = 1
        20132 read(57, "3", 1)                  = 1
        20132 read(57, "\234", 1)               = 1
        20132 read(57, "m", 1)                  = 1
        ...
        20132 read(57, "m", 1)                  = 1
        20132 read(57, "\263", 1)               = 1
        20132 read(57, "\306", 1)               = 1
        20132 read(57, "C", 1)                  = 1
        20132 read(57, "\332", 1)               = 1
        20132 read(57, "\223", 1)               = 1
        20132 read(57, "\353", 1)               = 1
        20132 read(57, "\314", 1)               = 1
        20132 read(57, "\242", 1)               = 1
        20132 sendto(44, "\0\0\1\0\0\2\0\0\0\0\0\1\324P\0\0\0\2\0\206&eu\30a\3454|!J\333\351"..., 3659, 0, NULL, 0) = 3659
        20132 read(58, "P\244\266!\221\310}\213\250\"\274\230\264$O\245,\242\201\266\364\267\254\260\225\335x\276\325zq\276"..., 4096) = 4096
        20132 sendto(44, "\304yXh{z\23\254\r\331\vY\215\305\307\20\243\27\261*fB\10\33\217\377\356\350\5\327\330\271"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "rb\336\307\261\313\266L\212\215\276\364\373\357\347\273\324*\2745s%\325\366\247\255\333\345\272\334fL"..., 4096) = 4096
        20132 sendto(44, "\32\231\252\36\250\252\341\363\226\227\n\200o\322\303\274Lj]Z]U\236\217\224\216\366J\307\270M\225"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "eC\335\314\32\214\215tk\352\243wl\\\300n\34\233t\324s\200D\365\320\225.\253\6\16F\366"..., 4096) = 4096
        20132 sendto(44, "#=\263\243\np=\363\270E\300\3351\267\254\200\203\260O\336/\22c`\331\216\276\230L\207\260\343"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\v\272\236\216\21^\16\224\223|\2051G#At\203.\7o\177D\307\357p\361\312\30\247\261l\25"..., 4096) = 4096
        20132 sendto(44, "54809x\234\35\223Gn\304@\20\3/>\372+\3t\16\237\364\233]\255=,\4Mh"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "^\17\324I\377\377\377\377o\372\233(\346\36J\237\244\307\246v\312\325\rs\0\0\2\35\0\0\0\v"..., 4096) = 4096
        20132 sendto(44, "\242\252\230\20\302\202\353\227\3470C2\223\327d\322\275xS\273\327B\263\235lo\203\261\327Q\347\266"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "u1\334\32\355Y\26\37Zh\214\330hWms\317\337\230Z\225\213zj\32\356\25\252\271upm"..., 4096) = 4096
        20132 sendto(44, "\340a\205&i\323\320P\243\1\322P\340\266\4\332\354z\0;\244q\227\364\3523R\10\247\246\334\23"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\260\306\210_`\30\3268i\332\\l\221\207\2115\360\366\216&\\\364\362!/\5\370\365\206\262\341\36"..., 4096) = 4096
        20132 sendto(44, "\315+)\344\305\26\333\243\351\353\n\317\32\355v\234\256\254\230\234_\367\264\313\337\n\324\217\345NV\214"..., 3640, 0, NULL, 0) = 3640
        20132 sendto(44, "\260\306\210_`\30\3268i\332\\l\221\207\2115\360\366\216&\\\364\362!/\5\370\365\206\262\341\36"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\3\225zmU\210\361\321\231\16\271t\202\256\5\177\1\277\316+\323\330\0\0\2\37\0\0\0\v\n0"..., 4096) = 4096
        20132 sendto(44, "\3030\20C\177\372\221\313\10\230}\271d\316\3347Ea \206m\211\344\220T\276?\237\226U\253\310"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\205]\376\313d\307\261\217\355\303!\347\30\246\343\201\234\251.^u\t\312\250\34\331VK\341\351X "..., 4096) = 4096
        20132 sendto(44, "\240\333\376*75\353\201G\6\234\20X\32c\367F\204\343\217_~\360\365\275\214\20\24\35158\301"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\7\335,\224\346x\236\360\300 \35\334\330\252\222\302\340!\373\305\27-\16\21$a\251y\367\372\371\305"..., 4096) = 4096
        20132 sendto(44, "\3\346\35\301Xa\342\226\232\345\231\233N\\\224_\335\273a\264\274\0\5.\227\20\\\f\31\"\211\323"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\200\260\315\16\265\r\363X<\227}\\)[\32%K\240\6\16\22\351\16!Q\222t\263\275\232S\270"..., 4096) = 4096
        20132 sendto(44, "iT\271\375\332J\315\355\253\330L\256t\275\265\261X\230\r|\235\303QQ\2623\233\357\260\0018\254"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "h\215\357\345\205_~3\333GaL\324\224\2566\362\200\310\264j\337\233\362\6\301\272\264\7\344c\262"..., 4096) = 4096
        20132 sendto(44, "\374lj4\346\33\264Q\32Ij\314\226\3\n;\334\r\\\16^2\'H\31636\372\365\346^\340"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\221\210\201\214\2073\253D\301\252\17\206hoT\353&\377\f\376\232\2\334@ /\3\35&\26z\241"..., 4096) = 4096
        20132 sendto(44, "\335\177\2300\321\330\0\0\2#\0\0\0\v\n0000054873x\234%\223Yn\\"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\243\264\0\203\206\330+h)\n\221\314\374\270sc\2653\36D@p\365\3301\20\21\370u\236a\22"..., 4096) = 4096
        20132 sendto(44, "\36@\325o\16O\16Rk\367\214\205\3\352\20\240y\353\301\2615\303p\373@0MdXnE9"..., 3640, 0, NULL, 0 <unfinished ...>
        20132 <... sendto resumed> )            = 3640
        20132 sendto(44, "\243\264\0\203\206\330+h)\n\221\314\374\270sc\2653\36D@p\365\3301\20\21\370u\236a\22"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\236\r3\371\334\27\274VZ\362\353\t\17\253KW\256\233\355l\273\252\357(\213Sd\255I\323\252\233"..., 4096) = 4096
        20132 sendto(44, "\f\332l\300\373\225\231\267eX\246VOuw\224[N\310\364\2\224\335\351\17 \213\232\262a\352\254"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\32\3\243\272O\273z<\232~\325@=PF\306\224\307\314E\316\356r;\34\34\203\'.x::"..., 4096) = 4096
        20132 sendto(44, "\204c`:BP\327d\341\360;\346\203\350^x#\214\260]\310}!#{\277#\373\253\234\312/"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "K\26\333\332+\363\215E_\330\325\213\242\0$.)\206\320\201:\246\362\4\330K\222\304=\325\10#"..., 4096) = 4096
        20132 sendto(44, "\260 \201\23\234\333\5\330O\336\360\177\323\300\326\205\0\0\2 \0\0\0\v\n0000054"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "\37\344\277\323#\0\0\2%\0\0\0\v\n0000054915x\234%\223\331m#A"..., 4096) = 4096
        20132 sendto(44, "t\266\372\340\331\226\347\304d\217\321R\340B+n`X\223\325\371A\302\303\321iP\211,\253\375*"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "Z\353e\370\214d6\253\234\217*y\213\254PF\247\3454\227\20\16#\2535p]\20\277\361\20\216"..., 4096) = 4096
        20132 sendto(44, "\v\206\371pl\271O_E4=w\221\217\247\202\351c\3241\251\243\277\2357\323i<$\264[m"..., 3640, 0, NULL, 0 <unfinished ...>
        20132 <... sendto resumed> )            = 3640
        20132 read(58, "\250\336t\225\1\222\31\232\342\337\236\222F\\\220Z\233|dK\267{\356\342EM\362^\354W\34\276"..., 4096) = 4096
        20132 sendto(44, "\2465\245w\370\333I\17\263u\227\245\253\313\347m\255+Yu\247\276\325Y=\311\272\256\322\353\316\302"..., 3640, 0, NULL, 0) = 3640
        20132 read(58, "Q\250\0060\326\16g\362\340\365\350\v\36\27)\222\327\344\2655\300\356l\213\222\200\2\225\16\221\247L"..., 4096) = 4096
        
        stack added a comment -

        Another version of the attachment. It does 100k reads instead of 1M, because 1M takes too long (and it outputs state every 1000 reads so folks don't think the program is dead). Here is output from my cluster, first using branch-0.15:

        [stack@aa0-000-12 branch-0.15]$ nohup ./bin/hadoop org.apache.hadoop.io.TestMapFile > testmapfile.timings < /dev/null &
        [1] 19284
        [stack@aa0-000-12 branch-0.15]$ tail -f testmapfile.timings 
        .07/12/15 20:02:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
        07/12/15 20:02:17 INFO io.TestMapFile: Writing 100000 rows to testMapFileRandomAccess
        07/12/15 20:02:38 INFO io.TestMapFile: Writing 100000 records took 20975ms
        07/12/15 20:02:38 INFO io.TestMapFile: Reading 100000 random rows
        07/12/15 20:02:40 INFO io.TestMapFile: Read 1000
        07/12/15 20:02:41 INFO io.TestMapFile: Read 2000
        07/12/15 20:02:43 INFO io.TestMapFile: Read 3000
        07/12/15 20:02:44 INFO io.TestMapFile: Read 4000
        07/12/15 20:02:45 INFO io.TestMapFile: Read 5000
        07/12/15 20:02:47 INFO io.TestMapFile: Read 6000
        07/12/15 20:02:48 INFO io.TestMapFile: Read 7000
        07/12/15 20:02:49 INFO io.TestMapFile: Read 8000
        07/12/15 20:02:50 INFO io.TestMapFile: Read 9000
        07/12/15 20:02:52 INFO io.TestMapFile: Read 10000
        ...
        

        Now using TRUNK:

        [stack@aa0-000-12 hadoop-trunk]$ nohup ./bin/hadoop org.apache.hadoop.io.TestMapFile > testmapfile.timings < /dev/null &
        [stack@aa0-000-12 hadoop-trunk]$ tail -f testmapfile.timings
        .07/12/15 19:44:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
        07/12/15 19:44:27 INFO io.TestMapFile: Writing 100000 rows to testMapFileRandomAccess
        07/12/15 19:44:48 INFO io.TestMapFile: Writing 100000 records took 20326ms
        07/12/15 19:44:48 INFO io.TestMapFile: Reading 100000 random rows
        07/12/15 19:49:17 INFO io.TestMapFile: Read 1000
        07/12/15 19:53:40 INFO io.TestMapFile: Read 2000
        07/12/15 19:57:47 INFO io.TestMapFile: Read 3000
        ...
        

        TRUNK is taking about 13 minutes to read 3k rows. Branch 0.15 does the same random reads in ~5 seconds. That's more than 2 orders of magnitude slower.
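
        (As a rough check: ~780,000 ms / 3,000 reads is about 260 ms per random read on TRUNK, versus ~5,000 ms / 3,000 reads, under 2 ms per read, on branch-0.15, a slowdown of roughly 150x.)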

        I retried johano's position-cache patch to be sure, and it doesn't fix the problem.

        Paul's observation that we are reading a byte at a time would seem to jibe with the timings seen.

        Paul Saab added a comment -

        It turns out HADOOP-1989 is causing this regression. stack has verified that after reverting to before svn r595509, random reads are once again fast.

        Here's the stack trace of what's causing the one-byte reads:

        Thread 36193 (org.apache.hadoop.dfs.DataNode$DataXceiver@45f73f44):
          State: RUNNABLE
          Blocked count: 0
          Waited count: 0
          Stack:
            java.io.FileInputStream.read(Native Method)
            org.apache.hadoop.dfs.FSDatasetInterface$MetaDataInputStream.read(FSDatasetInterface.java:63)
            java.io.InputStream.read(InputStream.java:163)
            java.io.InputStream.skip(InputStream.java:208)
            java.io.BufferedInputStream.skip(BufferedInputStream.java:349)
            java.io.FilterInputStream.skip(FilterInputStream.java:125)
            org.apache.hadoop.io.IOUtils.skipFully(IOUtils.java:115)
            org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1326)
            org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:888)
            org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:840)
            java.lang.Thread.run(Thread.java:619)
        

        MetaDataInputStream.read() is called over and over, reading 1 byte at a time.
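
        To make the failure mode concrete, here is a minimal sketch (not the actual FSDatasetInterface code): java.io.InputStream.skip() fills a scratch buffer through read(byte[], int, int), and the inherited default of that method loops over the single-byte read(), so a wrapper that overrides only the other read forms degrades every skip into one-byte reads.

        import java.io.IOException;
        import java.io.InputStream;

        // Minimal illustration of the pitfall, not the real MetaDataInputStream.
        class NarrowWrapper extends InputStream {
          private final InputStream inStream;

          NarrowWrapper(InputStream inStream) {
            this.inStream = inStream;
          }

          @Override
          public int read() throws IOException {
            return inStream.read();            // one byte per underlying read
          }

          @Override
          public int read(byte[] b) throws IOException {
            return inStream.read(b);           // skip() never calls this form
          }

          // Missing: read(byte[] b, int off, int len). Without it, skip(n)
          // runs the inherited read(b, off, len), which calls the single-byte
          // read() in a loop -- the pattern visible in the strace above.
        }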

        Raghu Angadi added a comment -

        Good catch. I noticed this when I was working on HADOOP-2012; I fixed it there, but didn't think it would have this much effect.
        Could you try the following fix?

        Index: src/java/org/apache/hadoop/dfs/FSDatasetInterface.java
        ===================================================================
        --- src/java/org/apache/hadoop/dfs/FSDatasetInterface.java      (revision 601589)
        +++ src/java/org/apache/hadoop/dfs/FSDatasetInterface.java      (working copy)
        @@ -64,8 +64,8 @@
             }
        
             @Override
        -    public int read(byte[] b) throws IOException {
        -      return inStream.read(b);
        +    public int read(byte[] b, int off, int len) throws IOException {
        +      return inStream.read(b, off, len);
             }
        
             public long getLength() {
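
        (The three-argument read() is the method that InputStream.read(byte[]) forwards to and that skip() uses to fill its scratch buffer, so overriding it covers every bulk path through the stream, which would explain why this one-line move cures the one-byte reads.)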
        
        stack added a comment -

        +1

        This patch works for me.

        stack added a comment -

        Raghu's fix as an attached patch.

        stack added a comment -

        Reconfirmed this patch indeed fixes the issue.

        Ran unit tests locally. TestLocalFileSystem and hdfsserver ports failed. Trying Hudson.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12371742/2434.patch
        against trunk revision r604451.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1358/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1358/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1358/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1358/console

        This message is automatically generated.

        stack added a comment -

        Marking this issue a blocker: hbase read performance is abysmal without it.

        dhruba borthakur added a comment -

        I just committed this. Thanks Raghu!

        Hudson added a comment -

        Integrated in Hadoop-Nightly #338 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/338/ )

          People

          • Assignee: stack
          • Reporter: stack
          • Votes: 0
          • Watchers: 0