Hadoop HDFS / HDFS-770

SocketTimeoutException: timeout while waiting for channel to be ready for read

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.20.1
    • Fix Version/s: None
    • Labels: None
    • Environment: Ubuntu Linux 8.04

    Description

      We're having issues with timeouts occurring in our client: for some reason, a timeout of 63000 milliseconds is triggered while writing HDFS data. Since we currently have a single-server setup, this results in our client terminating with an "All datanodes are bad" IOException.

      We're running all services, including the client, on our single server, so it cannot be a network error. The load on the client is extremely low during this period: only a few kilobytes a minute were being written around the time the error occurred.

      After browsing a bit online, I found a lot of people suggesting setting "dfs.datanode.socket.write.timeout" to 0 as a solution for this problem. Given the low load on our system during this period, however, I feel this is a real error and a timeout that should not be occurring. I have attached 3 logs, from the namenode, datanode and client.

      It could be that this is related to http://issues.apache.org/jira/browse/HDFS-693

      Any pointers on how I can assist to resolve this issue will be greatly appreciated.
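      For illustration only, a minimal sketch of how that timeout could be set from client code (assuming 0.20-era APIs; the key is normally placed in hdfs-site.xml instead, a value of 0 disables the timeout entirely, and the class and path names here are made up):

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataOutputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class WriteTimeoutExample {
          public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              // Same key normally set in hdfs-site.xml; value is in milliseconds, 0 = never time out.
              conf.setInt("dfs.datanode.socket.write.timeout", 0);
              FileSystem fs = FileSystem.get(conf);
              // Write a small file so the DFSClient picks up the adjusted timeout.
              FSDataOutputStream out = fs.create(new Path("/tmp/timeout-test"));
              out.writeBytes("hello\n");
              out.close();
              fs.close();
          }
      }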

      Attachments

      1. filewriter.cpp
        0.6 kB
        Zheng Shao
      2. client.txt
        3 kB
        Leon Mergen
      3. namenode.txt
        3 kB
        Leon Mergen
      4. datanode.txt
        4 kB
        Leon Mergen


          Activity

          Todd Lipcon added a comment -

           I also wonder whether it might be HDFS-561. I'll comment there that I think it should be put on the 20 branch.

          Todd Lipcon added a comment -

          I have a suspicion this may be vaguely related to HDFS-1075 - some problems with the way write pipelines are staged.

          Zheng Shao added a comment -

          bad machine:

          /proc/sys/vm/block_dump:0
          /proc/sys/vm/dirty_background_ratio:10
          /proc/sys/vm/dirty_expire_centisecs:3000
          /proc/sys/vm/dirty_ratio:40
          /proc/sys/vm/dirty_writeback_centisecs:500
          /proc/sys/vm/hugetlb_shm_group:0
          /proc/sys/vm/laptop_mode:0
          /proc/sys/vm/legacy_va_layout:0
          /proc/sys/vm/lowmem_reserve_ratio:256   32
          /proc/sys/vm/max_map_count:65536
          /proc/sys/vm/min_free_kbytes:16383
          /proc/sys/vm/nr_hugepages:0
          /proc/sys/vm/nr_pdflush_threads:2
          /proc/sys/vm/overcommit_memory:0
          /proc/sys/vm/overcommit_ratio:50
          /proc/sys/vm/page-cluster:3
          /proc/sys/vm/swappiness:60
          /proc/sys/vm/swap_token_timeout:0       0
          /proc/sys/vm/vfs_cache_pressure:100
          

          good machine:

          /proc/sys/vm/block_dump:0
          /proc/sys/vm/dirty_background_ratio:10
          /proc/sys/vm/dirty_expire_centisecs:2999
          /proc/sys/vm/dirty_ratio:40
          /proc/sys/vm/dirty_writeback_centisecs:499
          /proc/sys/vm/drop_caches:0
          /proc/sys/vm/hugetlb_shm_group:0
          /proc/sys/vm/laptop_mode:0
          /proc/sys/vm/legacy_va_layout:0
          /proc/sys/vm/lowmem_reserve_ratio:256   256
          /proc/sys/vm/max_map_count:65536
          /proc/sys/vm/min_free_kbytes:23004
          /proc/sys/vm/min_slab_ratio:5
          /proc/sys/vm/min_unmapped_ratio:1
          /proc/sys/vm/nr_hugepages:0
          /proc/sys/vm/nr_pdflush_threads:2
          /proc/sys/vm/overcommit_memory:0
          /proc/sys/vm/overcommit_ratio:50
          /proc/sys/vm/page-cluster:3
          /proc/sys/vm/panic_on_oom:0
          /proc/sys/vm/percpu_pagelist_fraction:0
          /proc/sys/vm/swappiness:60
          /proc/sys/vm/vfs_cache_pressure:100
          /proc/sys/vm/zone_reclaim_mode:0
          
          Todd Lipcon added a comment -

          Hey Zheng,

           Any chance you can run "grep . /proc/sys/vm/*" on the system that does show the problem, and compare the results to the one that doesn't? I'm thinking this might just be a matter of system-level tuning. See http://www.westnet.com/~gsmith/content/linux-pdflush.htm

          Todd Lipcon added a comment -

           I assume the _fbk kernel doesn't have wacky patches backported from the last couple of months of kernel development? I'll run your test program on a couple of boxes here as well; thanks for uploading it.

          Zheng Shao added a comment -

          The new machine is running: "Linux hadoop0100.xxx.yyy.com 2.6.20-39_fbk #1 SMP Mon Mar 16 20:33:46 PDT 2009 x86_64 x86_64 x86_64 GNU/Linux" (uname -a) with 12 local disks.
          You can use the attached filewriter.cpp to see if your box has the problem mentioned above.

          Zheng Shao added a comment -

          filewriter.cpp: A simple program to test whether the box will block continuous writes for a long time. See Zheng's earlier comments for details.

          Todd Lipcon added a comment -

          Hi Zheng,

           What kernel is the newer box running? Is it a brand new kernel? 2.6.32 has significant changes to dirty page writeback.

          -Todd

          Zheng Shao added a comment -

           In my case, it seems to be a problem with my box.
           It's running "2.6.12-1.1398_FC4smp #1 SMP Fri Jul 15 01:05:24 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux" (uname -a) with 4 local hard disks.

           I wrote a simple program (tried both Java and C) to write out a huge file, and the writes get blocked frequently after writing out 5~10GB of data. The blocking was sometimes as long as 18 seconds.

           I tried it on a newer box and the problem disappeared. The blocking happens in "FileOutputStream.write" and "fwrite". Each write is 60KB of data.

           I don't think it's a hardware problem, because "iostat -x -n 2" shows that the disk write speed is still non-zero (10-30MB/s) while the blocking happens.
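
           For reference, a rough Java equivalent of the test described above (a hypothetical sketch; the actual test program is the attached filewriter.cpp): write a large file in 60KB chunks and report any single write that blocks for an unusually long time.

           import java.io.FileOutputStream;

           public class WriteStallTest {
               public static void main(String[] args) throws Exception {
                   byte[] buf = new byte[60 * 1024];           // 60KB per write, as described above
                   long total = 20L * 1024 * 1024 * 1024;      // ~20GB, well past the 5-10GB mark
                   FileOutputStream out = new FileOutputStream(args[0]);
                   for (long written = 0; written < total; written += buf.length) {
                       long start = System.currentTimeMillis();
                       out.write(buf);
                       long elapsed = System.currentTimeMillis() - start;
                       if (elapsed > 1000) {                   // report writes blocked for over a second
                           System.out.println("write blocked " + elapsed + " ms after "
                                   + (written >> 20) + " MB");
                       }
                   }
                   out.close();
               }
           }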

          Leon Mergen added a comment -

           For what it's worth, the timeout problems I was having are gone after I set "dfs.datanode.socket.write.timeout" to 630000, so it definitely appears as if the DN went to sleep for some reason.

          Zheng Shao added a comment -

          I saw the same thing with hadoop 0.19 while doing heavy-weight writes.

           09/10/29 01:05:18 WARN hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 30000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/aa.bb.cc.dd:55040 remote=/ee.ff.gg.hh:50010]
                  at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
                  at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
                  at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
                  at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
                  at java.io.DataOutputStream.write(DataOutputStream.java:90)
                  at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2323)
          
          Raghu Angadi added a comment -

          From the datanode log :

          > 2009-11-13 06:18:21,965 DEBUG org.apache.hadoop.ipc.RPC: Call: sendHeartbeat 14
          > 2009-11-13 06:19:38,081 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to dfs.hadoop.tsukku.solatis/127.0.0.1:9000 from hadoop: closed

           Note that there is no activity on the DataNode for 77 seconds. There are a number of possibilities, a common one being GC, though we haven't seen GC take this long on a DN.

           Assuming the DN went to sleep for some reason, the rest of the behaviour is expected. If you do expect such delays, what you need to increase is the read timeout for the "responder thread" in DFSOutputStream (there is a config for the generic read timeout that applies to sockets in many contexts).
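
           As a hypothetical sketch of that suggestion (assuming the 0.20-era keys; "dfs.socket.timeout" is, as far as I know, the generic read timeout referred to, in milliseconds, and the 3-minute value is purely illustrative):

           import org.apache.hadoop.conf.Configuration;
           import org.apache.hadoop.fs.FileSystem;

           public class TimeoutTuningExample {
               public static void main(String[] args) throws Exception {
                   Configuration conf = new Configuration();
                   conf.setInt("dfs.socket.timeout", 180 * 1000);                 // generic socket read timeout
                   conf.setInt("dfs.datanode.socket.write.timeout", 180 * 1000);  // matching write-side timeout
                   FileSystem fs = FileSystem.get(conf);
                   // ... create and write streams as usual; the DFSClient uses both values for its pipeline sockets.
                   fs.close();
               }
           }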


            People

             • Assignee: Unassigned
             • Reporter: Leon Mergen
             • Votes: 3
             • Watchers: 14
