Hadoop HDFS / HDFS-1203

DataNode should sleep before reentering service loop after an exception

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When the DN gets an exception in response to a heartbeat, it logs the exception and continues, but it does not sleep before retrying. I've occasionally seen bugs produce a case where every heartbeat produces an exception, so the DN floods the NN with bad heartbeats. Adding a 1 second sleep at least throttles the error messages, which makes debugging and error isolation easier.
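      The idea in the description can be sketched roughly as follows. This is a minimal illustration only, not the attached patch and not the real DataNode code: the class `HeartbeatLoopSketch`, the `Heartbeat` interface, and the `runLoop` method are hypothetical names invented for this example. Only the 1 second pause after a failed heartbeat comes from the description above.

```java
// Hypothetical sketch: none of these names come from the Hadoop code base.
final class HeartbeatLoopSketch {

    /** Stand-in for whatever the service loop does on each pass. */
    interface Heartbeat {
        void send() throws Exception;
    }

    /** The 1 second pause mentioned in the issue description. */
    static final long ERROR_BACKOFF_MS = 1000;

    /**
     * Runs the loop for a fixed number of passes (a real service loop would
     * run until shutdown). After any exception it logs the failure and sleeps
     * for backoffMs before re-entering the loop, so a persistently failing
     * heartbeat cannot spin at full speed. Returns the number of failed passes.
     */
    static int runLoop(Heartbeat heartbeat, int maxIterations, long backoffMs)
            throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < maxIterations; i++) {
            try {
                heartbeat.send();
            } catch (Exception e) {
                failures++;
                System.err.println("Heartbeat failed: " + e);
                // Throttle retries instead of immediately looping again.
                Thread.sleep(backoffMs);
            }
        }
        return failures;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Fails on the first two passes, then succeeds.
        Heartbeat flaky = () -> {
            if (calls[0]++ < 2) {
                throw new java.io.IOException("simulated bad heartbeat");
            }
        };
        int failures = runLoop(flaky, 3, ERROR_BACKOFF_MS);
        System.out.println("failed passes: " + failures);
    }
}
```

      Taking the backoff as a parameter is a choice made for this sketch so that it can be exercised quickly; the description only asks for a fixed 1 second sleep.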

      1. hdfs-1203.txt
        0.7 kB
        Todd Lipcon
      2. hdfs-1203.txt
        0.6 kB
        Todd Lipcon

        Activity

        Konstantin Shvachko made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
        Todd Lipcon made changes -
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Resolution: (none) -> Fixed [ 1 ]
        Todd Lipcon made changes -
          Attachment: (none) -> hdfs-1203.txt [ 12453153 ]
        Jakob Homan made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Jakob Homan made changes -
          Status: Patch Available [ 10002 ] -> Open [ 1 ]
        Jakob Homan made changes -
          Issue Type: Bug [ 1 ] -> Improvement [ 4 ]
          Fix Version/s: (none) -> 0.22.0 [ 12314241 ]
          Hadoop Flags: (none) -> [Reviewed]
        Todd Lipcon made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Todd Lipcon made changes -
          Attachment: (none) -> hdfs-1203.txt [ 12446925 ]
        Todd Lipcon created issue -

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved:
