Hadoop HDFS / HDFS-767

Job failure due to BlockMissingException

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If a block is requested by too many mappers/reducers (say, 3000) at the same time, a BlockMissingException is thrown because the requests exceed the upper limit (I think 256 by default) on the number of threads accessing the same block at the same time. The DFSClient will catch that exception and retry 3 times, waiting 3 seconds before each retry. Since the wait time is a fixed value, a lot of clients will retry at about the same time and a large portion of them will fail again. Counting the initial attempt and the 3 retries, only about 256*4 = 1024 clients get the block. If the number of clients is larger than that, the job will fail.
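
      The arithmetic above can be sketched in a few lines of Java. This is only a rough model under the description's own assumptions (each round of attempts admits at most the per-block limit, taken as 256, and every client gets one initial attempt plus 3 retries). The class and method names are hypothetical and are not part of DFSClient. The jitteredWaitMillis helper illustrates one generic way to stop clients from retrying in lockstep; it is an assumption for illustration and is not taken from the attached patches.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical back-of-the-envelope model of the retry behaviour described above. */
final class RetryCapacity {

    /**
     * Upper bound on clients that can obtain the block if every round of
     * attempts admits at most perRoundLimit readers and all clients retry
     * at the same moments.
     */
    static long maxClientsServed(int perRoundLimit, int retries) {
        // One initial attempt plus `retries` further attempts.
        return (long) perRoundLimit * (retries + 1);
    }

    /** True if every client can be served before the retries run out. */
    static boolean jobSurvives(long clients, int perRoundLimit, int retries) {
        return clients <= maxClientsServed(perRoundLimit, retries);
    }

    /**
     * Illustrative randomized wait (an assumption, not the actual fix):
     * picks a delay uniformly in [0, baseMillis * attempt), so the window
     * widens with each failed attempt and clients spread out over time.
     * `attempt` is 1 for the first retry.
     */
    static long jitteredWaitMillis(long baseMillis, int attempt) {
        if (baseMillis <= 0 || attempt < 1) {
            throw new IllegalArgumentException("baseMillis and attempt must be positive");
        }
        return ThreadLocalRandom.current().nextLong(baseMillis * attempt);
    }

    public static void main(String[] args) {
        System.out.println(maxClientsServed(256, 3));   // prints 1024
        System.out.println(jobSurvives(3000, 256, 3));  // prints false
        System.out.println(jobSurvives(1000, 256, 3));  // prints true
    }
}
```

      With a fixed 3-second wait every client lands in the same few retry rounds, which is why the bound is a simple product; a randomized wait trades a slightly longer worst-case delay for a much lower chance that all clients collide again.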

      1. HDFS-767_4.txt
        13 kB
        dhruba borthakur
      2. HDFS-767_3.patch
        13 kB
        Ning Zhang
      3. HDFS-767_2.patch
        12 kB
        Ning Zhang
      4. HDFS-767.patch
        12 kB
        Ning Zhang

        Activity

        Ning Zhang created issue -
        Ning Zhang made changes -
        Field Original Value New Value
        Attachment HDFS-767.patch [ 12425556 ]
        dhruba borthakur made changes -
        Assignee Ning Zhang [ nzhang ]
        He Yongqiang made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ning Zhang made changes -
        Attachment HDFS-767_2.patch [ 12428065 ]
        Ning Zhang made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Ning Zhang made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ning Zhang made changes -
        Status Patch Available [ 10002 ] In Progress [ 3 ]
        Ning Zhang made changes -
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        dhruba borthakur made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Ning Zhang made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Ning Zhang made changes -
        Attachment HDFS-767_3.patch [ 12428890 ]
        Ning Zhang made changes -
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        dhruba borthakur made changes -
        Attachment HDFS-767_4.txt [ 12429024 ]
        dhruba borthakur made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.22.0 [ 12314241 ]
        Resolution Fixed [ 1 ]
        Tom White made changes -
        Fix Version/s 0.21.0 [ 12314046 ]
        Fix Version/s 0.22.0 [ 12314241 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Ning Zhang
            Reporter:
            Ning Zhang
          • Votes:
            0
            Watchers:
            6
