Hadoop HDFS / HDFS-12943 Consistent Reads from Standby Node / HDFS-13924

Handle BlockMissingException when reading from observer

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: HDFS-12943, 3.3.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Internally, we found that reading from the ObserverNode may result in a BlockMissingException. This can happen when the observer sees fewer DNs than the active NameNode does (possibly because of communication issues with those DNs) or, we suspect, when block reports from some DNs reach the observer late. The error is thrown in DFSInputStream#chooseDataNode when no valid DN can be found for the LocatedBlock returned by the NN.

      One potential solution (although a little hacky) is to have DFSInputStream retry against the active NameNode when this happens. The retry logic is already present in the code; we just have to dynamically set a flag that tells the ObserverReadProxyProvider to try the active in this case.
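
      The idea can be illustrated with a minimal, self-contained sketch. It does not use the real HDFS classes and is not the code from the attached patches: NameNode, ReadProxy, NoUsableLocationException, retryOnActive and the one-shot behaviour of the flag are all hypothetical stand-ins, chosen only to show a reader that sets a flag on the proxy and retries once against the active when the observer's locations are unusable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these types are the real HDFS classes. */
final class ObserverFallbackSketch {

    /** Stand-in for BlockMissingException: no usable DataNode for a block. */
    static final class NoUsableLocationException extends RuntimeException {
        NoUsableLocationException(String block) {
            super("no usable DataNode for " + block);
        }
    }

    /** Stand-in for a NameNode answering "which DataNodes hold this block?". */
    interface NameNode {
        List<String> locate(String block);
    }

    /** Stand-in for the proxy provider: uses the observer unless flagged. */
    static final class ReadProxy {
        private final NameNode observer;
        private final NameNode active;
        private boolean forceActive;                    // set dynamically by the reader
        final List<String> log = new ArrayList<>();     // which NameNode served each call

        ReadProxy(NameNode observer, NameNode active) {
            this.observer = observer;
            this.active = active;
        }

        /** Ask that the next call go to the active (assumed one-shot). */
        void retryOnActive() {
            forceActive = true;
        }

        List<String> locate(String block) {
            boolean toActive = forceActive;
            forceActive = false;                        // later reads use the observer again
            log.add(toActive ? "active" : "observer");
            return (toActive ? active : observer).locate(block);
        }
    }

    /** Picks the first location, or fails if the NameNode reported none. */
    static String chooseDataNode(List<String> locations, String block) {
        if (locations.isEmpty()) {
            throw new NoUsableLocationException(block);
        }
        return locations.get(0);
    }

    /** Reads via the observer first; on failure, retries once via the active. */
    static String read(ReadProxy proxy, String block) {
        try {
            return chooseDataNode(proxy.locate(block), block);
        } catch (NoUsableLocationException e) {
            proxy.retryOnActive();
            return chooseDataNode(proxy.locate(block), block);
        }
    }

    public static void main(String[] args) {
        NameNode observer = b -> Collections.emptyList();      // observer lags behind
        NameNode active = b -> Arrays.asList("dn1", "dn2");    // active knows the replicas
        ReadProxy proxy = new ReadProxy(observer, active);
        System.out.println(read(proxy, "blk_1") + " via " + proxy.log);
    }
}
```

      In this sketch the retry is bounded to a single attempt, so a block that is missing on the active as well still surfaces the exception to the caller.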

      cc shv, xkrogen, vagarychen, zero45 for discussion.

      Attachments

        1. HDFS-13924-HDFS-12943.000.patch
          8 kB
          Chao Sun
        2. HDFS-13924-HDFS-12943.001.patch
          8 kB
          Chao Sun
        3. HDFS-13924-HDFS-12943.002.patch
          10 kB
          Chao Sun
        4. HDFS-13924-HDFS-12943.003.patch
          10 kB
          Chao Sun
        5. HDFS-13924-HDFS-12943.004.patch
          11 kB
          Erik Krogen

          People

            Assignee: Chao Sun (csun)
            Reporter: Chao Sun (csun)
            Votes: 0
            Watchers: 5
