Hadoop HDFS · HDFS-826

Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline


Details

    • Flags: Reviewed
    • Labels: hbase

    Description

      HDFS does not replicate the last block of a file that is currently being written by an application. Every datanode death in the write pipeline therefore decreases the reliability of that last block. This situation could be improved if the application were notified of a datanode death in the write pipeline; the application could then decide the right course of action to take on this event.

      In our use-case, the application would close the file on the first datanode death and start writing to a newly created file. This keeps the effective replication of every block close to 3 at all times.

      One idea is to make DFSOutputStream.write() throw an exception if the number of datanodes in the write pipeline falls below the minimum.replication.factor configured on the client (this is backward compatible).
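      The close-and-roll behavior described in the use-case above can be sketched with in-memory stand-ins. This is an illustrative sketch only, not the committed patch: RollingWriter and livePipelineNodes are hypothetical names, and a plain IntSupplier stands in for however the real client would expose the count of live datanodes in the pipeline.

      ```java
      import java.util.ArrayList;
      import java.util.List;
      import java.util.function.IntSupplier;

      /**
       * Sketch of the application-side policy: when the write pipeline has lost a
       * datanode (live replicas < client-side minimum replication), close the
       * current file and continue in a freshly created one, whose new pipeline is
       * again fully replicated. In-memory StringBuilders stand in for HDFS files.
       */
      public class RollingWriter {
          private final int minReplication;
          private final IntSupplier livePipelineNodes; // stand-in for the real pipeline-size query
          private final List<StringBuilder> files = new ArrayList<>();
          private StringBuilder current;

          RollingWriter(int minReplication, IntSupplier livePipelineNodes) {
              this.minReplication = minReplication;
              this.livePipelineNodes = livePipelineNodes;
              roll();
          }

          /** Close the current file and start a new one (new file => new, full pipeline). */
          private void roll() {
              current = new StringBuilder();
              files.add(current);
          }

          void write(String record) {
              // Check the pipeline before appending, mirroring the proposal that the
              // client surfaces datanode deaths instead of silently writing on.
              if (livePipelineNodes.getAsInt() < minReplication) {
                  roll();
              }
              current.append(record);
          }

          int fileCount() {
              return files.size();
          }
      }
      ```

      With minReplication = 3, a pipeline that drops from 3 live datanodes to 2 causes the next write() to roll to a second file, which is the reliability guarantee the description asks for.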

      Attachments

        1. HDFS-826.20-security.1.patch
          5 kB
          Jitendra Nath Pandey
        2. HDFS-826-0.20.patch
          7 kB
          Nicolas Spiegelberg
        3. HDFS-826-0.20-v2.patch
          5 kB
          Michael Stack
        4. Replicable4.txt
          5 kB
          Dhruba Borthakur
        5. ReplicableHdfs.txt
          3 kB
          Dhruba Borthakur
        6. ReplicableHdfs2.txt
          5 kB
          Dhruba Borthakur
        7. ReplicableHdfs3.txt
          5 kB
          Dhruba Borthakur


            People

              Assignee: Dhruba Borthakur
              Reporter: Dhruba Borthakur
              Votes: 0
              Watchers: 16
