  1. Hadoop HDFS
  2. HDFS-8031 Follow-on work for erasure coding phase I (striping layout)
  3. HDFS-8704

Erasure Coding: client fails to write large file when one datanode fails

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce

    Description

      I tested the current code on a 5-node cluster using RS(3,2). When one datanode is corrupt, the client succeeds in writing a file smaller than a block group but fails to write a file larger than one. TestDFSStripeOutputStreamWithFailure only tests files smaller than a block group; this JIRA will add more test cases.
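To make the size threshold concrete, the following rough sketch computes how many block groups a file spans under RS(3,2). The 128 MB block size, the class name BlockGroupMath, and its method names are assumptions for illustration only; they are not taken from this issue or from the HDFS code base.

```java
public class BlockGroupMath {
    // RS(3,2): 3 data blocks and 2 parity blocks per block group.
    static final int DATA_BLOCKS = 3;
    static final int PARITY_BLOCKS = 2;

    /** Bytes of user data that fit in one block group (parity excluded). */
    static long blockGroupDataSize(long blockSize) {
        return DATA_BLOCKS * blockSize;
    }

    /** Number of block groups needed for fileSize bytes (ceiling division). */
    static long blockGroupsNeeded(long fileSize, long blockSize) {
        long groupSize = blockGroupDataSize(blockSize);
        return (fileSize + groupSize - 1) / groupSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed block size
        long groupSize = blockGroupDataSize(blockSize);
        // A file just under one block group stays within a single group.
        System.out.println(blockGroupsNeeded(groupSize - 1, blockSize));
        // One byte more than a block group forces a second group.
        System.out.println(blockGroupsNeeded(groupSize + 1, blockSize));
    }
}
```

Under these assumptions, a file crosses into a second block group as soon as it exceeds three blocks of data, which is the case the existing test does not cover.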

      A streamer may encounter bad datanodes while writing the blocks allocated to it. When it fails to connect to a datanode or to send a packet, the streamer needs to prepare for the next block. First, it removes the packets of the current block from its data queue. If the first packet of the next block is already in the data queue, the streamer resets its state and starts waiting for the next block to be allocated to it; otherwise, it just waits for the first packet of the next block. While waiting, the streamer periodically checks whether it has been asked to terminate.
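The recovery steps described above can be sketched as follows. This is a simplified, single-threaded model and not the code from the attached patches: StreamerFailureSketch, Packet, NextStep, and every method name are hypothetical, and a real streamer would need proper synchronization around its data queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

public class StreamerFailureSketch {
    /** Hypothetical stand-in for a data packet. */
    static final class Packet {
        final int blockIndex;       // block this packet belongs to
        final boolean firstInBlock; // true for the first packet of a block
        Packet(int blockIndex, boolean firstInBlock) {
            this.blockIndex = blockIndex;
            this.firstInBlock = firstInBlock;
        }
    }

    enum NextStep { WAIT_FOR_NEXT_BLOCK, WAIT_FOR_FIRST_PACKET, TERMINATED }

    private final Deque<Packet> dataQueue = new ArrayDeque<>();
    private volatile boolean terminateRequested = false;
    private int currentBlock = 0;

    void enqueue(Packet p) { dataQueue.addLast(p); }
    void requestTerminate() { terminateRequested = true; }
    int queuedPackets() { return dataQueue.size(); }

    /** Called when connecting to the datanode or sending a packet fails. */
    NextStep onDatanodeFailure() {
        if (terminateRequested) {
            return NextStep.TERMINATED;
        }
        // Step 1: drop every queued packet of the failed (current) block.
        final int failed = currentBlock;
        dataQueue.removeIf(p -> p.blockIndex == failed);

        // Step 2: check whether the next block's first packet is already queued.
        final int next = failed + 1;
        boolean nextStarted = false;
        for (Packet p : dataQueue) {
            if (p.blockIndex == next && p.firstInBlock) {
                nextStarted = true;
                break;
            }
        }
        currentBlock = next;
        return nextStarted ? NextStep.WAIT_FOR_NEXT_BLOCK
                           : NextStep.WAIT_FOR_FIRST_PACKET;
    }

    /**
     * Polls until ready reports true. Returns false if termination was
     * requested (or the thread was interrupted) before that happened.
     */
    boolean waitFor(BooleanSupplier ready, long pollMillis) {
        while (!ready.getAsBoolean()) {
            if (terminateRequested) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        StreamerFailureSketch s = new StreamerFailureSketch();
        s.enqueue(new Packet(0, true));
        s.enqueue(new Packet(0, false));
        s.enqueue(new Packet(1, true));
        System.out.println(s.onDatanodeFailure());
        System.out.println(s.queuedPackets());
    }
}
```

In this sketch the two waiting cases are only reported as a NextStep value; the caller would then pass the matching condition to waitFor, which is where the periodic termination check takes place.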

      Attachments

        1. HDFS-8704-HDFS-7285-008.patch
          35 kB
          Li Bo
        2. HDFS-8704-HDFS-7285-007.patch
          34 kB
          Li Bo
        3. HDFS-8704-HDFS-7285-006.patch
          24 kB
          Li Bo
        4. HDFS-8704-HDFS-7285-005.patch
          26 kB
          Li Bo
        5. HDFS-8704-HDFS-7285-004.patch
          26 kB
          Li Bo
        6. HDFS-8704-HDFS-7285-003.patch
          28 kB
          Li Bo
        7. HDFS-8704-HDFS-7285-002.patch
          28 kB
          Li Bo
        8. HDFS-8704-000.patch
          5 kB
          Li Bo

          People

            libo-intel Li Bo