Hadoop HDFS / HDFS-10775

Under-Replicated Blocks cannot be recovered


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: None
    • Component/s: erasure-coding
    • Labels: None
    • Environment: 2 NameNodes, 5 DataNodes; erasure coding policy set to "RS-DEFAULT-3-2-64k"

    Description

      I killed a DataNode while EC files were being written. Under-replicated blocks appeared, but they were never recovered.

      DataNodes: datanode[1-5]
      Rack awareness: not set
      Copy target files: /tmp/tpcds-generate/25/store_sales/*

      $ hdfs dfs -ls /tmp/tpcds-generate/25/store_sales
      Found 25 items
      -rw-r--r--   0 root supergroup  399430918 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00000
      -rw-r--r--   0 root supergroup  399054598 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00001
      -rw-r--r--   0 root supergroup  399329373 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00002
      -rw-r--r--   0 root supergroup  399528459 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00003
      -rw-r--r--   0 root supergroup  399329624 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00004
      -rw-r--r--   0 root supergroup  399085924 2016-08-16 15:11 /tmp/tpcds-generate/25/store_sales/data-m-00005
      -rw-r--r--   0 root supergroup  399337384 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00006
      -rw-r--r--   0 root supergroup  399199458 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00007
      -rw-r--r--   0 root supergroup  399679096 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00008
      -rw-r--r--   0 root supergroup  399440431 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00009
      -rw-r--r--   0 root supergroup  399403931 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00010
      -rw-r--r--   0 root supergroup  399472465 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00011
      -rw-r--r--   0 root supergroup  399451784 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00012
      -rw-r--r--   0 root supergroup  399240168 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00013
      -rw-r--r--   0 root supergroup  399370507 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00014
      -rw-r--r--   0 root supergroup  399633351 2016-08-16 15:12 /tmp/tpcds-generate/25/store_sales/data-m-00015
      -rw-r--r--   0 root supergroup  396532952 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00016
      -rw-r--r--   0 root supergroup  396258715 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00017
      -rw-r--r--   0 root supergroup  396382486 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00018
      -rw-r--r--   0 root supergroup  399016456 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00019
      -rw-r--r--   0 root supergroup  399465745 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00020
      -rw-r--r--   0 root supergroup  399208235 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00021
      -rw-r--r--   0 root supergroup  399198296 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00022
      -rw-r--r--   0 root supergroup  399599711 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00023
      -rw-r--r--   0 root supergroup  395150855 2016-08-16 15:13 /tmp/tpcds-generate/25/store_sales/data-m-00024
      

      Destination directory: /tmp/tpcds-generate/test

      $ sudo -u hdfs hdfs erasurecode -getPolicy /tmp/tpcds-generate/test
      ErasureCodingPolicy=[Name=RS-DEFAULT-3-2-64k, Schema=[ECSchema=[Codec=rs-default, numDataUnits=3, numParityUnits=2]], CellSize=65536 ]
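
      To make the policy above concrete, here is a minimal sketch of the striping arithmetic for one of the source files. The policy values come from the -getPolicy output, the DataNode count from the environment, and the file size from the listing. The "spare DataNodes" line rests on an assumption that is not stated in this report: that each internal block of a block group is placed on a different DataNode. Under that assumption a 3+2 layout on 5 DataNodes leaves no spare node, so a block lost on datanode1 would have no other target until datanode1 itself returns.

```shell
#!/usr/bin/env bash
# Striping arithmetic for RS-DEFAULT-3-2-64k.
# Policy values are taken from the -getPolicy output; nothing here talks to a cluster.
data_units=3
parity_units=2
cell_size=65536
datanodes=5                 # from the Environment field
file_size=399430918         # data-m-00000 in the listing

stripe_bytes=$((data_units * cell_size))        # user data held by one full stripe
full_stripes=$((file_size / stripe_bytes))      # stripes that are completely filled
tail_bytes=$((file_size % stripe_bytes))        # data left over for the last, partial stripe

group_width=$((data_units + parity_units))      # internal blocks per block group
# ASSUMPTION: one internal block per DataNode, so spare = nodes not used by a group.
spare_nodes=$((datanodes - group_width))
live_after_kill=$((datanodes - 1))              # datanode1 killed

echo "bytes per full stripe:     $stripe_bytes"
echo "full stripes:              $full_stripes"
echo "tail bytes:                $tail_bytes"
echo "block group width:         $group_width"
echo "spare DataNodes:           $spare_nodes"
echo "live DataNodes after kill: $live_after_kill (need $data_units to read, $group_width for full redundancy)"
```

      With one DataNode down, 4 of the 5 units in each block group can still be written, which is enough to read the data back but one short of full redundancy.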
      

      Steps to reproduce:
      1) hdfs dfs -cp /tmp/tpcds-generate/25/store_sales/* /tmp/tpcds-generate/test
      2) datanode1: while the copy is still running, sudo pkill -9 -f datanode
      3) datanode1: start the DataNode process again two minutes later
      4) wait for a while; the under-replicated blocks are not recovered
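
      The steps above can be sketched as a script. This is only an illustration under assumptions that are not part of the report: passwordless ssh to a host named datanode1, `hdfs --daemon start datanode` as the restart command, a 30-second delay before the kill so that it lands mid-copy, and `hdfs dfsadmin -report` as the way to observe the block counts (the exact label in its output may differ between versions). By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. Hostname, delays, restart command and the
# final check are ASSUMPTIONS for illustration, not taken from the report.
set -u

SRC='/tmp/tpcds-generate/25/store_sales/*'
DST='/tmp/tpcds-generate/test'
DN_HOST="${DN_HOST:-datanode1}"
DRY_RUN="${DRY_RUN:-1}"          # 1 = print only, 0 = really run the commands

run() {
  # Print the command; execute it in the current shell only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then eval "$*"; fi
}

reproduce() {
  run "hdfs dfs -cp $SRC $DST &"                                  # step 1, in the background
  run "sleep 30"                                                  # assumed delay: kill mid-copy
  run "ssh $DN_HOST 'sudo pkill -9 -f datanode'"                  # step 2
  run "sleep 120"                                                 # step 3: two minutes later
  run "ssh $DN_HOST 'sudo -u hdfs hdfs --daemon start datanode'"  # step 3: restart
  run "wait"                                                      # let the copy finish
  run "sudo -u hdfs hdfs dfsadmin -report | grep -i -E 'under replicated|low redundancy'"  # step 4
}

reproduce
```

      Re-running the last command at intervals shows whether the reported count ever drops back to zero.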

            People

              aihuaxu Aihua Xu
              ademu Eisuke Umeda