Hadoop HDFS / HDFS-8486

DN startup may cause severe data loss

Details

      Public service notice:
      * Every restart of a 2.6.x or 2.7.0 DN incurs a risk of unwanted block deletion.
      * Apply this patch if you are running a pre-2.7.1 release.

    Description

      A race condition between block pool initialization and the directory scanner may cause a mass deletion of blocks in multiple storages.

      If block pool initialization finds a block on disk that is already in the replica map, it deletes one of the two replicas, chosen by size, generation stamp (GS), etc. Unfortunately, it always deletes one of them even when both are identical, so the replica map must be empty when the pool is initialized.
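
      The duplicate handling described above can be sketched as a small toy model. This is not Hadoop's code: the `Replica` class, the `selectReplicaToDelete` helper, the preference order (higher GS first, then larger size), and the reading of "identical" as "same on-disk path" are all assumptions made for illustration. The sketch only shows the shape of a guard that declines to delete anything when both entries point at the same file.

```java
// Toy model of duplicate-replica resolution. All names are hypothetical
// and do not correspond to Hadoop's actual classes or methods.
final class Replica {
    final long blockId;
    final long genStamp;   // generation stamp (GS)
    final long numBytes;   // replica size on disk
    final String path;     // on-disk location of the block file

    Replica(long blockId, long genStamp, long numBytes, String path) {
        this.blockId = blockId;
        this.genStamp = genStamp;
        this.numBytes = numBytes;
        this.path = path;
    }
}

final class ReplicaDedupSketch {
    /**
     * Returns the replica that should be deleted, or null when both
     * entries refer to the same on-disk file and nothing may be deleted.
     */
    static Replica selectReplicaToDelete(Replica inMap, Replica onDisk) {
        // Guard: the "duplicate" is the very same file, so deleting
        // either entry would destroy the only copy on this storage.
        if (inMap.path.equals(onDisk.path)) {
            return null;
        }
        // Assumed preference order: keep the higher generation stamp.
        if (inMap.genStamp != onDisk.genStamp) {
            return inMap.genStamp > onDisk.genStamp ? onDisk : inMap;
        }
        // Then keep the larger replica.
        if (inMap.numBytes != onDisk.numBytes) {
            return inMap.numBytes > onDisk.numBytes ? onDisk : inMap;
        }
        // Distinct files with equal GS and size: drop the newly found one
        // (an arbitrary tie-break chosen for this sketch).
        return onDisk;
    }
}
```

      Without the first check, the function would always return one of its two arguments, which matches the behavior the description calls out as the problem.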

      The directory scanner starts at a random time within its periodic interval (default 6h). If the scanner starts very early, it races block pool initialization to populate the replica map, and initialization then erroneously deletes blocks.
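
      The timing window can be illustrated with a minimal sketch. The class and method names below are hypothetical, and the uniform choice of start delay is an assumption about what "random time within its periodic interval" means; only the 6h default comes from the description.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Toy model of the scanner start delay. Names are hypothetical and are
// not taken from Hadoop's DirectoryScanner.
final class ScannerStartSketch {
    // Default scan interval stated in the description: 6 hours.
    static final long DEFAULT_INTERVAL_MS = TimeUnit.HOURS.toMillis(6);

    /** Picks a start delay uniformly in [0, intervalMs); intervalMs must be positive. */
    static long randomStartDelayMs(long intervalMs) {
        return ThreadLocalRandom.current().nextLong(intervalMs);
    }

    /**
     * True when the scanner would begin while block pool initialization
     * is still running, i.e. the two can race on the replica map.
     */
    static boolean scannerCanRaceInit(long startDelayMs, long initDurationMs) {
        return startDelayMs < initDurationMs;
    }
}
```

      Under these assumptions, the chance of hitting the window on a given restart is roughly the initialization time divided by the interval. For a purely hypothetical 60-second initialization that is 60 / 21600, or about 0.28%, which is small per restart but adds up across many DataNodes and restarts.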

      Attachments

        1. HDFS-8486.patch
          6 kB
          Daryn Sharp
        2. HDFS-8486.patch
          6 kB
          Daryn Sharp
        3. HDFS-8486-branch-2.6.patch
          6 kB
          Arpit Agarwal
        4. HDFS-8486-branch-2.6.02.patch
          6 kB
          Arpit Agarwal
        5. HDFS-8486-branch-2.6.addendum.patch
          0.7 kB
          Arpit Agarwal

            People

              daryn Daryn Sharp