HDFS-503 (Hadoop HDFS)

Implement erasure coding as a layer on HDFS

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: contrib/raid
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: This patch implements an optional layer over HDFS that implements offline erasure-coding. It can be used to reduce the total storage requirements of DFS.

    Description

      The goal of this JIRA is to discuss how the cost of raw storage for an HDFS file system can be reduced. Keeping three copies of the same data is very costly, especially when the total amount of storage is huge. One idea is to reduce the replication factor and erasure-code a set of blocks so that the overall probability of losing a block remains the same as before.
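
The trade-off described above can be put into rough numbers. The following is a minimal sketch, not the method used by the patch: it assumes blocks are lost independently, that the code is an ideal erasure code able to rebuild a stripe from any `data_blocks` surviving blocks, and that every block in a stripe is stored with the same number of replicas. The function names, the stripe shape (10 data blocks plus 1 parity block), and the per-replica loss probability are all hypothetical values chosen for illustration.

```python
from math import comb


def raw_storage_per_byte(data_blocks: int, parity_blocks: int, replicas: int) -> float:
    """Raw bytes stored per logical byte for a stripe of data plus parity blocks."""
    if data_blocks < 1 or parity_blocks < 0 or replicas < 1:
        raise ValueError("need data_blocks >= 1, parity_blocks >= 0, replicas >= 1")
    return (data_blocks + parity_blocks) * replicas / data_blocks


def stripe_loss_probability(data_blocks: int, parity_blocks: int, block_loss: float) -> float:
    """Probability that a stripe cannot be rebuilt.

    Assumes an ideal erasure code: the stripe survives as long as at most
    `parity_blocks` of its blocks are lost, and each block is lost
    independently with probability `block_loss`.
    """
    total = data_blocks + parity_blocks
    return sum(
        comb(total, lost) * block_loss**lost * (1 - block_loss) ** (total - lost)
        for lost in range(parity_blocks + 1, total + 1)
    )


if __name__ == "__main__":
    replica_loss = 0.01  # assumed probability of losing one replica (illustrative)

    # Plain three-way replication: a block is gone only if all 3 replicas are gone.
    plain_storage = raw_storage_per_byte(1, 0, 3)
    plain_loss = stripe_loss_probability(1, 0, replica_loss**3)

    # Hypothetical layout: replication factor 2, one parity block per 10 data blocks.
    coded_storage = raw_storage_per_byte(10, 1, 2)
    coded_loss = stripe_loss_probability(10, 1, replica_loss**2)

    print(f"3x replication : storage {plain_storage:.2f}x, loss {plain_loss:.3e}")
    print(f"2x + 1 parity  : storage {coded_storage:.2f}x, loss {coded_loss:.3e}")
```

Note that the two loss figures are not measured on quite the same unit (a single block versus a whole stripe), so the comparison is only indicative. Real failures are also correlated across disks and racks, which this independence assumption ignores.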

      Many forms of error-correcting codes are available (see http://en.wikipedia.org/wiki/Erasure_code). Recent research from CMU has also described DiskReduce (https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt).
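
As a concrete illustration of what an erasure code does, here is a minimal sketch of the simplest one, single XOR parity, which can rebuild any one missing block in a stripe. This is an assumption made for illustration only: nothing in this description says which code the layer should use, and the function names below are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of a non-empty list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte position at a time.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the parity block."""
    return xor_blocks(list(surviving) + [parity])


if __name__ == "__main__":
    stripe = [b"abcd", b"efgh", b"ijkl"]  # toy data blocks
    parity = xor_blocks(stripe)

    # Pretend the middle block was lost and rebuild it from the rest.
    rebuilt = recover_missing([stripe[0], stripe[2]], parity)
    print(rebuilt == stripe[1])
```

Single parity tolerates only one lost block per stripe; codes such as Reed-Solomon tolerate more at the cost of extra parity blocks and more computation.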

      I would like to discuss implementation strategies that are not part of base HDFS but are instead a layer on top of HDFS.

      Attachments

        1. raid1.txt
          162 kB
          Dhruba Borthakur
        2. raid2.txt
          174 kB
          Dhruba Borthakur

          People

            Assignee: Dhruba Borthakur (dhruba)
            Reporter: Dhruba Borthakur (dhruba)
            Votes: 0
            Watchers: 39
