Hadoop HDFS / HDFS-503

Implement erasure coding as a layer on HDFS

    Details

• Type: New Feature
    • Status: Closed
• Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: contrib/raid
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
This patch implements an optional layer over HDFS that implements offline erasure-coding. It can be used to reduce the total storage requirements of DFS.

      Description

The goal of this JIRA is to discuss how the cost of raw storage for an HDFS file system can be reduced. Keeping three copies of the same data is very costly, especially when the size of storage is huge. One idea is to reduce the replication factor and do erasure coding of a set of blocks so that the overall probability of failure of a block remains the same as before.

Many forms of error-correcting codes are available; see http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has described DiskReduce: https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.

My opinion is that we should discuss implementation strategies that are not part of base HDFS, but instead form a layer on top of HDFS.

Attachments

1. raid1.txt (162 kB, dhruba borthakur)
2. raid2.txt (174 kB, dhruba borthakur)


          Activity

          Hong Tang added a comment -

          As a reference, FAST 09 has a paper that benchmarks the performance of various open source erasure coding implementations: http://www.cs.utk.edu/~plank/plank/papers/FAST-2009.html.

          dhruba borthakur added a comment -

@Hong: does somebody have an online copy of the paper that we can reference?

          Hong Tang added a comment -

          Yes, there is a link to the pdf version of the paper: http://www.cs.utk.edu/~plank/plank/papers/FAST-2009.pdf.

          dhruba borthakur added a comment -

          Here is a preliminary version of implementing Erasure coding in HDFS.

This package implements a Distributed Raid File System. It is used along with
an instance of the Hadoop Distributed File System (HDFS). It can be used to
provide better protection against data corruption. It can also be used to
reduce the total storage requirements of HDFS.

The Distributed Raid File System consists of two main software components. The first component
is the RaidNode, a daemon that creates parity files from specified HDFS files.
The second component, "raidfs", is software that is layered over an HDFS client and
intercepts all calls that an application makes to the HDFS client. If HDFS encounters
corrupted data while reading a file, the raidfs client detects it; it uses the
relevant parity blocks to recover the corrupted data (if possible) and returns
the data to the application. The use of parity data to satisfy the read request
is completely transparent to the application.

The primary use of this feature is to save disk space for HDFS files.
HDFS typically stores data in triplicate.
The Distributed Raid File System can be configured in such a way that a set of
data blocks of a file are combined together to form one or more parity blocks.
This allows one to reduce the replication factor of an HDFS file from 3 to 2
while keeping the failure probability roughly the same as before. This typically
results in saving 25% to 30% of storage space in an HDFS cluster.

The RaidNode periodically scans all the specified paths in the configuration
file. For each path, it recursively scans all files that have more than 2 blocks
and that have not been modified during the last few hours (the default is 24 hours).
It picks the specified number of blocks from the file (as specified by the
stripe size), generates a parity block by combining them, and
stores the result as another HDFS file in the specified destination
directory. There is a one-to-one mapping between an HDFS
file and its parity file. The RaidNode also periodically finds parity files
that are orphaned and deletes them.
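
For illustration, here is a minimal sketch of stripe-wise XOR parity generation along those lines (the class and method names are hypothetical, not the code in this patch):

    import java.io.IOException;
    import java.io.InputStream;

    /** Hypothetical sketch: XOR the blocks of one stripe into a single parity block. */
    public class XorParitySketch {
      /**
       * XORs the given source block streams (one stripe) into a parity buffer of
       * blockSize bytes. Blocks shorter than blockSize are treated as zero-padded.
       */
      public static byte[] xorParity(InputStream[] stripeBlocks, int blockSize)
          throws IOException {
        byte[] parity = new byte[blockSize];   // starts as all zeros
        byte[] buf = new byte[blockSize];
        for (InputStream block : stripeBlocks) {
          int off = 0, n;
          while (off < blockSize && (n = block.read(buf, off, blockSize - off)) > 0) {
            off += n;                          // read whatever part of the block exists
          }
          for (int i = 0; i < off; i++) {
            parity[i] ^= buf[i];               // fold this block into the parity, byte by byte
          }
        }
        return parity;                         // written out as one block of the parity file
      }
    }

With XOR parity, any single missing block of the stripe can be re-created by XORing the parity block with the remaining blocks.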

The Distributed Raid FileSystem is layered over a DistributedFileSystem
instance and intercepts all calls that go into HDFS. HDFS throws a ChecksumException
or a BlockMissingException when a file read encounters bad data. The layered
Distributed Raid FileSystem catches these exceptions, locates the corresponding
parity file, extracts the original data from the parity file, and feeds the
extracted data back to the application in a completely transparent way.
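
A rough sketch of that read path (the class and helper below are hypothetical, not the patch's actual API): the wrapper retries a failed positioned read by reconstructing the requested range from the parity data.

    import java.io.IOException;
    import org.apache.hadoop.fs.ChecksumException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.hdfs.BlockMissingException;

    /** Hypothetical sketch of the raidfs read-path fallback. */
    class RaidReadSketch {
      private final FSDataInputStream underlying;  // stream obtained from the wrapped HDFS client
      private final ParityRecoverer recoverer;     // hypothetical helper that rebuilds data from parity

      RaidReadSketch(FSDataInputStream underlying, ParityRecoverer recoverer) {
        this.underlying = underlying;
        this.recoverer = recoverer;
      }

      int read(long pos, byte[] buf, int off, int len) throws IOException {
        try {
          return underlying.read(pos, buf, off, len);           // normal HDFS positioned read
        } catch (ChecksumException e) {
          return recoverer.readRecovered(pos, buf, off, len);   // corrupt replica detected
        } catch (BlockMissingException e) {
          return recoverer.readRecovered(pos, buf, off, len);   // no live replica of the block
        }
      }

      /** Hypothetical interface; the real code locates the parity file and rebuilds the stripe itself. */
      interface ParityRecoverer {
        int readRecovered(long pos, byte[] buf, int off, int len) throws IOException;
      }
    }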

The layered Distributed Raid FileSystem does not fix the data loss that it
encounters while serving data. It merely makes the application transparently
use the parity blocks to re-create the original data. A command-line tool,
"fsckraid", is currently under development; it will fix corrupted files
by extracting the data from the associated parity files. An administrator
can run "fsckraid" manually as and when needed.

          More details in src/contrib/raid/README

          Tsz Wo Nicholas Sze added a comment -

Took a quick look at the patch. Very cool!

It seems that the parity is computed by xor. If there is a clean API, we may improve it with more advanced codes like Reed-Solomon in the future.

          dhruba borthakur added a comment -

          Hi Nicholas, I agree with you completely. The current patch implements basic xor. Once this patch is accepted by the community, I plan to make the algorithm pluggable, so that people can plug in more advanced erasure codes into the framework laid out by this patch.
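
For illustration only, "pluggable" could look something like the interface below (hypothetical names, not part of this patch), with the current XOR code and a future Reed-Solomon code as alternative implementations:

    /**
     * Hypothetical sketch of a pluggable erasure-code interface.
     * An XOR code and a Reed-Solomon code would both implement it.
     */
    public interface ErasureCodeSketch {
      /** Number of data blocks per stripe. */
      int stripeSize();

      /** Number of parity blocks produced per stripe. */
      int paritySize();

      /** Compute the parity blocks for one stripe of data blocks. */
      void encode(byte[][] dataBlocks, byte[][] parityBlocks);

      /**
       * Reconstruct the blocks at erasedIndexes (data blocks first, then parity)
       * from the surviving blocks of the stripe.
       */
      void decode(byte[][] dataBlocks, byte[][] parityBlocks, int[] erasedIndexes);
    }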

          If you have the time and energy, please review the patch and provide any feedback you may have. Thanks.

          Andrew Ryan added a comment -

          In speaking with Dhruba about this, I had one additional optimization to offer, which he suggested I add to the issue.

If an HDFS client detects that it is forced to go to the parity block for a file because there is a missing block, it should proactively perform, or initiate, the equivalent of "fsckraid" on the file that it is reading. Going to parity means that something is seriously wrong (fsck will report 'CORRUPT', for example), so it should not wait for a periodic scan of the filesystem to occur.

Also, the provided raid.xml in the patch contains only a subset of the important configuration directives. I think it's nice when it includes all possible configuration directives, but that's just personal preference.

          Rodrigo Schmidt added a comment -

I've discussed with Dhruba the need for a Hadoop property that specifies the FileSystem class to be used by DistributedRaidFileSystem, which currently always creates a new instance of DistributedFileSystem. Instead, it should use a new property like "fs.raid.underlyingfs.impl".
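
As a sketch of what that could look like (the property name is the one suggested above; the surrounding class is hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.util.ReflectionUtils;

    /** Hypothetical sketch: resolve the underlying FileSystem from configuration. */
    class UnderlyingFsSketch {
      static FileSystem createUnderlyingFs(Configuration conf) {
        // "fs.raid.underlyingfs.impl" is the property proposed above;
        // DistributedFileSystem remains the default when it is not set.
        Class<? extends FileSystem> clazz =
            conf.getClass("fs.raid.underlyingfs.impl",
                          DistributedFileSystem.class,
                          FileSystem.class);
        return ReflectionUtils.newInstance(clazz, conf);
      }
    }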

          Raghu Angadi added a comment -

This seems pretty useful. Since this is done outside HDFS, it is simpler for users to start experimenting.

          Say a file has 5 blocks with replication of 3 : total 15 blocks
          With this tool, replication could be reduced to 2, with one block for parity : total 10 + 2 blocks
          This is a savings of 20% space. Is this math correct?
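Spelling that out: 5 data blocks x 3 replicas = 15 block replicas before raiding; afterwards, 5 data blocks x 2 replicas + 1 parity block x 2 replicas = 12 block replicas, so the saving is (15 - 12) / 15 = 20%, assuming the parity block is itself stored at replication 2.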

          Detecting when to 'unRaid' :

          • The patch does this using a wrapper filesystem over HDFS.
• This requires the file to be read by the client.
            • More often than not, HDFS knows about irrecoverable blocks much before a client reads.
• This is only semi-transparent to the users, since they have to use the new filesystem.
          • Another completely transparent alternative could be to make 'RaidNode' ping NameNode for missing blocks.
            • NameNode already knows about blocks that don't have any known good replica. And fetching that list is cheap.
            • RaidNode could check if the corrupt/missing block belongs to any of its files.
            • Rest of RaidNode pretty much remains the same as this patch.
          dhruba borthakur added a comment -

          Incorporated a few review comments:

1. Make the underlying filesystem configurable (the default is still DistributedFileSystem).
2. The sample raid.xml lists the configuration properties that are exposed to the administrator.

@Nicholas: I created a separate JIRA, HDFS-600, to make the parity generation algorithm pluggable. I would like to address it in a separate patch. This is going to play a critical part if we want to reduce the physical replication factor even more.

          @Andrew: I created HDFS-582 to implement a command line utility called fsckraid. It will periodically verify parity bits.

@Raghu, you mentioned that "this is only semi-transparent to the users since they have to use the new filesystem". In most cases, the cluster administrator sets the value of fs.hdfs.impl to DistributedRaidFileSystem, and no users and/or applications need to change to use this raid feature... that is what I meant by saying that this is "transparent" to the user. I also immensely like your idea of making the RaidNode fetch a list of corrupt blocks from the NN. As far as I know, such an API does not exist in the NN. I will open a new JIRA for retrieving a list of missing blocks from the NN.

          Thanks everybody for their review comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419417/raid2.txt
          against trunk revision 814221.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 157 release audit warnings (more than the trunk's current 154 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/24/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          > I created a separate JIRA HDFS-600 to make the Parity generation algorithm pluggable.
          Thanks, Dhruba.

          dhruba borthakur added a comment -

I moved this away from the 0.21 release. I would also like to get this committed into the trunk so that we can work on the secondary JIRAs related to this patch. This will hopefully be a fully workable feature by the time 0.22 is released. Please let me know if anybody has any concerns about this getting committed as a contrib module.

          dhruba borthakur added a comment -

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          dhruba borthakur added a comment -

          I committed this.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #61 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/61/)
This patch implements an optional layer over HDFS that
implements offline erasure-coding. It can be used to reduce the
total storage requirements of HDFS. (dhruba)

          Zooko Wilcox-O'Hearn added a comment -

          By the way the hadoop-lafs plugin for Hadoop also provides erasure coding, by using the Tahoe Least-Authority Filesystem instead of HDFS: http://code.google.com/p/hadoop-lafs/ . The erasure coding in Tahoe-LAFS is implemented with a high-speed Reed-Solomon implementation in C.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #106 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/106/)

          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #20 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/20/)

          Konstantin Boudnik added a comment -

Raid-related tests have been failing for a while now (http://hudson.zones.apache.org/hudson/view/Hadoop/job/Hadoop-Hdfs-trunk/135/console and before):

              [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.711 sec
              [junit] Test org.apache.hadoop.hdfs.TestRaidDfs FAILED
          ...
              [junit] Tests run: 2, Failures: 0, Errors: 2, Time elapsed: 0.632 sec
              [junit] Test org.apache.hadoop.raid.TestRaidNode FAILED
          ...
              [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.461 sec
              [junit] Test org.apache.hadoop.raid.TestRaidPurge FAILED
          

          It appears that some configuration options are incorrect.

          dhruba borthakur added a comment -

          I am investigating....

          dhruba borthakur added a comment -

          I will fix the unit-test failure via HDFS-757. The unit tests failed when Maven integration code was checked into HDFS.

          shravankumar added a comment -

          Dear sir,
I have downloaded the code for "Implement erasure coding as a layer on HDFS", but I was unable to execute it. Please guide me regarding this.

          Rodrigo Schmidt added a comment -

          Hi,

Can you provide more details on what you have done and what didn't work?

          Did you follow the instructions on the README file? Which error did you see?

          Cheers,
          Rodrigo

          shravankumar added a comment -

          Thank you sir.
I have one more query: both raid1.txt and raid2.txt look similar; what is the difference between them?
In the implementation, is parity computed with a normal CRC or with some other mechanism like Reed-Solomon codes?

          Shravan Kumar.

          shravankumar added a comment -

For raid1.txt and raid2.txt, are there any design diagrams, such as a class diagram?

          Rodrigo Schmidt added a comment -

          raid1.txt and raid2.txt are different patches. The most recent was the one that got committed.

          Raid is implementing simple xor parity right now, but we have plans to extend it in the future.

          Sorry, no design diagrams that I'm aware of.

          shravankumar added a comment -

          Thank you.

          shravankumar added a comment -

          Hello sir,

1. What is the meaning of this:
<srcPath prefix="hdfs://dfs1.xxx.com:8000/user/dhruba/">

2. In the ADMINISTRATION section, the RaidNode software is mentioned; what does that mean?

3. The instructions say to run "ant package" in HADOOP_HOME to build Hadoop and its contrib packages.
Does this come with the hadoop 0.20.1 installation, or do we need to download the ant package separately?

          shravankumar added a comment -

Are the tags (property, description) used in the configuration normal HTML tags, or do they have a different meaning?
Can you send me a document which describes the meanings of these tags?

          Rodrigo Schmidt added a comment -

          The tags are XML.

          There is no documentation for the tags, either.

          In short, Raid is still being optimized and changes are constant. Any strong documentation effort at this point would be meaningful for a very short period of time.

          The source code is the best and most precise documentation you can rely upon. That's the good thing about open source projects. You can easily get around stale documentation.

          shravankumar added a comment -

          Thank you sir.

          Celina d´ Ávila Samogin added a comment -

I intend to propose something about the implementation of erasure coding techniques in HDFS, starting in July 2010. I will add a comment to say what I am doing, or to ask for hints, as soon as possible. For now, I have studied the texts suggested in this issue and other papers. I have read about RS codes and LDPC codes. I have not yet started to implement or test anything.

          shravankumar added a comment -

Can anyone help me? In which stable version of Hadoop did this raid become a part, and how can I access the API documents related to raid?

          Ramkumar Vadali added a comment -

          @shravankumar Quite a few bugs in raid have been fixed in trunk. This will be part of the upcoming release hadoop-0.22. What do you mean by raid API?

          shravankumar added a comment -

@Ramkumar Vadali: Thank you, sir.
Where can I access the raid code (with the fixes)?

          Ramkumar Vadali added a comment -

          @shravankumar, to get a basic idea of HDFS RAID, you can read up Dhruba's blog post http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html

          If you need this for demo purposes, could you use the current hadoop trunk? I am not sure about the exact date of the next release.
          To use RAID, you need to create a configuration file and start the RAID daemon. You can look for examples in the unit tests, say TestRaidNode.

          For further communication, you can contact me directly.

          Krishnaraj added a comment -

Is there any stable version of Hadoop erasure coding? Where can I download its source code? I am not able to find it in the hadoop trunk.

dhruba borthakur added a comment -

source code in http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/src/contrib/raid
          Krishnaraj added a comment -

I got this patch, took the HDFS code from http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/, put it in contrib, and built it. But I didn't know how to use it further (i.e., in place of the hadoop jar that we set up in the cluster; I did not get the jar mentioned in the README). Is there any detailed tutorial?

          sri added a comment -

          Error in name node
          ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.dfs.DistributedRaidFileSystem
          at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:866)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1304)
          at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
          at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
          at org.apache.hadoop.fs.Trash.<init>(Trash.java:62)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.startTrashEmptier(NameNode.java:292)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:288)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
          Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.dfs.DistributedRaidFileSystem
          at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
          at java.security.AccessController.doPrivileged(Native Method)
          at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
          at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
          at java.lang.Class.forName0(Native Method)
          at java.lang.Class.forName(Class.java:264)
          at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:819)
          at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:864)
          ... 11 more

Can somebody help me out?

          sri added a comment -

I have a couple of questions:

1) With Raid being set up, I am not able to generate the DFSAdmin report (hadoop dfsadmin -report). Why is that so?

2) I am not able to reduce the targetReplicationFactor to 0 (I want to run mapreduce where the BlockFixer retrieves the data from the raided disks). Is there any way to do this?

          Thanks in advance

          sri added a comment -

I would like to know whether the stripes just act as a recovery option (when other datanodes have failed), or whether they can act as input to mapreduce jobs (to satisfy locality).

          dhruba borthakur added a comment -

          1. Raid has no impact on dfsadmin -report command.

          2. You won't be able to set a replication factor to 0. You would have to manually pull the plug (kill it) on a datanode to see how raid works.

          3. stripe locations do not contribute to split locations of a block, thus they are not used for map-reduce locality.

          Hemanth Makkapati added a comment -

          Hey,
          I am a beginner with hadoop and started delving into the code only lately.
          As I was trying to get RAID up and running, I observed the following exception in the log

          ERROR org.apache.hadoop.raid.RaidNode: java.lang.NullPointerException
          at org.apache.hadoop.raid.RaidNode.tmpHarPathForCode(RaidNode.java:1491)
          at org.apache.hadoop.raid.RaidNode.doHar(RaidNode.java:1217)
          at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:73)
          at org.apache.hadoop.raid.RaidNode$HarMonitor.run(RaidNode.java:1371)
          at java.lang.Thread.run(Thread.java:636)

The reason for this seems to be the absence of the 'erasurecode' tag in the raid configuration file, which, in my case, is very similar to the sample configuration file provided. Once the tag is introduced (it can be set to either XOR or RS), I no longer see the exception. Also, the README file doesn't mention anything about such a tag.
          Please confirm if my observation is correct.
          Thought of posting it here for the benefit of others.
          BTW, I checked out code from the trunk.

          Thank you.


People

• Assignee: dhruba borthakur
• Reporter: dhruba borthakur
• Votes: 0
• Watchers: 37
