Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
    • Release Note:
      Adds a CephFileSystem, allowing you to use Ceph as the underlying storage of your Hadoop instance.

      Description

      The experimental distributed filesystem Ceph does not have a single point of failure, uses a distributed metadata cluster instead of a single in-memory server, and might be of use to some Hadoop users.

      http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/

      1. HADOOP-6253.patch
        83 kB
        Gregory Farnum
      2. HADOOP-6253.patch
        50 kB
        Gregory Farnum
      3. HADOOP-6253.patch
        50 kB
        Gregory Farnum
      4. HADOOP-6253.patch
        52 kB
        Gregory Farnum
      5. HADOOP-6253.patch
        28 kB
        Gregory Farnum

        Issue Links

          Activity

          Gregory Farnum created issue -
          Gregory Farnum added a comment -

          I've attached a patch which includes the CephFileSystem and IOStream classes, as well as package documentation. To actually use it you're going to need an installation of Ceph (ceph.newdream.net).
          I have not included any unit tests, as the code depends on the libhadoopceph shared library and without a Ceph install it seems sort of pointless – about all I can see to do is make sure that calling the methods throws an IOException for being uninitialized. Still, most of the other filesystems came up with something, so if you have any suggestions for useful test cases let me know and I can add them.

          In very basic testing (~900MB and ~6GB worth of data), this plus the current Ceph code is roughly equivalent in speed to HDFS when running a mapred job via the hadoop-examples jar from .20, using the default values for both systems; Ceph tends to be slightly faster on a put and slightly slower on the mapred (~3:35 versus ~3:20 on the 6GB test case). However, Ceph, while still highly experimental and in development, is a full filesystem with both a Linux kernel client and a full userspace client; it also distinguishes itself from HDFS by having no single point of failure – it uses a Paxos-based monitor cluster for managing state and multiple metadata servers instead of the single HDFS namenode (though of course you can also run the entire system on one machine).

          Gregory Farnum made changes -
          Field Original Value New Value
          Status Open [ 1 ] Patch Available [ 10002 ]
          Gregory Farnum made changes -
          Attachment HADOOP-6253.patch [ 12419335 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419335/HADOOP-6253.patch
          against trunk revision 813698.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/30/console

          This message is automatically generated.

          Gregory Farnum added a comment -

          Apparently I renamed the wrong file for upload. Sorry folks.

          Gregory Farnum made changes -
          Attachment HADOOP-6253.patch [ 12419339 ]
          Gregory Farnum made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Gregory Farnum made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419339/HADOOP-6253.patch
          against trunk revision 814455.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 148 javac compiler warnings (more than the trunk's current 145 warnings).

          -1 findbugs. The patch appears to introduce 11 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/33/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/33/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/33/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/33/console

          This message is automatically generated.

          Gregory Farnum added a comment -

          Removed a few deprecated functions and changed some style issues that were making FindBugs et al unhappy.

          Gregory Farnum made changes -
          Attachment HADOOP-6253.patch [ 12419564 ]
          Gregory Farnum made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Gregory Farnum made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419564/HADOOP-6253.patch
          against trunk revision 814455.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/34/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/34/console

          This message is automatically generated.

          Gregory Farnum added a comment -

          So at this point I think I'm supposed to ask for a code review. Please?

          Konstantin Shvachko added a comment -

          Hey Gregory, Good job! Always wanted to compare Ceph with HDFS.
          Took a quick look at your patch. Noticed some indentation issues there: we use 2 spaces for indentation, not tabs.
          Also there are several warnings:

          1. In all 3 files many imports (like Vector, File) are unnecessary, which shows up as warnings in Eclipse.
          2. There is a comment line in the beginning of each file - should be removed.
          3. In CephFileSystem the following members and methods are not used anywhere:
            cephDebugLevel;
            monAddr;
            ceph_mkdir()
          4. bufferSize is not used in either CephInputStream or CephOutputStream.
          5. ceph_seek_from_start() is unused in CephOutputStream.
          6. I see you explicitly throw RuntimeExceptions, like NullPointerException or IndexOutOfBoundsException, in your implementation. It would be better to replace them with IOExceptions. RuntimeExceptions should be treated as a bug in the code.

          What about libhadoopceph? Is it a part of Hadoop or Ceph?
          About testing. You might want to check KosmosFileSystem and S3FileSystem for testing examples. You want some tests committed with your patch because without tests software becomes stale pretty fast. In this case tests should test not the file system functionality, but your wrapping software, imho.
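
          Point 6 above can be illustrated with a small sketch. The class and method names below are hypothetical, not from the actual patch: the idea is simply that a failure from the native-backed layer surfaces as a checked IOException rather than letting an unchecked RuntimeException (which should signal a bug) escape to callers.

          ```java
          import java.io.IOException;

          // Hypothetical sketch of wrapping native-layer failures in IOException.
          // Names are illustrative; they do not appear in the HADOOP-6253 patch.
          public class CephCallWrapper {
              // Stand-in for a JNI-backed call that signals failure by returning null.
              static String nativeStat(String path) {
                  return (path == null || path.isEmpty()) ? null : "inode:" + path;
              }

              // Callers see a checked IOException they can handle, instead of a
              // NullPointerException propagating out of the filesystem layer.
              public static String stat(String path) throws IOException {
                  String result = nativeStat(path);
                  if (result == null) {
                      throw new IOException("stat failed for path: " + path);
                  }
                  return result;
              }
          }
          ```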

          Jakob Homan added a comment -

          Gregory-
          Take a look at FileSystemContractBaseTest and its various extending classes. Each implementing file system should start by extending this test, as it verifies that the implementation provides the expected semantics. Other tests are of course good as well.
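
          Following the pattern used by the other contract tests (e.g. the S3 and KFS ones), such a test might look roughly like the sketch below. It is not runnable outside a Hadoop source tree, and everything except FileSystemContractBaseTest itself (the package name, the CephFileSystem class, the ceph:// URI) is an assumption based on this issue, not the actual patch.

          ```java
          import java.net.URI;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystemContractBaseTest;

          // Hypothetical contract test for CephFileSystem; names are assumptions.
          public class TestCephFileSystemContract extends FileSystemContractBaseTest {
              @Override
              protected void setUp() throws Exception {
                  Configuration conf = new Configuration();
                  // Assign the filesystem under test; the inherited test methods
                  // then exercise mkdir, rename, delete, and path semantics.
                  org.apache.hadoop.fs.ceph.CephFileSystem cephFs =
                      new org.apache.hadoop.fs.ceph.CephFileSystem();
                  cephFs.initialize(URI.create("ceph://localhost/"), conf);
                  fs = cephFs;
                  super.setUp();
              }
          }
          ```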

          Chris Douglas added a comment -

          Canceling patch while Konstantin and Jakob's comments are addressed

          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Assignee Gregory Farnum [ gfarnum ]
          Gregory Farnum added a comment -

          Okay; I've rewritten the IOStreams and re-architected the code to support pluggable Ceph instances, which allows me to write a CephFaker running on the LocalFS. With that, it passes all the unit tests in FileSystemContractBaseTest.
          I've also added in uses of the Hadoop logging framework but don't have a good feel for what level of output that should be generating, so I'd especially appreciate somebody looking at that.

          Gregory Farnum made changes -
          Attachment HADOOP-6253.patch [ 12424148 ]
          Gregory Farnum made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Gregory Farnum made changes -
          Description The experimental distributed filesystem Ceph does not have a single point of failure, and might be of use to some Hadoop users. The experimental distributed filesystem Ceph does not have a single point of failure, uses a distributed metadata cluster instead of a single in-memory server, and might be of use to some Hadoop users.
          Gregory Farnum made changes -
          Status In Progress [ 3 ] Open [ 1 ]
          Hairong Kuang added a comment -

          Gregory, it looks like the most recent patch you uploaded is the same as the one you uploaded in September.

          Gregory Farnum added a comment -

          Well, I'm not sure how that happened; must have gotten the patches crossed moving the new one around since they have the same name. :/

          Anyway, proper patch uploaded now.

          Gregory Farnum made changes -
          Attachment HADOOP-6253.patch [ 12429794 ]
          Amandeep Khurana made changes -
          Link This issue is blocked by HADOOP-6779 [ HADOOP-6779 ]
          Amandeep Khurana made changes -
          Link This issue incorporates HADOOP-6779 [ HADOOP-6779 ]
          Amandeep Khurana made changes -
          Link This issue is blocked by HADOOP-6779 [ HADOOP-6779 ]
          Amandeep Khurana made changes -
          Labels ceph
          Allen Wittenauer added a comment -

          Should this have a version associated with it? Should it get committed to trunk? Should this be in contrib since it is experimental?

          Otis Gospodnetic added a comment -

          This looks interesting, but "Hadoop QA" didn't pick up the latest patch from 2010-01.
          Is this committable?

          Gregory Farnum added a comment -

          This was originally written against (the now dead, as I understand it) branch 0.21. We have patches in our (public) repository against branch .20 (which I believe will apply to the new 1.0, but I'd have to check), but before we resubmit upstream we want to do some restructuring work.
          (Right now they use a custom JNI .so "libhadoopcephfs" and we'd like to do up some proper libceph bindings in Java and just use those.)

          Steve Loughran added a comment -

          I'd argue strongly against adding any more FS support explicitly into the main Hadoop distribution, purely on a test-and-support basis. Anything related to new filesystems is welcome, but adding more code that will only be testable on a subset of machines is trouble. It's not going to get tested before releases; it will ship untested, with dependencies on native libraries that most people won't have, but there will still be an expectation of support.

          If that's a strategy we want to take: how to get Ceph support in or near hadoop, with enough people caring about it to keep it up to date?

          Benoit Sigoure added a comment -

          I'm sure that people wouldn't necessarily expect the same level of scrutiny on the Ceph integration as on the HDFS stuff. But Ceph is one of the most promising distributed filesystems, and I think it would be great for Hadoop to integrate with it. I've been meaning to test Ceph at StumbleUpon for a while. I believe there's a fair bit of interest from various organizations, and completing this issue would help a lot.

          Todd Lipcon added a comment -

          I tend to agree with Steve – I see this as the same question about "contrib"s. Either software is part of Hadoop, in which case we should have several committers familiar with the code, or it's not, in which case it shouldn't be shipped with Hadoop.

          Now that we're mavenized, it seems like it should be straightforward to maintain projects like this in a separate project with separate release cycles, and have their trunk builds depend on our published nightly SNAPSHOTs. That way we still get the benefits of continuous integration of the projects, but we don't have extra stuff in our codebase that is only used by a small segment of users.

          So, I'd say I'm -0. Not enough to actually veto, but I'd be interested to know what the perceived benefit is for checking this into the main tree?

          Benoit Sigoure added a comment -

          Right actually I don't care much whether it's checked into the main tree or elsewhere, as long as it's checked in somewhere eventually.

          Gregory Farnum added a comment -

          From our perspective, it's always easier for users (and developers) to manage things that are in-tree rather than adding out-of-tree patches. Plus there is a faster feedback cycle on things like API changes.
          We really do want to rework the structure a little before we exert the upstreaming effort, but once we've done that the dependencies will be on nicely packaged libraries and a stable API. And we will commit to ongoing automated testing and maintenance of our own to make sure that upstream continues to work properly.

          If there's a better mechanism for supporting users from your perspective of course we'd love to hear about it.

          Todd Lipcon added a comment -

          Question is why it needs any patches – shouldn't it be a standalone module? All you should need to do is drop your jar onto the user's classpath (HADOOP_CLASSPATH var), and set fs.ceph.impl in the user's config.
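
          Todd's suggestion amounts to something like the following config fragment. The fs.ceph.impl key is the one he names; the class name and package are assumptions based on this issue, and the jar path is a placeholder.

          ```xml
          <!-- core-site.xml: register the Ceph filesystem by URI scheme.
               Class name is an assumption; no patches to Hadoop required. -->
          <property>
            <name>fs.ceph.impl</name>
            <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
          </property>
          ```

          With the jar on HADOOP_CLASSPATH, paths like ceph://host/dir would then resolve through the plugin as a standalone module.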

          Allen Wittenauer added a comment -

          One of the comments made by another file system vendor that we've spoken to is that there are too many "if (HDFS) do this else do this" branches that undercut the capabilities of non-HDFS implementations. I wouldn't be too surprised if Ceph has the same problem.

          Jakob Homan added a comment -

          One of the comments made by another file system vendor that we've spoken to is that there are too many "if (HDFS) do this else do this" branches that undercut the capabilities of non-HDFS implementations. I wouldn't be too surprised if Ceph has the same problem.

          Those should be removed and we'd of course be happy to accept patches to do so. I also agree (-0). A Ceph impl will require and deserve more attention than we can expect to be able to give (based on past experience).

          Todd Lipcon added a comment -

          Yep, I definitely agree we should commit patches that the ceph folks need for integration, but the actual ceph FS implementation seems like it could be implemented externally. Over time I'd think we could push out S3 as well for example.

          Gregory Farnum added a comment -

          You're probably right, Todd (I hadn't realized you could define implementations in the user-level config) — given the other filesystems that are already upstream we haven't given much thought to alternate methods of implementing it so far.

          Ravi Prakash added a comment -

          On an orthogonal note, just out of curiosity: does anyone happen to know how Ceph compares to HDFS in terms of performance now? Gregory's first post is over 2 years old, and presumably a lot more development and optimization has gone into both filesystems.

          Noah Watkins added a comment -

          Updated work for this ticket is at:

          http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/

          Noah Watkins made changes -
          Description updated: appended the link http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/ to the existing text.
          Steve Loughran added a comment -

          Noah, I've been doing work in HADOOP-9361 both with trying to specify more rigorously what HDFS does (and hence what other filesystems need to do), and doing some more contract-driven testing. It'd be great if you could help w/ spec and see how the tests work.

          One thing that we may want to think about is adding declarations of the different filesystems into core-default.xml, but with a config option to provide a URL to include in any diagnostics message if the FS class doesn't load, e.g. {{"ClassNotFoundException: failed to load ceph://fs/file - please see http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/"}} for help.

          I've opened a JIRA on that, HADOOP-9727.
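Steve's suggestion could look roughly like the fragment below. The fs.ceph.impl entry uses Hadoop's standard property convention, but the help-URL property name is hypothetical (it is the sort of thing HADOOP-9727 would define), shown only to illustrate the idea:

```xml
<!-- core-default.xml sketch of the proposal: declare the filesystem
     up front, plus a (hypothetical) companion property whose URL would be
     echoed into the diagnostics if the class fails to load. -->
<property>
  <name>fs.ceph.impl</name>
  <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<property>
  <!-- hypothetical property name, for illustration only -->
  <name>fs.ceph.impl.help.url</name>
  <value>http://ceph.com/docs/wip-hadoop-doc/cephfs/hadoop/</value>
</property>
```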

          Noah Watkins added a comment -

          Hi Steve,

          > Noah, I've been doing work in HADOOP-9361 both with trying to specify more

          I'd be happy to! One of the biggest things we have run into is figuring out
          what the contract is. Inferring it from the behavior of applications has
          been our only real resource for that information. We have done some ad-hoc
          stuff in which we have adapted HDFS tests to run against Ceph, but it is
          pretty ugly and difficult to maintain.

          One thing that we have run into recently with a user is diagnosing some
          write performance problems. HDFS is performing well, and we think small
          writes might be a culprit if Ceph isn't doing write buffering.
          Understanding what that contract is, especially in terms of write safety is
          another area slightly different than just unit tests that we are interested
          in.

          Ahh, that would be cool. Certainly easier to merge changes to that than to
          accept an entire new file system upstream.

          Let me know how I can get started helping out.

          -Noah
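The contract checks Noah describes adapting from the HDFS tests can be illustrated with a minimal, self-contained sketch. This uses plain java.nio.file against a local temp directory rather than the Hadoop FileSystem API (which would need a running cluster), purely to show the shape of a behavioral contract test; the specific contract points chosen here are illustrative, not taken from HADOOP-9361:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class FsContractSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fs-contract");
        Path f = dir.resolve("data.txt");

        // Contract point 1: bytes written must read back exactly.
        byte[] payload = "hello contract".getBytes("UTF-8");
        Files.write(f, payload);
        byte[] readBack = Files.readAllBytes(f);
        System.out.println("read-after-write: " + Arrays.equals(payload, readBack));

        // Contract point 2: deleting a missing path reports false, not an exception.
        System.out.println("delete-missing: " + Files.deleteIfExists(dir.resolve("absent")));

        // Contract point 3: rename onto a fresh target preserves contents.
        Path g = dir.resolve("renamed.txt");
        Files.move(f, g);
        System.out.println("rename-preserves: " + new String(Files.readAllBytes(g), "UTF-8"));
    }
}
```

A real contract suite would run the same assertions against each FileSystem implementation (HDFS, Ceph, local) so that divergences like the write-buffering behavior mentioned above surface as test failures rather than application bugs.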


            People

            • Assignee: Gregory Farnum
            • Reporter: Gregory Farnum
            • Votes: 3
            • Watchers: 37