HBase
  1. HBase
  2. HBASE-4169

FSUtils LeaseRecovery for non HDFS FileSystems.

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.90.3, 0.90.4
    • Fix Version/s: 0.92.0
    • Component/s: util
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      FSUtils.recoverFileLease uses HDFS's recoverLease method to get lease before splitting hlog file.
      This might not work for other filesystem implementations.

      1. HBASE-4196.3.v2.patch
        15 kB
        Lohit Vijayarenu
      2. HBASE-4196.3.patch
        13 kB
        Lohit Vijayarenu
      3. HBASE-4169.2.patch
        15 kB
        Lohit Vijayarenu
      4. HBASE-4169.1.patch
        15 kB
        Lohit Vijayarenu
      5. 4169-v5.txt
        15 kB
        Lohit Vijayarenu
      6. 4169-v4.txt
        15 kB
        stack
      7. 4169-correction.txt
        0.8 kB
        Lohit Vijayarenu

        Activity

        Lohit Vijayarenu created issue -
        Hide
        Lohit Vijayarenu added a comment -

        One suggestion is to do something like.

        if (!(fs instanceof DistributedFileSystem)) {
          // Change permission on file to ReadOnly.
          return
        } 
        
        // HDFS code to deal with lease recover as of today.
        
        

        Will this work?

        Show
        Lohit Vijayarenu added a comment - One suggestion is to do something like. if (!(fs instanceof DistributedFileSystem)) { // Change permission on file to ReadOnly. return } // HDFS code to deal with lease recover as of today. Will this work?
        Hide
        Andrew Purtell added a comment -

        I'm inclined to close this as invalid. It works with HDFS, which ships with Apache Hadoop, and we are Apache HBase that runs on Apache Hadoop. Is it incumbent on us to deal with the vagaries of XYZ vendor filesystems?

        Show
        Andrew Purtell added a comment - I'm inclined to close this as invalid. It works with HDFS, which ships with Apache Hadoop, and we are Apache HBase that runs on Apache Hadoop. Is it incumbent on us to deal with the vagaries of XYZ vendor filesystems?
        Hide
        Todd Lipcon added a comment -

        A design I"d be OK with us to use a set of configurations like hbase.fsutil.<scheme>.impl to define a particular class for each filesystem. For example, hbase.fsutil.hdfs.impl would instantiate the FSUtils class we have right now. Anyone with another filesystem could drop in their own with a configuration, without having to edit source upstream.

        Show
        Todd Lipcon added a comment - A design I"d be OK with us to use a set of configurations like hbase.fsutil.<scheme>.impl to define a particular class for each filesystem. For example, hbase.fsutil.hdfs.impl would instantiate the FSUtils class we have right now. Anyone with another filesystem could drop in their own with a configuration, without having to edit source upstream.
        Hide
        Andrew Purtell added a comment -

        A design I"d be OK with us to use a set of configurations like hbase.fsutil.<scheme>.impl to define a particular class for each filesystem.

        That's a good idea Todd, +1. Lohit?

        Show
        Andrew Purtell added a comment - A design I"d be OK with us to use a set of configurations like hbase.fsutil.<scheme>.impl to define a particular class for each filesystem. That's a good idea Todd, +1. Lohit?
        Hide
        Milind Bhandarkar added a comment -

        This suggests deeper flaws in the the FileSystem abstraction in Hadoop. If one has to do:

        if (fs instanceof DistributedFileSystem) {
        // hdfs-specific functionality
        }
        

        then it defeats the purpose of having o.a.h.fs as public apis and o.a.h.hdfs as implementation.

        Show
        Milind Bhandarkar added a comment - This suggests deeper flaws in the the FileSystem abstraction in Hadoop. If one has to do: if (fs instanceof DistributedFileSystem) { // hdfs-specific functionality } then it defeats the purpose of having o.a.h.fs as public apis and o.a.h.hdfs as implementation.
        Hide
        Todd Lipcon added a comment -

        Abstractions suck. I think most systems programmers agree. But they're better than the alternative

        That's why we have things like ioctl/fcntl, right? I wouldn't fault FileSystem here.

        Show
        Todd Lipcon added a comment - Abstractions suck. I think most systems programmers agree. But they're better than the alternative That's why we have things like ioctl/fcntl, right? I wouldn't fault FileSystem here.
        Hide
        Lohit Vijayarenu added a comment -

        Thanks Todd. I will make a patch to load current FSUtils::recoverFileLease if implementation is hdfs. Users who want to run on different filesystem can change config to load different implementation.

        Show
        Lohit Vijayarenu added a comment - Thanks Todd. I will make a patch to load current FSUtils::recoverFileLease if implementation is hdfs. Users who want to run on different filesystem can change config to load different implementation.
        Hide
        Lohit Vijayarenu added a comment -

        Attached is first version of patch. Please could anyone review this. Changes done are
        1. Moved recoverLease to FSHDFSUtils
        2. Added FSImplementedUtils which loads FSUtils based on config hbase.fsutils.fs.impl
        3. Added FSHDFSUtils as default in hbase-default.xml
        4. Changed all places where this is invoked.

        I rand TestHLog which invokes this piece of code to make sure FSHDFSUtils is loaded and recover the lease

        Show
        Lohit Vijayarenu added a comment - Attached is first version of patch. Please could anyone review this. Changes done are 1. Moved recoverLease to FSHDFSUtils 2. Added FSImplementedUtils which loads FSUtils based on config hbase.fsutils.fs.impl 3. Added FSHDFSUtils as default in hbase-default.xml 4. Changed all places where this is invoked. I rand TestHLog which invokes this piece of code to make sure FSHDFSUtils is loaded and recover the lease
        Lohit Vijayarenu made changes -
        Field Original Value New Value
        Attachment HBASE-4169.1.patch [ 12489565 ]
        Hide
        stack added a comment -

        @Andrew

        I'm inclined to close this as invalid. It works with HDFS, which ships with Apache Hadoop, and we are Apache HBase that runs on Apache Hadoop. Is it incumbent on us to deal with the vagaries of XYZ vendor filesystems?

        I don't have a problem making "small" changes to accomodate other FS implementations. It could broaden our user base and experience gained atop other FSs could help improve the home base.

        @Lohit

        I don't think this will work:

        if (!(fs instanceof DistributedFileSystem)) {
          // Change permission on file to ReadOnly.
          return
        } 
        

        The above will happen when running atop the local filesystem, not just for mapr (or ceph).

        I think the Todd suggestion of a mapr:// would be a better way to go though as the lads note above, the abstraction is likely lacking when it comes to other FS implementations (let me look at your patch).

        Show
        stack added a comment - @Andrew I'm inclined to close this as invalid. It works with HDFS, which ships with Apache Hadoop, and we are Apache HBase that runs on Apache Hadoop. Is it incumbent on us to deal with the vagaries of XYZ vendor filesystems? I don't have a problem making "small" changes to accomodate other FS implementations. It could broaden our user base and experience gained atop other FSs could help improve the home base. @Lohit I don't think this will work: if (!(fs instanceof DistributedFileSystem)) { // Change permission on file to ReadOnly. return } The above will happen when running atop the local filesystem, not just for mapr (or ceph). I think the Todd suggestion of a mapr:// would be a better way to go though as the lads note above, the abstraction is likely lacking when it comes to other FS implementations (let me look at your patch).
        Hide
        stack added a comment -

        @Lohit Patch is not bad. You did it this way because the bulk of FSUtils is static so not subclassable? The naming is a little awkward I'd say. FSImplementedUtils is a little odd when we still have FSUtils (And FSImplementedUtils is a factory so why not call it so?). I'd imagine going forward there'll be more than just this one method different across fs implementations? Do you think this method even belongs in our util package? I know thats where it was before you went messing but it seems more fundamental than a utility. Perhaps we need an fs package and in there you'd have an fs factory class that provides different fs implementations for different methods? You are trying to avoid doing 'mapr://' for the fs? (Even if you did mapr:// you'd need to change how we do this recoverLease). Good stuff.

        Show
        stack added a comment - @Lohit Patch is not bad. You did it this way because the bulk of FSUtils is static so not subclassable? The naming is a little awkward I'd say. FSImplementedUtils is a little odd when we still have FSUtils (And FSImplementedUtils is a factory so why not call it so?). I'd imagine going forward there'll be more than just this one method different across fs implementations? Do you think this method even belongs in our util package? I know thats where it was before you went messing but it seems more fundamental than a utility. Perhaps we need an fs package and in there you'd have an fs factory class that provides different fs implementations for different methods? You are trying to avoid doing 'mapr://' for the fs? (Even if you did mapr:// you'd need to change how we do this recoverLease). Good stuff.
        Hide
        Lohit Vijayarenu added a comment -

        @Stack Thanks for reviewing the patch. Yes, I made this separate class because FSUtils already has many other methods which works across non HDFS, so I moved recoverFileLease alone to different class. Are you suggesting I move FSImplementedUtils and FSHDFSUtils to org.apache.hadoop.hbase.fs.util
        Yes I was trying to avoid mapfs://. We have maprfs jar which includes client. I would planning to drop MapRFS implementation of recoverFileLease which would do chmod and update hbase-site.xml when running under maprfs. Would this work?

        Show
        Lohit Vijayarenu added a comment - @Stack Thanks for reviewing the patch. Yes, I made this separate class because FSUtils already has many other methods which works across non HDFS, so I moved recoverFileLease alone to different class. Are you suggesting I move FSImplementedUtils and FSHDFSUtils to org.apache.hadoop.hbase.fs.util Yes I was trying to avoid mapfs://. We have maprfs jar which includes client. I would planning to drop MapRFS implementation of recoverFileLease which would do chmod and update hbase-site.xml when running under maprfs. Would this work?
        Hide
        Lohit Vijayarenu added a comment -

        Attached is patch with changes incorporating comments from Stack.

        Show
        Lohit Vijayarenu added a comment - Attached is patch with changes incorporating comments from Stack.
        Lohit Vijayarenu made changes -
        Attachment HBASE-4169.2.patch [ 12489612 ]
        Lohit Vijayarenu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Lohit Vijayarenu added a comment -

        Ran full test and I see few failures which are present even in trunk without the changes.
        No new test failures due to this patch. Any suggestions how we should proceed.

        Show
        Lohit Vijayarenu added a comment - Ran full test and I see few failures which are present even in trunk without the changes. No new test failures due to this patch. Any suggestions how we should proceed.
        Hide
        Todd Lipcon added a comment -

        This patch is a little strange - why use reflection here to access the static method in FSHDFSUtils instead of just creating a new interface or abstract class, and using non-static methods?

        I think a more conventional approach would be:

        • Create an abstract class FSUtils, which declares either default implementations or abstract methods for all of the things that different file systems would want to override
        • This class would also have a method getInstance(Filesystem, Configuration), which returns a subclass of FSUtils dependent on scheme, using ReflectionUtils.newInstance
        • If this ever becomes a performance hotspot we could add a cache here, but for rare operations like recoverLease it shouldn't matter
        Show
        Todd Lipcon added a comment - This patch is a little strange - why use reflection here to access the static method in FSHDFSUtils instead of just creating a new interface or abstract class, and using non-static methods? I think a more conventional approach would be: Create an abstract class FSUtils, which declares either default implementations or abstract methods for all of the things that different file systems would want to override This class would also have a method getInstance(Filesystem, Configuration), which returns a subclass of FSUtils dependent on scheme, using ReflectionUtils.newInstance If this ever becomes a performance hotspot we could add a cache here, but for rare operations like recoverLease it shouldn't matter
        Lohit Vijayarenu made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Lohit Vijayarenu added a comment -

        Attached is updated patch incorporating Todd's comments. I ran tests and do not see any new tests fail.

        Show
        Lohit Vijayarenu added a comment - Attached is updated patch incorporating Todd's comments. I ran tests and do not see any new tests fail.
        Lohit Vijayarenu made changes -
        Attachment HBASE-4196.3.patch [ 12489743 ]
        Lohit Vijayarenu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Lohit Vijayarenu added a comment -

        I made little modification to previous patch. Included NonHDFS implementation of FSUtils which does chmod. If this patch is included, while running on non hdfs, users could just change their config to point to the NonHDFS class.

        Show
        Lohit Vijayarenu added a comment - I made little modification to previous patch. Included NonHDFS implementation of FSUtils which does chmod. If this patch is included, while running on non hdfs, users could just change their config to point to the NonHDFS class.
        Lohit Vijayarenu made changes -
        Attachment HBASE-4196.3.v2.patch [ 12489752 ]
        Hide
        Todd Lipcon added a comment -

        The non-HDFS one doesn't make much sense to me... I tried the chmod trick on our NFS filer here and it doesn't work – opened a file from one node for write, and was able to continue to write/fsync even after I chmodded on a different node. It might be that there are some specific file systems where this works, but it's far from the norm in my experience.

        Show
        Todd Lipcon added a comment - The non-HDFS one doesn't make much sense to me... I tried the chmod trick on our NFS filer here and it doesn't work – opened a file from one node for write, and was able to continue to write/fsync even after I chmodded on a different node. It might be that there are some specific file systems where this works, but it's far from the norm in my experience.
        Hide
        Lohit Vijayarenu added a comment -

        @Todd Yes that is the reason it is optional. Having with distribution would help users use vanila HBase on non HDFS without having them include file.

        Show
        Lohit Vijayarenu added a comment - @Todd Yes that is the reason it is optional. Having with distribution would help users use vanila HBase on non HDFS without having them include file.
        Hide
        stack added a comment -

        Here is what I committed to trunk. Its Lohit's patch some doc including note that the nonhdfs util class is just a generic implementation that may not work (doesn't work on nfs for example as per todd). Thanks for the patch Lohit.

        Show
        stack added a comment - Here is what I committed to trunk. Its Lohit's patch some doc including note that the nonhdfs util class is just a generic implementation that may not work (doesn't work on nfs for example as per todd). Thanks for the patch Lohit.
        stack made changes -
        Attachment 4169-v4.txt [ 12489772 ]
        Hide
        stack added a comment -

        Assigning Lohit (Added you as contributor Lohit)

        Show
        stack added a comment - Assigning Lohit (Added you as contributor Lohit)
        stack made changes -
        Assignee Lohit Vijayarenu [ lohit ]
        Hide
        Todd Lipcon added a comment -

        I am still -1 on having this "generic interface" since it works on only one filesystem I've ever heard of. It certainly doesn't work on local filesystem either. If we're going to implement something for mapr, we should just call it Mapr.

        Show
        Todd Lipcon added a comment - I am still -1 on having this "generic interface" since it works on only one filesystem I've ever heard of. It certainly doesn't work on local filesystem either. If we're going to implement something for mapr, we should just call it Mapr.
        Hide
        Andrew Purtell added a comment -

        I agree with Todd. "NonHDFS" is not the right name for something that only works on MapR. Call it "MapRFS".

        Show
        Andrew Purtell added a comment - I agree with Todd. "NonHDFS" is not the right name for something that only works on MapR. Call it "MapRFS".
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #2094 (See https://builds.apache.org/job/HBase-TRUNK/2094/)
        HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems

        stack :
        Files :

        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSNonHDFSUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2094 (See https://builds.apache.org/job/HBase-TRUNK/2094/ ) HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems stack : Files : /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSNonHDFSUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Hide
        stack added a comment -

        Reverting to fix class name (please base new patch on mine Lohit)

        Show
        stack added a comment - Reverting to fix class name (please base new patch on mine Lohit)
        Hide
        Lohit Vijayarenu added a comment -

        Attached is patch based on Stack's changes (4169-v4.txt). Changed name of NonHDFS to MapR

        Show
        Lohit Vijayarenu added a comment - Attached is patch based on Stack's changes (4169-v4.txt). Changed name of NonHDFS to MapR
        Lohit Vijayarenu made changes -
        Attachment 4169-v5.txt [ 12489782 ]
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #2095 (See https://builds.apache.org/job/HBase-TRUNK/2095/)
        HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems – reverted until we address issues raised around class name

        stack :
        Files :

        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSNonHDFSUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2095 (See https://builds.apache.org/job/HBase-TRUNK/2095/ ) HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems – reverted until we address issues raised around class name stack : Files : /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSNonHDFSUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Hide
        stack added a comment -

        @Lohit Does the MapR class need to be in hbase? Could it be in jars of yours rather than in hbase core? If it has to be in hbase core, no problem, will commit as is.

        Show
        stack added a comment - @Lohit Does the MapR class need to be in hbase? Could it be in jars of yours rather than in hbase core? If it has to be in hbase core, no problem, will commit as is.
        Hide
        Lohit Vijayarenu added a comment -

        @stack if the class is in hbase core the users could run different versions of hbase without upgrading MapR.

        Show
        Lohit Vijayarenu added a comment - @stack if the class is in hbase core the users could run different versions of hbase without upgrading MapR.
        Hide
        stack added a comment -

        Committed to TRUNK. Thanks for the patch Lohit.

        Show
        stack added a comment - Committed to TRUNK. Thanks for the patch Lohit.
        stack made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.92.0 [ 12314223 ]
        Resolution Fixed [ 1 ]
        Hide
        Lohit Vijayarenu added a comment -

        Thanks everyone for the help

        Show
        Lohit Vijayarenu added a comment - Thanks everyone for the help
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #2097 (See https://builds.apache.org/job/HBase-TRUNK/2097/)
        HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems

        stack :
        Files :

        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSMapRUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #2097 (See https://builds.apache.org/job/HBase-TRUNK/2097/ ) HBASE-4169 FSUtils LeaseRecovery for non HDFS FileSystems stack : Files : /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSMapRUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
        Hide
        Lohit Vijayarenu added a comment -

        Sorry guys. In hurry I ended up adding wrong FSMapRUtils. I will update patch soon against trunk. Please could we include this.

        Show
        Lohit Vijayarenu added a comment - Sorry guys. In hurry I ended up adding wrong FSMapRUtils. I will update patch soon against trunk. Please could we include this.
        Lohit Vijayarenu made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Hide
        Lohit Vijayarenu added a comment -

        Please could we patch this. I have tested ths internally.

        Show
        Lohit Vijayarenu added a comment - Please could we patch this. I have tested ths internally.
        Lohit Vijayarenu made changes -
        Attachment 4169-correction.txt [ 12489818 ]
        Hide
        stack added a comment -

        I applied 4169-correction.txt. Thanks Lohit (You might want to make a doc patch to go along w/ this change Lohit to explain whats needed to make hbase run on mapr).

        Show
        stack added a comment - I applied 4169-correction.txt. Thanks Lohit (You might want to make a doc patch to go along w/ this change Lohit to explain whats needed to make hbase run on mapr).
        Hide
        stack added a comment -

        I applied the correction. Closing. Let me know if I have it wrong Lohit.

        Show
        stack added a comment - I applied the correction. Closing. Let me know if I have it wrong Lohit.
        stack made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee:
            Lohit Vijayarenu
            Reporter:
            Lohit Vijayarenu
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development