Hadoop HDFS
HDFS-1599

Umbrella JIRA for improving HBase support in HDFS

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Umbrella Jira for improved HBase support in HDFS

        Issue Links

          Activity

          Jonathan Hsieh added a comment -

          #9 is being addressed in HBASE-5680 – short story is that a recompile is required when running HBase against HDFS in 0.23.

          Tsz Wo Nicholas Sze added a comment -

          Uma, thanks for listing them out. I have created HDFS-3184 for adding new HDFS client APIs.

          Tsz Wo Nicholas Sze added a comment -

          > Most of the reflection in HBase has to do with version compatibility, not accessing private APIs. Adding a new API on HDFS doesn't solve the problem, really, since the whole reason for the reflection is to compile against old versions which don't have the new APIs

          It does not solve the problem today but it will solve the problem in the future.

          Jonathan Hsieh added a comment -

          I looked into the history of #9 (HDFS-2412, HDFS-1620). It was suggested that enums are essentially final classes, so we can't shim a SafeModeAction enum into FSConstants via subclassing.

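          Since Java enums cannot be subclassed, one way around the FSConstants/HdfsConstants rename is to resolve the constant by name at runtime instead of referencing either class at compile time. A minimal sketch of that idea (illustrative only, not the actual HBASE-5680 patch; the helper name safeModeGetAction is made up):

              // Look up SAFEMODE_GET without a compile-time dependency on whether the
              // enum lives in HdfsConstants (0.23+) or FSConstants (earlier releases).
              static Object safeModeGetAction() throws ClassNotFoundException {
                String[] candidates = {
                    "org.apache.hadoop.hdfs.protocol.HdfsConstants$SafeModeAction",
                    "org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction" };
                for (String className : candidates) {
                  try {
                    Class<?> enumClass = Class.forName(className);
                    for (Object constant : enumClass.getEnumConstants()) {
                      if ("SAFEMODE_GET".equals(((Enum<?>) constant).name())) {
                        return constant;
                      }
                    }
                  } catch (ClassNotFoundException ignored) {
                    // this Hadoop version keeps the enum elsewhere; try the next name
                  }
                }
                throw new ClassNotFoundException("SafeModeAction not found in known locations");
              }

          The returned constant could then be passed, again reflectively, to DistributedFileSystem.setSafeMode(...).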
          Uma Maheswara Rao G added a comment -

          10) getFileLength from DFSInputStream

          Uma Maheswara Rao G added a comment -

          @Nicholas,

          Currently I can see the following places where HBase invokes HDFS APIs:

          1) Accessing the private CACHE field (the Cache inner class) of FileSystem

                   // 'field' below is FileSystem's private "clientFinalizer" Field,
                   // looked up earlier in the surrounding HBase code.
                   field.setAccessible(true);
                   Field cacheField = FileSystem.class.getDeclaredField("CACHE");
                   cacheField.setAccessible(true);
                   Object cacheInstance = cacheField.get(fs);
                   hdfsClientFinalizer = (Thread) field.get(cacheInstance);

          2) Invoking the getJar method from JarFinder

                Class<?> jarFinder = Class.forName("org.apache.hadoop.util.JarFinder");
                // hadoop-0.23 has a JarFinder class that will create the jar
                // if it doesn't exist.  Note that this is needed to run the mapreduce
                // unit tests post-0.23, because mapreduce v2 requires the relevant jars
                // to be in the mr cluster to do output, split, etc.  At unit test time,
                // the hbase jars do not exist, so we need to create some.  Note that we
                // can safely fall back to findContainingJars for pre-0.23 mapreduce.
                Method m = jarFinder.getMethod("getJar", Class.class);
              

          3) accessing getNumCurrentReplicas from DFSOutputStream (see the sketch after this list)
          4) accessing the createWriter method from SequenceFile.Writer
          5) accessing the syncFs method from SequenceFile.Writer
          6) the hflush APIs
          7) accessing the 'out' variable from FSDataOutputStream
          8) the recoverLease API from DistributedFileSystem
          9) using org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET
          HBase is currently broken against the 0.23 version because of this constant usage.
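          As an illustration of the kind of probing items 3 and 7 require (the sketch referenced in item 3): a hedged, made-up helper, not HBase code, that reaches the DFSOutputStream wrapped by an FSDataOutputStream and calls getNumCurrentReplicas reflectively, falling back when the running HDFS does not expose it.

              import java.io.OutputStream;
              import java.lang.reflect.Method;
              import org.apache.hadoop.fs.FSDataOutputStream;

              // currentReplicaCount and fallback are illustrative names.
              static int currentReplicaCount(FSDataOutputStream out, int fallback) {
                try {
                  // getWrappedStream() hands back the underlying DFSOutputStream.
                  OutputStream wrapped = out.getWrappedStream();
                  Method m = wrapped.getClass().getMethod("getNumCurrentReplicas");
                  m.setAccessible(true);
                  return ((Integer) m.invoke(wrapped)).intValue();
                } catch (Exception e) {
                  return fallback; // method absent or inaccessible on this HDFS version
                }
              }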

          Uma Maheswara Rao G added a comment -

          Yes, I agree that HBase implemented the reflection to support older versions.

          What I mean is: if HBase keeps using internal APIs and HDFS keeps changing them, HBase will end up adding one more if/else condition for every version. I feel HBase should really ask HDFS for what it needs, instead of silently adding if/else branches with reflection.

          At some point, if HDFS provides public APIs to HBase, say from version X, HBase can add one last if/else branch against that public API. Then there won't be any issues when migrating across Hadoop versions; X and later can easily support HBase.

          > What are the other methods that HBase requires?
          @Nicholas, let me put together the list of APIs HBase currently accesses via reflection.

          I just saw the comment proposing a fix for one issue in HBase, HBASE-5680. The proposed solution is reflection, but HdfsConstants is marked private, which means HDFS may change it again in the next version, and HBase would have to add yet another if/else. If we expose it through some public API instead, HDFS will not change it so easily, and HBase won't need to worry as much about migrating to newer versions.

          What do you say, Todd?

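          The "one last if/else on a public API" idea above can be illustrated with the hflush case (item 6 in the earlier list). A hedged sketch, assuming compilation against a 0.20-era Hadoop where FSDataOutputStream only has sync(); it is not HBase's actual code:

              import java.io.IOException;
              import java.lang.reflect.Method;
              import org.apache.hadoop.fs.FSDataOutputStream;

              // Prefer the public hflush() where the running Hadoop provides it (0.21+),
              // otherwise fall back to the older sync().
              static void flushStream(FSDataOutputStream out) throws IOException {
                try {
                  Method hflush = FSDataOutputStream.class.getMethod("hflush");
                  hflush.invoke(out);
                } catch (NoSuchMethodException e) {
                  out.sync();               // pre-hflush releases
                } catch (Exception e) {
                  throw new IOException(e);
                }
              }

          Once a stable public API exists from some version X, this reflective branch becomes the last one HBase needs; builds against X and later can call hflush() directly.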
          Todd Lipcon added a comment -

          Most of the reflection in HBase has to do with version compatibility, not accessing private APIs. Adding a new API on HDFS doesn't solve the problem, really, since the whole reason for the reflection is to compile against old versions which don't have the new APIs

          Tsz Wo Nicholas Sze added a comment -

          Uma, you are right that we should have some public API for HBase and other projects.

          What are the other methods that HBase requires?

          Uma Maheswara Rao G added a comment -

          Another point I wanted to mention: presently HBase uses a lot of reflection-based invocations to call HDFS methods (which are not exposed).

          Ex:

              Field fIn = FilterInputStream.class.getDeclaredField("in");
              fIn.setAccessible(true);
              Object realIn = fIn.get(this.in);
              // In hadoop 0.22, DFSInputStream is a standalone class.  Before this,
              // it was an inner class of DFSClient.
              if (realIn.getClass().getName().endsWith("DFSInputStream")) {
                Method getFileLength = realIn.getClass().
                    getDeclaredMethod("getFileLength", new Class<?>[] {});
                getFileLength.setAccessible(true);
                long realLength = ((Long) getFileLength.
                    invoke(realIn, new Object[] {})).longValue();
                // ... (rest of the original HBase method omitted in this excerpt)

          What is the plan for exposing what dependent components actually use through some kind of special interfaces?

          I can see a lot of code in HBase filled with reflection-based invocations.
          That will make version migrations much harder later: since these are internal APIs, HDFS may change them very easily, while HBase depends on them tightly. In such cases HBase will face a lot of difficulties migrating to newer versions.

          Let's start brainstorming on this issue here.

          stack added a comment -

          Thanks for filing this one Sanjay. Here's a bit of input if it'll help.

          HDFS-918 is an attempt at moving the datanode away from (2?) threads per open file – which is just a killer for HBase loadings (Mozilla had datanodes with 8k-plus threads running in them because they had about 1k regions up on each node of their cluster of 20-odd nodes). HBase keeps all its files open to save a trip to the Namenode inline with a random read. The patch that has been posted has been through many iterations, currently does the read path only (the important one as far as HBase is concerned), seems to work in basic testing done by me and others, and holds lots of promise (or, let's just rewrite the datanode – smile). The patch is pretty big; Todd is suggesting we get it in in smaller pieces, but there is also an argument for dropping the big patch in (related: HDFS-223, HDFS-285, HDFS-374, which I think can now be closed).

          Next up would be some kinda keepalive on pread. At the moment, we'll set up the socket on each pread (hbase uses pread doing random lookups) EVEN though we are seeking the same block as just read from (See HDFS-380). Chatting w/ some of the lads, fixing this – HDFS-941 – is probably the least intrusive of the issues attached but it'll get us a pretty nice improvement.

          HDFS-347 is radical, but in hacked-up prototypes it has already been demo'd that it can make for a massive improvement in both latency AND CPU use. (Nathan asked in a chat on Thursday why this makes for such a big win: what is the network version doing that causes such a slowdown? I think Dhruba makes the same comment inline in the issue, IIRC.)

          HDFS-1034 looks good.

          HDFS-236 looks like an effort worth reviving.

          That's enough for now.

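          A small illustration of the pread path described above (assuming an already-configured FileSystem `fs`; the path, offset and buffer size are made up): the positional read does not move the stream's seek pointer, which is why HBase can keep one stream open per HFile and issue independent random reads against it.

              // Classes are from org.apache.hadoop.fs; names and numbers are illustrative.
              FSDataInputStream in = fs.open(new Path("/hbase/table/region/cf/hfile"));
              byte[] block = new byte[64 * 1024];
              long blockOffset = 12345678L;                          // offset of the block to fetch
              int n = in.read(blockOffset, block, 0, block.length);  // pread: stream position unchanged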

            People

            • Assignee: Unassigned
            • Reporter: Sanjay Radia
            • Votes: 0
            • Watchers: 42
