HBase / HBASE-8778

Region assignments scan table directory making them slow for huge tables

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.2
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Table descriptors are now moved inside hdfs from residing directly in the table directory (alongside region directories) to being in a well known subdirectory called ".tabledesc". For example, instead of /hbase/exampleTable/.tableinfo.0000000003 the file would be /hbase/exampleTable/.tabledesc/.tableinfo.0000000003 after this release. The same will be true for snapshots. The first active master to be started up will move these files for existing tables and snapshots.
    • Tags:
      0.96notable

      Description

      On a table with 130k regions it takes about 3 seconds for a region server to open a region once it has been assigned.

      Watching the threads for a region server running 0.94.5 that is opening many such regions shows the thread opening the region in code like this:

      "PRI IPC Server handler 4 on 60020" daemon prio=10 tid=0x00002aaac07e9000 nid=0x6566 runnable [0x000000004c46d000]
         java.lang.Thread.State: RUNNABLE
              at java.lang.String.indexOf(String.java:1521)
              at java.net.URI$Parser.scan(URI.java:2912)
              at java.net.URI$Parser.parse(URI.java:3004)
              at java.net.URI.<init>(URI.java:736)
              at org.apache.hadoop.fs.Path.initialize(Path.java:145)
              at org.apache.hadoop.fs.Path.<init>(Path.java:126)
              at org.apache.hadoop.fs.Path.<init>(Path.java:50)
              at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
              at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
              at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
              at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
              at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
              at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
              at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
              at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
              at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
              at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
              at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
              at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
              at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
              at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
              at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
      

      To open the region, the region server first loads the latest HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file system at "/hbase/<tableDir>/.tableinfo.<sequenceNum>". The file with the largest sequenceNum is the current descriptor. This is done so that the current descriptor is updated atomically. However, since the filename is not known in advance, FSTableDescriptors has to do a FileSystem.listStatus operation, which lists all files in the directory to find it. The directory also contains all the region directories, so in our case it has to load 130k FileStatus objects. Even using a globStatus matching function still transfers all the objects to the client before performing the pattern matching. Furthermore, HDFS defaults to transferring 1000 directory entries in each RPC call, so it requires 130 roundtrips to the namenode to fetch all the directory entries.
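      The selection logic itself is trivial; all of the cost is in producing the listing. A minimal sketch of the "largest sequenceNum wins" rule, using plain Java collections in place of the HDFS FileStatus API (names here are illustrative, not the actual FSTableDescriptors code):

      ```java
      import java.util.Comparator;
      import java.util.List;
      import java.util.Optional;

      class TableInfoScan {
          static final String PREFIX = ".tableinfo.";

          // Pick the current descriptor: the ".tableinfo.<seq>" entry with the
          // largest sequence number. The expensive part in practice is producing
          // `entries`, which requires listing the entire table directory --
          // including every region directory.
          static Optional<String> currentTableInfo(List<String> entries) {
              return entries.stream()
                      .filter(name -> name.startsWith(PREFIX))
                      .max(Comparator.comparingLong((String name) ->
                              Long.parseLong(name.substring(PREFIX.length()))));
          }
      }
      ```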

      Consequently, to reassign all the regions of a table (or a constant fraction thereof) requires time proportional to the square of the number of regions.

      In our case, if a region server fails with 200 such regions, it takes 10+ minutes for them all to be reassigned, after the zk expiration and log splitting.
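      To put rough numbers on that, here is a hypothetical back-of-envelope helper (the 1000-entries-per-RPC figure is the HDFS default mentioned above; the helper itself is illustrative, not HBase code):

      ```java
      class ReassignCost {
          // HDFS default: directory entries returned per listing RPC.
          static final int ENTRIES_PER_RPC = 1000;

          // Each region open lists the whole table directory, so reassigning
          // `regionsToOpen` regions of a table with `totalRegions` regions costs
          // roughly regionsToOpen * ceil(totalRegions / 1000) namenode roundtrips.
          static long listingRoundtrips(long regionsToOpen, long totalRegions) {
              long perOpen = (totalRegions + ENTRIES_PER_RPC - 1) / ENTRIES_PER_RPC;
              return regionsToOpen * perOpen;
          }
      }
      ```

      For the failure scenario described here (200 regions on a table of 130k), that works out to 200 × 130 = 26,000 listing roundtrips before any region data is even touched.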

      1. 8778-dirmodtime.txt
        1 kB
        Lars Hofhansl
      2. HBASE-8778.patch
        80 kB
        Dave Latham
      3. HBASE-8778-0.94.5.patch
        77 kB
        Dave Latham
      4. HBASE-8778-0.94.5-v2.patch
        77 kB
        Dave Latham
      5. HBASE-8778-v2.patch
        81 kB
        Dave Latham
      6. HBASE-8778-v3.patch
        81 kB
        Dave Latham
      7. HBASE-8778-v4.patch
        81 kB
        Dave Latham
      8. HBASE-8778-v5.patch
        82 kB
        Dave Latham

        Issue Links

          Activity

          davelatham Dave Latham added a comment -

          One solution is to instead keep the table descriptor files in a subdirectory of the table directory, so that only that subdirectory needs a scan. The attached patch is one for 0.94.5 that implements this scheme. In order to be applicable in a rolling restart scenario, the new descriptor is written to both the table directory and the subdirectory. Readers first read the subdirectory, then fall back to the table directory. To be robust against failures or races, a lock file is used in the subdirectory during writes. The patch also refactors the FSTableDescriptors class to require a Configuration (to determine lock wait duration), updates it to more uniformly enforce the fsreadonly flag (RegionServers never do writes), and sticks with instance methods rather than static methods. We are proceeding with this and hope to roll it out to our cluster. Once the writers (the HBase Master and tools like hbck, merge, and compact) are upgraded to this patch, old writers should no longer be used.
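          The dual-write/fallback-read scheme can be sketched as follows, modeling the filesystem as a map of path to bytes (the class and method names are hypothetical, and the lock-file handling is omitted for brevity):

          ```java
          import java.util.HashMap;
          import java.util.Map;

          class DualLocationDescriptor {
              // path -> file contents; stands in for the HDFS namespace
              final Map<String, byte[]> fs = new HashMap<>();

              // Writers place the descriptor in both the new ".tabledesc"
              // subdirectory and the old table-directory location, so
              // not-yet-upgraded readers keep working during a rolling restart.
              void write(String tableDir, String fileName, byte[] bytes) {
                  fs.put(tableDir + "/.tabledesc/" + fileName, bytes); // new location
                  fs.put(tableDir + "/" + fileName, bytes);            // old location
              }

              // Readers prefer the subdirectory (a cheap, small listing) and
              // fall back to the old location for tables written by old writers.
              byte[] read(String tableDir, String fileName) {
                  byte[] preferred = fs.get(tableDir + "/.tabledesc/" + fileName);
                  return preferred != null ? preferred : fs.get(tableDir + "/" + fileName);
              }
          }
          ```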

          I would love to hear the opinion of the HBase community regarding this issue. Some questions:

          • Is it worth fixing? (I strongly believe so, as it has a big impact on MTTR for large clusters.)
          • What's the best approach to fixing it? Some other possibilities:
            • Using a lock file and a well-known table descriptor filename rather than sequence ids
            • Relying on more descriptor caching rather than hitting hdfs on every region assignment (as bulk assignment already does)
            • Moving table descriptors to a different location in hdfs (a single location for all tables?)
            • Moving table descriptors out of hdfs to ZK
          • How and when can we migrate to that approach?
            • For the patch above: once the cluster has been upgraded and each descriptor has a copy in the subdirectory, it would be easy to have the next version use only those files.
            • Alternatively, for the singularity there could be a one-time piece of migration code that just moves the files there.
          davelatham Dave Latham added a comment -

          Here's an updated patch with a bit of cleanup. We've begun rolling this out to one of our production clusters, and it is showing about a 5x speedup in assignments during the rolling restart.

          sershe Sergey Shelukhin added a comment -

          Can you please post on RB (https://reviews.apache.org/r/)? The patch is relatively large

          sershe Sergey Shelukhin added a comment -

          Actually, on an entirely unrelated note, it would be interesting to learn how you run with so many regions (200+ memstores per server?). Are you coming to Hadoop Summit? Was there an HBaseCon talk I missed?

          ianfriedman Ian Friedman added a comment -

          Actually Dave was on the panel at the HBase Operations session at HBaseCon, so if you went to that you might have heard about it. Also FYI looks like we have something like 258 regions per server nowadays.

          davelatham Dave Latham added a comment -

          Sergey Shelukhin Thanks for taking a look. I haven't used Review Board before. I registered and tried to post the patch there, but it gives me a 500 error when trying to post to the "hbase" repository. I also didn't see anywhere to choose which branch the patch is against, so perhaps it is because this patch is not for trunk. Feel free to post it up if you know how, or point me in the right direction.

          I'm more interested at first in what people think is the best general strategy for dealing with this. If some consensus is reached on that, I can put forward reasonable effort to push it through.

          [OT] I'll probably be at the Hadoop summit on Thursday, or feel free to email me directly.

          enis Enis Soztutar added a comment -

          it gives me a 500 error when trying to post to the "hbase" repository

          You can use the hbase-git repository instead of the hbase repository.

          I'm more interested at first for what people think is the best general strategy for dealing with this

          Putting tabledesc in a subdir only solves the problem for region opening. But other tools, which do an ls on the directory, will still run into this.
          The namespaces issue changes the directory layout on hdfs, so maybe we should rethink how we are organizing region dirs as well. Datanodes have a multi-level hierarchical directory layout for managing the block files. We could implement something like that.

          sershe Sergey Shelukhin added a comment -

          use hbase-git one

          As for the general approach, it seems like an HDFS issue should be raised to be able to do scanning more efficiently in this scenario (doing the pattern matching on the server). The subdirectory approach is a reasonable workaround.

          davelatham Dave Latham added a comment -

          use hbase-git one

          I had tried the hbase-git one too, but it complains that the files are not in the repo, for example "The file 'src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java' (rec0bd62) could not be found in the repository". I assume this is because there is no option to specify a branch and the code was reorganized in master into modules.

          If we wanted to go with this direction, I will do the work to port changes up to trunk and 0.95, but I'd prefer to avoid that if another direction is preferred.

          davelatham Dave Latham added a comment -

          Putting tabledesc in a subdir only solves the problem for region opening. But other tools, which does an ls on directory will still run into this.

          It would solve it for any tools loading the table descriptor information. It would not solve it for any tools that want to scan part of the table directory, but not the full directory. However, I'm not aware of any other cases where you want only some data from the table directory but don't know its name. Either you want a single region, or all the regions, in which case even if they are organized hierarchically you are still reading all of them.

          enis Enis Soztutar added a comment -

          I think it is not just doing a (partial) list on the directory for region names; to open any file inside a region, you have to locate the regionDir's inode from its parent directory. If the parent contains thousands of entries, then this search becomes slower and slower. The NN keeps inodes in memory, but that can change in the future.

          davelatham Dave Latham added a comment -

          So to open a region file you need to locate the inode for the regionDir which means a binary search of the parent's list. If we make that list into a hierarchical tree that still requires log(N) steps, right? But now some of the steps would require round trips between dfs client and namenode. I have a feeling that I'm missing something here.
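          The trade-off being weighed here can be put in rough numbers with a hypothetical helper (illustrative only; the actual namenode internals are more involved):

          ```java
          class LookupDepth {
              // A flat directory of n entries costs about ceil(log2 n) in-memory
              // comparisons on the namenode (its child list is binary-searched).
              static int flatComparisons(long n) {
                  return 64 - Long.numberOfLeadingZeros(n - 1);
              }

              // A hierarchical layout with the given fan-out needs this many
              // directory levels to hold n entries -- and each level may cost an
              // extra client/namenode roundtrip when resolved step by step.
              static int hierarchicalLevels(long n, int fanout) {
                  int levels = 0;
                  long capacity = 1;
                  while (capacity < n) {
                      capacity *= fanout;
                      levels++;
                  }
                  return levels;
              }
          }
          ```

          For 130k entries that is roughly 17 in-memory comparisons flat, versus 3 levels at a fan-out of 256, which illustrates the question above: the hierarchy saves little unless the per-level cost stays on the namenode.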

          davelatham Dave Latham added a comment -

          As for the general approach, it seems like HDFS issue should be raised to be able to do scanning more efficiently in this scenario (doing pattern matching on the server).

          That's an approach I hadn't considered. However, it seems we'd still be pushing a lot of work on to a single NameNode as it would still need to apply the pattern to every inode or else be very intelligent about pattern processing.

          enis Enis Soztutar added a comment -

          So to open a region file you need to locate the inode for the regionDir which means a binary search of the parent's list.

          Yes, currently NN does a binary search for this, but I am not sure we should rely on that. Having 100K entries in a single directory does not seem right.

          davelatham Dave Latham added a comment -

          Yes, currently NN does a binary search for this, but I am not sure we should rely on that. Having 100K entries in a single directory does not seem right.

          I can't imagine HDFS would revert looking up an inode to linear or anything slower than log time. That would be a major regression. Still, if you think having all the region directories in the same table directory is just not right, let's open another issue and discuss the merits there. For this one, do you propose waiting for that change, and in that case leaving the descriptor files to sit in the top-level table dir alongside the first level of region subdirectories? Even with a hierarchy, it would seem that descriptor files would be better off in their own table metadata directory.

          davelatham Dave Latham added a comment -

          Sergey Shelukhin Since I haven't been able to make Review Board cooperate, if you want to view the patch in a more friendly format I made it visible on GitHub too:
          https://github.com/ddlatham/hbase/pull/1/files

          enis Enis Soztutar added a comment -

          Thinking again about this, maybe we should go with the table of tables approach instead of keeping tableinfos on hdfs. That has been proposed in some other jira. With namespaces (HBASE-8015), we will have a table for namespaces. We already have a table for regions (ROOT and META). Having a table table makes sense to me.

          davelatham Dave Latham added a comment -

          A table descriptor table is an interesting approach I had not thought of. I like that it pulls the NameNode out of the path and can take advantage of atomic row operations for modifications. A couple questions come to my mind:

          • Where does this table's descriptor information get stored?
          • What does the bootstrap look like? Need to load the META table descriptor from this table to open the META table but need to look up this table's location in META to lookup the META table descriptor?
          • What happens if this table is corrupted somehow? Currently, hbck can recreate META entries if they are lost, but is there any other place to recover table descriptor information?
          enis Enis Soztutar added a comment -

          Where does this table's descriptor information get stored?

          I don't know of any design plan for this yet. But I imagine we can store them in serialized form in some column. Table name becomes the row key, and info:tableInfo column contains serialized table info. This can mimic the META structure.
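          That row layout could look something like the following sketch, where nested maps stand in for the rows and columns of an HBase table (purely illustrative; no design exists yet, as noted above):

          ```java
          import java.util.HashMap;
          import java.util.Map;

          class TableTableSketch {
              // row key (table name) -> (column -> value), mimicking META's layout
              final Map<String, Map<String, byte[]>> rows = new HashMap<>();

              // The serialized table descriptor lives in an info:tableInfo column.
              void putDescriptor(String tableName, byte[] serializedTableInfo) {
                  rows.computeIfAbsent(tableName, k -> new HashMap<>())
                      .put("info:tableInfo", serializedTableInfo);
              }

              byte[] getDescriptor(String tableName) {
                  Map<String, byte[]> row = rows.get(tableName);
                  return row == null ? null : row.get("info:tableInfo");
              }
          }
          ```

          Reading a descriptor then becomes a single point Get by table name rather than a directory scan, which is the appeal of this approach.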

          What does the bootstrap look like?

          META's HTD is already hardcoded in the source code. We can do the same for TABLE table, just hardcoding the HTD in source code. Make sure that it is assigned first before accepting any changes to table metadata. I guess we have to assign META first, then TABLE table, then user regions.

          What happens if this table is corrupted somehow? Currently, hbck can recreate META entries if they are lost, but is there any other place to recover table descriptor information?

          We keep regionInfos serialized in hdfs and in META. HBCK can repair the META regionInfos from hdfs. Maybe we can do the same here for tableinfos as well. We can keep the current logic in hdfs tableInfos, but serialize them in the TABLE table as well. I think having multiple sources of truth hurts us, though. Ideally, we should not need to keep this data in multiple places.

          davelatham Dave Latham added a comment -

          I would really like to see a resolution of this for 0.96 so that 0.96 can handle huge tables with reasonable performance. For 0.94 I'm content to rely on the patch posted here. Other users can grab it if they find it helpful.

          I'm still most comfortable with simply moving the table descriptor files to a well known subdirectory of the tabledir in HDFS. If done as part of the 0.96 migration then it can be a simple change without concern for supporting readers or writers using the old location and new location simultaneously. I'd like to make the trunk patch to do this migration and operate in the new location.

          I like Enis's proposal of using a system table (HBASE-7999) for table descriptors as well, but am not as comfortable trying to pull that off correctly in the short time frame as we're looking toward a 0.96 release. That presents a couple of possibilities. One, we can first change the table dir location in hdfs but create a new JIRA to work on transitioning it into a system table. Or, if someone who is more comfortable with that work wants to dive in and work on it together with me now, and thinks we can get it done for 0.96, then I'll give it a shot with them. Does this sound reasonable? If I hear no objections, I'll work on a trunk patch later this week or next week.

          enis Enis Soztutar added a comment -

          One, we can first change the table dir location in hdfs but create a new JIRA to work on transitioning it into a system table.

          Agreed, we can do that as a follow up, and possibly after 0.96 comes out.
          There is already some migration and re-shuffling going on for hdfs directories, so please take a look at HBASE-8015 to see whether you can piggy back on that.

          lhofhansl Lars Hofhansl added a comment - - edited

          Skimmed the patch. Looks OK. Will take a closer look soon.

          +1 on doing the known-subdir approach in 0.94 and a tables-table in 0.96+.

          Could you run the full test suite with the patch attached? Have you been using this patch in any production environment, yet?

          ianfriedman Ian Friedman added a comment -

          Yes Lars, we have been running this patch in both of our production clusters for a few weeks now and it's been performing admirably.

          lhofhansl Lars Hofhansl added a comment -

          Also, with 130k regions @ 10GB each, do you have a 1.3PB table?!

          ianfriedman Ian Friedman added a comment -

          No, the regions are not 10GB each.

          ianfriedman Ian Friedman added a comment -

          The 130k+ region table in question is more like 159 TB

          lhofhansl Lars Hofhansl added a comment -

          Shame, I would have liked to see that.

          As for the patch. Four questions:

          1. Is there a minimal version of the patch? It seems to include some extra clean up.
          2. There are some extra exceptions thrown now, that might throw off existing code. Could do this in 0.96+.
          3. Why do we need the lock file approach now? Could we use the previous logic of creating files with unique names instead? Might also need to delete old tableinfo first, etc.
          4. Are the pom changes needed?
          lhofhansl Lars Hofhansl added a comment -

          159TB is nothing to sneeze at either.

          davelatham Dave Latham added a comment -

          Thanks, Lars Hofhansl, for taking a look.

          +1 on doing the known-subdir approach in 0.94 and a tables-table in 0.96+.

          Actually what I was proposing in my comment on 17/Jul/13 was to not commit for 0.94, commit a known-subdir patch for 0.96, and open a new tables-table JIRA for 0.96+. A couple of reasons behind that. First - the patch is not purely rolling compatible. Break the code down into writers (Master, tools like hbck, merge, compact) and readers (everything). If one writer is updated and writes in the new way, and then an old writer does an update in the old location only, new readers will miss that update. So the requirement is to make no writes from old writers once you upgrade to a new writer - I'm not sure if we can/should make that a requirement for a rolling upgrade to 0.94, and I'm not sure if there's a way around it. Also, as you noted, the patch involves additional cleanup and refactoring. If you want to see it go into 0.94 I'm game to explore it further, but I'm also content to push for 0.96 and leave the 0.94 patch here for interested parties.

          Could you run the full test suite with the patch attached? Have you been using this patch in any production environment, yet?

          Yes, the tests pass for 0.94.5 + HBASE-8778-0.94.5-v2.patch. And as Ian noted we've seen great results in production.

          Is there a minimal version of the patch? It seems to include some extra clean up.

          Don't currently have a minimal version of the patch. Because there is now a wait for a lock, I introduced a Configuration to be able to adjust the lock wait time. Also, as I was putting this together, I noticed that some clients access static methods while others use instance methods without much reason, and that there is a fsreadonly field intended to prevent file system changes when set, but it is not (and cannot be) checked by the static methods; even some instance methods call those static methods, which then ignore the field, losing the guarantee. The patch changes everything to require an instance, correctly enforces the fsreadonly flag everywhere and can use the instance's Configuration. It also adds a great deal more commenting to make things more clear for the next maintainer.

          There are some extra exceptions thrown now, that might throw off existing code. Could do this in 0.96+.

          Examining them - FSTableDescriptors.add now throws NotImplementedException if in fsreadonly mode instead of silently failing to add the descriptor. It's not called by any current code that sets fsreadonly (RegionServer) but I believe if new code is added that does call it that failing loudly is better than failing silently. Likewise for updateHTableDescriptor, deleteTableDescriptorIfExists and createTableDescriptor. Those are the only cases of new exceptions that I can find. I'll grant that if there is third-party code (CPs?) that is calling these and is currently failing silently it is possible this change would break that. Though that code would have to update for the new method signatures anyway. Do we provide guarantees about compatibility on internal classes?

          Why do we need the lock file approach now? Could we use the previous logic of creating files with unique names instead? Might also need to delete old tableinfo first, etc.

          It was the only way I could find to make this work with a rolling upgrade (old readers and new readers simultaneously) and guarantee atomic updates in the presence of failures. If you can flesh out your approach a bit more and it works better, I'd love to lose the locks. If we proceed with 0.96 only then we can do the migration once and don't need the locks.

          Are the pom changes needed?

          Nope, sorry. That was just to get Eclipse to compile the thing.

          Also, with 130k regions @ 10GB each, do you have a 1.3PB table?!

          Yeah, definitely wish this table had fewer, larger regions. Been watching the online merge discussions. This is an old table from when the default region sizes were much smaller. We are looking, as we grow, at migrating to a newer table with fatter regions or merging this one down at some point. However, this issue will still be important even at that size (and the data keeps growing!)

          Thanks again for your thoughts.

          davelatham Dave Latham added a comment -

          There is already some migration and re-shuffling going on for hdfs directories, so please take a look at HBASE-8015 to see whether you can piggy back on that.

          Thanks, Enis Soztutar, for the tip. I read through the latest patch there and looked at the code up on GitHub. It looks to me like it should be fairly easy to combine the updates. HBASE-8015 moves the tabledir and uses a new FullyQualifiedTableName. The same changes can be done to the updated FSTableDescriptors to use FullyQualifiedTableName and call FSUtil as this patch doesn't care where the tabledir is as long as it can get to it.

          lhofhansl Lars Hofhansl added a comment -

          Thanks Dave Latham. Yeah, need to think through the upgrade issues in 0.94.
          It just puts a limit on HBase's scalability. But in any case anything more than 10,000 regions per cluster is probably an extreme case.
          Could make it a config option. That way one could rolling upgrade the cluster, then flip the option on, and roll the cluster again. Would have to think through the details.

          lhofhansl Lars Hofhansl added a comment -

          Looking at the code. If the modtime of the tabledescriptor is newer than the cached version, we do what Dave described twice! First getTableInfoModtime is called; if that determines that the cache is stale, getTableDescriptorModtime is called, which does the same work of stat'ing the dir all over again.

          I can think of a few ways to make this better:

          1. When the table descriptor is cached, check the mod time of the table directory first; if that mod time is <= the cached descriptor's mod time we're good and do not need to stat the table directory. In a high churn table that might not help much, though, as new region dirs are constantly added.
          2. Record the sequence number of a table descriptor when cached. Instead of checking mod time, we can check whether next highest sequence number exists. If so, we need to reload (but no need to check the mod time by stat'ing the dir). Can there be gaps in the sequence numbers?
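
          Option 1 above can be sketched as a small cache check: stat the table directory once, and skip the expensive directory enumeration when its mod time is not newer than the cached descriptor's. This is an illustrative sketch only - the names here (DescriptorCache, CachedDescriptor, loadFresh) are invented, and a real FSTableDescriptors would compare FileStatus modification times from HDFS rather than plain longs.

          ```java
          import java.util.HashMap;
          import java.util.Map;
          import java.util.function.Supplier;

          // Hypothetical sketch of the mod-time short-circuit: serve the cached
          // descriptor when the table directory's mod time is not newer than the
          // mod time recorded when the descriptor was cached.
          public class DescriptorCache {
              public static class CachedDescriptor {
                  public final String descriptor;
                  public final long modTime;  // mod time observed when cached
                  public CachedDescriptor(String d, long t) { descriptor = d; modTime = t; }
              }

              private final Map<String, CachedDescriptor> cache = new HashMap<>();

              // dirModTime would come from a single FileStatus call on the table dir;
              // loadFresh stands in for the expensive directory enumeration.
              public String get(String table, long dirModTime,
                                Supplier<CachedDescriptor> loadFresh) {
                  CachedDescriptor c = cache.get(table);
                  if (c != null && dirModTime <= c.modTime) {
                      return c.descriptor;  // directory unchanged: no listing needed
                  }
                  CachedDescriptor fresh = loadFresh.get();  // one enumeration, not two
                  cache.put(table, fresh);
                  return fresh.descriptor;
              }
          }
          ```

          As Lars notes, a high-churn table (frequent splits creating region dirs) would keep advancing the directory mod time and defeat the short-circuit.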
          davelatham Dave Latham added a comment -

          Could make it a config option. That way one could rolling upgrade the cluster, then flip the option on, and roll the cluster again. Would have to think through the details.

          Yeah I think it's still tricky to get the second roll correct - if you're supporting having some writers that write only to the old dir and some readers who are trying to read from the new location. I think you may be able to pull it off by two passes - first upgrade all the writers to write to both places but keep everything reading from the old location, then do a second rolling pass to move the readers to the new location. Then you could do a third pass to have writers only write to the new location.

          Looking at the code. If the modtime of the tabledescriptor has changed after the cached version, we do what Dave described twice! First getTableInfoModtime is called, if that determines that the cache was changed, getTableDescriptorModtime is called, which does the same work of stat'ing the dir all over again.

          Yes. I wasn't too concerned about this case as normally the descriptors are not changed often so this would only happen once per table.

          When the table descriptor is cached, check the mod time of the table directory first; if that mod time is <= the cached descriptor's mod time we're good and do not need to stat the table directory. In a high churn table that might not help much, though, as new region dirs are constantly added.

          That's an interesting approach I hadn't considered. I wasn't aware that HDFS maintained directory mod times. Sounds like a pretty simple change for 0.94 to me. In our case it would solve the issue for one of our huge tables that doesn't have much split activity and make a huge difference for the other that has splits every few minutes or so.

          Record the sequence number of a table descriptor when cached. Instead of checking mod time, we can check whether next highest sequence number exists. If so, we need to reload (but no need to check the mod time by stat'ing the dir). Can there be gaps in the sequence numbers?

          Don't think this approach would work as the table could be modified many times since the last check so the reader couldn't know which sequence number to check.

          lhofhansl Lars Hofhansl added a comment -

          Yeah, the 2nd roll would essentially still have the same problem. Maybe we can just generally make this configurable, then an admin can enable this at a convenient time (and we'd document in the release notes and hbase-default.xml that rolling upgrades are broken when this is enabled for the first time).

          Maybe we can do the directory mod-time check unconditionally and put a simplified patch behind a config option.

          lhofhansl Lars Hofhansl added a comment -

          Trivial patch that does the table dir modtime check before the table dir is enumerated.

          lhofhansl Lars Hofhansl added a comment -

          Let's get a test run with that.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12593576/8778-dirmodtime.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6424//console

          This message is automatically generated.

          davelatham Dave Latham added a comment -

          I began working on a trunk/0.96 patch that would move the table descriptor files to a known sub directory as well as take the refactoring, cleanup and documentation from the 0.94.5 patch above but adding a one time migration instead of the locking or rolling upgrade support. One issue I ran into is support for snapshots. The snapshot code calls into FSTableDescriptors to write a table descriptor file in the snapshot directory. How should this work when FSTableDescriptors is putting descriptors into a known subdir?

          Options I see:

          • Snapshots behave just like actual table directories and put their descriptors into a known subdir. During migration, snapshots are migrated to move their descriptor into their known subdir.
          • New snapshots put table descriptors into known subdir. Reading snapshots support finding the table descriptor in the subdir or the orig directory so no migration of snapshots are needed.
          • Snapshots continue to store table descriptors directly in the snapshot directory and FSTableDescriptors is refactored to share code to write sequenced descriptors in any directory.

          Thoughts? I'm leaning toward the last option.

          lhofhansl Lars Hofhansl added a comment -

          Either #1 or #3. If the layout is the same between table and snapshot, maybe #1 makes the most sense.

          What about the directory modtime check, should we just commit this to 0.94 (provided it solves your problem as you said)?

          davelatham Dave Latham added a comment -

          The directory modtime check looks good to me. For us we will probably stick with the patch I have above for awhile, but I think the modtime would help others.

          For snapshots I can't find a good reference for their actual hdfs layout, but from a brief look at the code I think it tries to mirror an actual table directory except for using hfile refs. For #1 I wonder if there are guarantees that would make sure we can find all snapshots during a migration.

          davelatham Dave Latham added a comment -

          Actually, one thought about the directory mod time. The descriptor files are normally written to a tmp dir and then renamed into place. In this case I'm thinking the directory mod time may end up greater than the cached descriptor mod time and the extra check wouldn't help. So may need to cache the directory mod time directly.
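
          This caveat can be seen on a local filesystem (not HDFS, but the rename semantics are similar): renaming the tableinfo file from the tmp dir into the table directory bumps the table directory's own mod time, so the directory will generally look newer than the descriptor file that was cached. A hedged demo sketch; DirModTimeDemo and the path names are invented for illustration.

          ```java
          import java.io.IOException;
          import java.nio.file.Files;
          import java.nio.file.Path;
          import java.nio.file.StandardCopyOption;

          // Demonstrates that renaming a tableinfo file into a directory updates
          // that directory's modification time, which is why comparing the dir
          // mtime against the cached descriptor's mtime can report a spurious
          // change. Caching the directory mtime itself avoids the false positive.
          public class DirModTimeDemo {
              public static long[] renameAndStat(Path tableDir, Path tmpDir)
                      throws IOException, InterruptedException {
                  Path tmpInfo = tmpDir.resolve(".tableinfo.0000000001");
                  Files.writeString(tmpInfo, "descriptor-v1");
                  long before = Files.getLastModifiedTime(tableDir).toMillis();
                  Thread.sleep(1500);  // outlast coarse (1s) mtime granularity
                  Files.move(tmpInfo, tableDir.resolve(".tableinfo.0000000001"),
                             StandardCopyOption.ATOMIC_MOVE);  // rename into place
                  long after = Files.getLastModifiedTime(tableDir).toMillis();
                  return new long[] { before, after };
              }
          }
          ```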

          lhofhansl Lars Hofhansl added a comment -

          Hmm... In that case, take the modtime of the temp dir? Getting a bit fragile.

          lhofhansl Lars Hofhansl added a comment -

          Or just cache the dir time, as you say.

          lhofhansl Lars Hofhansl added a comment -

          Of course the .tmp directory will also be newer, since the tableinfo file was removed from it. Well, it was a good try.

          davelatham Dave Latham added a comment -

          Attached HBASE-8778.patch which is a patch for trunk. Up at reviewboard at:
          https://reviews.apache.org/r/12920/

          This patch is similar to HBASE-8778-0.94.5-v2, but because it is for trunk only, it does not keep two copies (one in the tabledir, another in the .tabledesc subdir) - it keeps table info files only in the known subdir. As a result it also does not use any locking.

          The patch still includes some cleanup and refactoring of FSTableDescriptors to consistently enforce the fsreadonly flag and use more instance methods than static methods in order to do so.

          It also changes snapshots to likewise store their table descriptors in the .tabledesc subdirectory so they continue to share code and look just like table directories (option #1 from the previous comment.)

          It also adds a migration step to HMaster.finishInitialization which migrates existing snapshot directories, user tables, and system tables to store descriptors in the .tabledesc subdirectory.

          Going to run it by Hadoop QA and welcome review and comment.
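
          The layout change the patch makes can be sketched as follows. This is an illustrative stand-in, not the patched FSTableDescriptors code itself; the method names and string-based paths are hypothetical, but the before/after locations match the examples in this issue (`/hbase/exampleTable/.tableinfo.0000000003` moving to `/hbase/exampleTable/.tabledesc/.tableinfo.0000000003`):

          ```java
          // Illustrative sketch of the descriptor layout change: tableinfo files
          // move from the table directory itself into a ".tabledesc" subdirectory.
          public class TableInfoPaths {
              static final String TABLEINFO_DIR = ".tabledesc";

              // Pre-patch location: <tableDir>/.tableinfo.<sequenceid>
              static String oldTableInfoPath(String tableDir, String tableInfoFile) {
                  return tableDir + "/" + tableInfoFile;
              }

              // Post-patch location: <tableDir>/.tabledesc/.tableinfo.<sequenceid>
              static String newTableInfoPath(String tableDir, String tableInfoFile) {
                  return tableDir + "/" + TABLEINFO_DIR + "/" + tableInfoFile;
              }

              // The migration step described above amounts to moving each existing
              // descriptor from the old path to the new one, per table and snapshot.
          }
          ```

          Keeping snapshot directories on the same layout means the same path-building code serves both, which is why option #1 lets the two continue to share code.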

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12594033/HBASE-8778.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 30 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 3 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

          -1 lineLengths. The patch introduces lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.master.handler.TestTableDeleteFamilyHandler
          org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient
          org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence
          org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient
          org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient
          org.apache.hadoop.hbase.snapshot.TestRestoreFlushSnapshotFromClient
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6456//console

          This message is automatically generated.

          davelatham Dave Latham added a comment -

          Attaching HBASE-8778-v2.patch

          This patch addresses a couple comments from Ted on reviewboard (include license, added missing early return on migrating table info that doesn't need it) as well as the Hadoop QA notes (license, a few lines >100 characters, a few javadoc warnings, and fixed the test failures)

          davelatham Dave Latham added a comment -

          Resubmitting to give Hadoop QA another chance now that trunk was fixed.

          davelatham Dave Latham added a comment -

          Reattaching since changing issue status didn't seem to trigger Hadoop QA.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12594293/HBASE-8778-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 30 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6472//console

          This message is automatically generated.

          davelatham Dave Latham added a comment -

          Attaching HBASE-8778-v3.patch.

          Only change from v2 is fixing one javadoc warning. Don't know why Hadoop QA said that tests failed for v2 when all passed.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12594312/HBASE-8778-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 30 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6476//console

          This message is automatically generated.

          lhofhansl Lars Hofhansl added a comment -

          Thanks Dave. I'll have a look at RB tomorrow (sorry for the lack of response).

          lhofhansl Lars Hofhansl added a comment -

          Removing 0.94 tag; not likely that we'll fix it there.

          jmspaggi Jean-Marc Spaggiari added a comment -

          One thing about this patch. If the content or the directory itself get corrupted, is there a way to fix that? Is it possible to update fsck to check this directory/file content/format and fix it if it's corrupted?

          davelatham Dave Latham added a comment -

          Thanks Jean-Marc Spaggiari for taking a look.

          This change doesn't alter the ability to repair or recreate wrong or missing table descriptor information. hbck's fixOrphanTables still works with the new location.

          yuzhihong@gmail.com Ted Yu added a comment -

          Patch v3 needs slight rebasing:

          hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java.rej

          yuzhihong@gmail.com Ted Yu added a comment -

          It turns out that TestMetaMigrationConvertingToPB hung.

          In its test output, I saw:

          2013-07-30 18:19:26,137 ERROR [RS_OPEN_REGION-kiyo:57646-1] handler.OpenRegionHandler(475): Failed open of region=TestTable,row1,1344396119749.bdb336540cb3959f86178d3a83c94394., starting to roll back the global memstore size.
          java.lang.IllegalStateException: Could not instantiate a region instance.
                  at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3821)
                  at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4077)
                  at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4031)
                  at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3982)
                  at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:459)
                  at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:137)
                  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:130)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                  at java.lang.Thread.run(Thread.java:662)
          Caused by: java.lang.reflect.InvocationTargetException
                  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
                  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
                  at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
                  at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3818)
                  ... 9 more
          Caused by: java.lang.IllegalArgumentException: Need table descriptor
                  at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:458)
                  at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:434)
                  ... 14 more
          
          davelatham Dave Latham added a comment -

          HBASE-8778-v4.patch - rebased to the latest trunk.

          Ted, what is that result from? The line numbers don't match up for me from trunk and I haven't seen that failure.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12595022/HBASE-8778-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 30 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6524//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment -

          @Dave:
          Trunk changes very fast.
          By running TestMetaMigrationConvertingToPB, you should be able to find it in test output.

          yuzhihong@gmail.com Ted Yu added a comment -

          Patch v3 only had one hunk that needed to be resolved.
          I noticed the following in test output of TestMetaMigrationConvertingToPB which was responsible for the IllegalArgumentException mentioned above:

          2013-07-30 15:27:59,630 DEBUG [RpcServer.handler=0,port=51198] util.FSTableDescriptors(189): Exception during readTableDecriptor. Current table name = TestTable
          org.apache.hadoop.hbase.TableInfoMissingException: No table descriptor file under hdfs://localhost:51139/user/tyu/hbase/TestTable
          	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorAndModtime(FSTableDescriptors.java:506)
          	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorAndModtime(FSTableDescriptors.java:499)
          	at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:184)
          	at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:144)
          	at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:3450)
          	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:14390)
          

          This is due to FSTableDescriptors#getTableInfoPath() only checking the new tableinfo dir, which didn't exist in the tarball used by the test:

            private static FileStatus getTableInfoPath(FileSystem fs, Path tableDir, boolean removeOldFiles)
            throws IOException {
              Path tableInfoDir = new Path(tableDir, TABLEINFO_DIR);
              return getCurrentTableInfoStatus(fs, tableInfoDir, removeOldFiles);
          
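          A backward-compatible lookup of this shape (check the new ".tabledesc" subdirectory first, then fall back to the legacy location directly under the table directory) can be sketched with plain java.nio.file. This is an illustrative toy model, not the actual FSTableDescriptors/HDFS code; the class and method names here are made up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

public class TableInfoLookup {
  static final String TABLEINFO_DIR = ".tabledesc";
  static final String TABLEINFO_PREFIX = ".tableinfo.";

  // Newest ".tableinfo.<seq>" file in dir, if any. Lexicographic max works
  // because the sequence ids are zero-padded.
  static Optional<Path> currentTableInfo(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      return Optional.empty();
    }
    try (Stream<Path> files = Files.list(dir)) {
      return files
          .filter(p -> p.getFileName().toString().startsWith(TABLEINFO_PREFIX))
          .max(Comparator.comparing((Path p) -> p.getFileName().toString()));
    }
  }

  // Check the new-style subdirectory first; fall back to the legacy
  // location directly under the table directory (pre-migration layout).
  static Optional<Path> getTableInfoPath(Path tableDir) throws IOException {
    Optional<Path> fromSubdir = currentTableInfo(tableDir.resolve(TABLEINFO_DIR));
    return fromSubdir.isPresent() ? fromSubdir : currentTableInfo(tableDir);
  }

  public static void main(String[] args) throws IOException {
    Path tableDir = Files.createTempDirectory("exampleTable");
    // Legacy layout: descriptor sits directly in the table directory.
    Files.createFile(tableDir.resolve(TABLEINFO_PREFIX + "0000000003"));
    System.out.println(getTableInfoPath(tableDir).get().getFileName());

    // After migration, the ".tabledesc" subdirectory takes precedence.
    Path sub = Files.createDirectory(tableDir.resolve(TABLEINFO_DIR));
    Files.createFile(sub.resolve(TABLEINFO_PREFIX + "0000000004"));
    System.out.println(getTableInfoPath(tableDir).get().getFileName());
  }
}
```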
          stack stack added a comment -

          Any chance of more review on this?

          davelatham Dave Latham added a comment -

          Yes, trunk indeed moves fast.
          Thanks, Ted, for the review and help with the test.

          Here's HBASE-8778-v5.patch which solves the issue revealed by the test. I will also update it on review board.

          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12595634/HBASE-8778-v5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 30 new or modified tests.

          +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6574//console

          This message is automatically generated.

          yuzhihong@gmail.com Ted Yu added a comment -

          +1 from me.

          stack stack added a comment -

          Matteo Bertozzi You want to have a look at this one boss?

          mbertozzi Matteo Bertozzi added a comment -

          +1, v5 looks good to me

          stack stack added a comment -

          I scanned the patch. LGTM. Will apply after namespaces goes in. I can do the necessary shoehorning on commit.

          davelatham Dave Latham added a comment -

          Thanks, Stack. Let me know if you want me to do merging when namespaces goes in.

          Lars Hofhansl, I created a new JIRA HBASE-9132 and put up a patch for 0.94 to use the directory modtime to reduce these directory scans for 0.94. Thanks for the idea. Let me know what you think.

          stack stack added a comment -

          Committed to trunk and 0.95 (ns needs another revision anyways). Thanks D.

          stack stack added a comment -

          Dave Latham Mind adding a release note? Just high level on what gets moved where? Thanks boss...

          hudson Hudson added a comment -

          FAILURE: Integrated in hbase-0.95 #407 (See https://builds.apache.org/job/hbase-0.95/407/)
          HBASE-8778 Region assigments scan table directory making them slow for huge tables (stack: rev 1510978)

          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/MasterSnapshotVerifier.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotInfo.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TableInfoCopyTask.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptorMigrationToSubdir.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Merge.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestRestoreSnapshotHelper.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in hbase-0.95-on-hadoop2 #221 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/221/)
          HBASE-8778 Region assigments scan table directory making them slow for huge tables (stack: rev 1510978)

          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/MasterSnapshotVerifier.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotInfo.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TableInfoCopyTask.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptorMigrationToSubdir.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java
          • /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Merge.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestRestoreSnapshotHelper.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          • /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #4347 (See https://builds.apache.org/job/HBase-TRUNK/4347/)
          HBASE-8778 Region assigments scan table directory making them slow for huge tables (stack: rev 1510977)

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/MasterSnapshotVerifier.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotInfo.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TableInfoCopyTask.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptorMigrationToSubdir.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Merge.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestRestoreSnapshotHelper.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
          hudson Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #655 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/655/)
          HBASE-8778 Region assigments scan table directory making them slow for huge tables (stack: rev 1510977)

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/MasterSnapshotVerifier.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotInfo.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TableInfoCopyTask.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptorMigrationToSubdir.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Merge.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestTableDescriptorModification.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestRestoreSnapshotHelper.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
          lars_francke Lars Francke added a comment -

          I know this has been long closed but it introduced the FSTableDescriptorMigrationToSubdir class which handles the migration from the old to the new style.

          The comment says it "will be removed for the major release after 0.96".

          Are you okay with this being removed now?

          If so, any suggestions on how to handle this now:

              // Make sure the meta region directory exists!
              if (!FSUtils.metaRegionExists(fs, rd)) {
                bootstrap(rd, c);
              } else {
                // Migrate table descriptor files if necessary
                org.apache.hadoop.hbase.util.FSTableDescriptorMigrationToSubdir
                  .migrateFSTableDescriptorsIfNecessary(fs, rd);
              }
          

          I'll create a new JIRA when/if you think it's time to remove this.
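
          For context, the release note describes the layout change this migration class implements: descriptor files move from directly inside the table directory into a well-known ".tabledesc" subdirectory (e.g. /hbase/exampleTable/.tableinfo.0000000003 becomes /hbase/exampleTable/.tabledesc/.tableinfo.0000000003). A minimal sketch of that path change follows; the class and method names here are invented for illustration and are not HBase's actual API:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch (not HBase code): models the path change performed by
// the table descriptor migration -- .tableinfo files move from the table
// directory into a ".tabledesc" subdirectory. The real logic lives in
// FSTableDescriptorMigrationToSubdir and operates on HDFS paths.
public class TableInfoPathSketch {
    static final String TABLEINFO_DIR = ".tabledesc";

    // Given the pre-migration location /hbase/<table>/.tableinfo.<seq>,
    // return the post-migration location /hbase/<table>/.tabledesc/.tableinfo.<seq>.
    static Path migratedPath(Path oldTableInfo) {
        return oldTableInfo.getParent()
                .resolve(TABLEINFO_DIR)
                .resolve(oldTableInfo.getFileName());
    }

    public static void main(String[] args) {
        Path oldPath = Paths.get("/hbase/exampleTable/.tableinfo.0000000003");
        System.out.println(migratedPath(oldPath));
    }
}
```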

          davelatham Dave Latham added a comment -

          Lars Francke, thanks for bringing it up. The code needs to stay in any version where we want to support a direct upgrade from 0.94. https://hbase.apache.org/book.html#upgrade1.0.from.0.94 indicates a direct (but not rolling) upgrade is supported from 0.94 to 1.0, so the code needs to survive for the 1.0 line. I don't know for sure, but I'm guessing that would be extended to any 1.x.

          I have no idea whether there has been a decision on supporting direct upgrades from 0.94 to 2.0 or higher. If it has been decided not to support that, then it would finally be safe to remove `FSTableDescriptorMigrationToSubdir` from the 2.0 code (master, I believe). I suppose the javadoc you quoted is wrong and would better read "will be removed after upgrade from 0.94 is no longer supported."

          Once it's safe to remove, simply delete the `else` clause you quoted as well as the `FSTableDescriptorMigrationToSubdir` class itself.

          lars_francke Lars Francke added a comment -

          Thanks Dave Latham. I'll bring it up on the mailing list and will submit a JIRA that fixes wording and deprecation tags.


            People

            • Assignee: Dave Latham
            • Reporter: Dave Latham
            • Votes: 0
            • Watchers: 17
