HBASE-7

[hbase] Provide an HBase checker and repair tool similar to fsck

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: util
    • Hadoop Flags: Reviewed

      Description

      We need a tool to verify (and repair) HBase, much like fsck does for a filesystem.

      Attachments

      1. HBaseConfiguration.java
        3 kB
        Ted Yu
      2. HBaseFsck.java
        20 kB
        Ted Yu
      3. hbase.fsck1.txt
        21 kB
        stack
      4. check_meta.rb
        5 kB
        stack
      5. check_meta.rb
        5 kB
        stack
      6. add_region.rb
        2 kB
        stack
      7. HBASE-7.patch
        13 kB
        ryan rawson
      8. patch.txt
        2 kB
        Jim Kellerman

        Issue Links

        1. Redundant meta tables (sub-task, Open, Unassigned)
         

          Activity

          stack added a comment -

          One suggestion: the hbck tool would list all regions in the filesystem and make sure each has an entry in .META.
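
          A minimal sketch of that check, assuming a hypothetical metaHasEntry() helper standing in for a real .META. scan:

          import java.io.IOException;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileStatus;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class RegionDirCheck {
            // Walk the region directories under a table dir and flag any that
            // lack a .META. entry. metaHasEntry() is a stand-in for the scan.
            static void check(Configuration conf, Path tableDir) throws IOException {
              FileSystem fs = tableDir.getFileSystem(conf);
              for (FileStatus d : fs.listStatus(tableDir)) {
                if (!d.isDir()) continue; // store files live deeper down
                String encodedName = d.getPath().getName();
                if (!metaHasEntry(encodedName)) {
                  System.out.println("No .META. entry for region dir " + encodedName);
                }
              }
            }
            static boolean metaHasEntry(String encodedName) {
              return true; // stub: a real tool would scan .META. here
            }
          }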

          stack added a comment -

          Another suggestion: check info files and rewrite them if broken (this seems to be at the root of HADOOP-2445).

          Jim Kellerman added a comment -

          Would it make sense to write the HRegionInfo into a file in the HRegion directory?

          If we were just scanning the disk, it would make it easier to figure out what region was in the HRegion directory.

          Maybe not, because then we'd have to deal with splits.

          How about writing the region name to a file in the region directory?

          Also probably not since it is hard to decode a region name.

          The goal here is to be able to recover the meta table if it is corrupted, and similarly to recover the root region from the meta region(s) if the root region is corrupted. (A rough sketch of the info-file idea follows.)
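
          A rough sketch of that idea (the '.regioninfo' file name and the Writable-based serialization are assumptions for illustration, not code from any patch here):

          import java.io.IOException;
          import org.apache.hadoop.fs.FSDataOutputStream;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;
          import org.apache.hadoop.hbase.HRegionInfo;

          public class RegionInfoFile {
            // Persist the region's HRegionInfo into its own directory so an
            // offline scanner can tell what region the directory holds.
            static void write(FileSystem fs, Path regionDir, HRegionInfo info)
                throws IOException {
              FSDataOutputStream out = fs.create(new Path(regionDir, ".regioninfo"));
              try {
                info.write(out); // HRegionInfo was a Writable in this era
              } finally {
                out.close();
              }
            }
          }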

          Jim Kellerman added a comment -

          In the Region directory, write an info file which contains the region name.

          We now have the row key for the meta, and can compute the encoded region name.

          Currently, only the HStores know if they are a reference to another HStore. They need to know this so they can construct the right kind of reader.

          However, the region itself doesn't know that it is a child region, or whether it has a reference to a parent region. This would also be useful information when recovering from just the files in HDFS.

          Jim Kellerman added a comment -

          Addresses HADOOP-2458 and HADOOP-2465

          Jim Kellerman added a comment -

          Passes locally, try Hudson

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12371914/patch.txt
          against trunk revision r605441.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1391/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1391/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1391/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1391/console

          This message is automatically generated.

          stack added a comment -

          hbck should take an option that renders the table being checked read-only while it runs (one blunt approximation is sketched at the end of this comment).

          hbck is probably also the place where we'd locate the force-flush and then force-compaction of all stores feature.

          See HADOOP-1958.
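
          There is no table read-only switch yet, so here is a blunt approximation using the admin API of the period: disable the table around the check. The runCheck() helper is hypothetical.

          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.client.HBaseAdmin;

          public class CheckedRun {
            public static void main(String[] args) throws Exception {
              HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
              admin.disableTable(args[0]); // blunt: blocks reads as well as writes
              try {
                runCheck(args[0]);         // hypothetical consistency check
              } finally {
                admin.enableTable(args[0]);
              }
            }
            static void runCheck(String table) { /* placeholder */ }
          }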

          Billy Pearson added a comment -

          Can we commit this and open a new JIRA for the add-on options you would like added, stack?

          Jim Kellerman added a comment -

          Patch was for one of the sub-issues and has been committed.

          The main issue has not yet been addressed.

          Bryan Duxbury added a comment -

          Should we adjust the fix version out to 0.17 if the main issue hasn't been addressed yet?

          Jim Kellerman added a comment -

          Pushing fix out to 0.17 since adding the referential integrity needed to make this tool really work will require another migration tool.

          stack added a comment -

          Are you going to rename the current migration script 'migrate' for 0.16 (so users always know that to migrate, they need to run the 'migrate' tool – not 'updateRegionDirs' followed by 'addIntegrityChecks' followed by 'polishTheWindows'...)? (The migrate script will know by looking at the FS what subscripts to run, etc.)

          stack added a comment -

          It's possible that a table delete could be interrupted, leaving regions and the table dir incompletely removed. hbck should take a look and clean up any leftovers.

          Onur GUN added a comment (edited) -

          Is it possible that there is a case where forcing a flush after the checks is more appropriate than forcing one before? When should it be done?

          Let's say region1 holds faulty data and we have run hbck. It forced a flush, and we now have region11 and region12. If region1 had not held faulty data, we would have only region1. Compaction later resolved this, but the split and compaction ran for nothing on this region server.

          If I am wrong, please point me at what to look for.

          Jim Kellerman added a comment -

          There are (at least) three areas where we are still vulnerable:

          1. Incomplete table deletion (see above).
          2. Incomplete cache flush (region server dies during the flush); see below.
          3. Inability to recover the write-ahead log (HLog) if the server dies; depends on HADOOP-4379.

          HBase protects itself from incomplete compactions by performing the operation in a temporary directory. If the compaction does not complete successfully, another compaction request will be generated and the partially completed compaction data is erased.

          We should do something similar for a cache flush: write the flush to a temporary directory and move the new store file into place only if the flush completes successfully. Any subsequent cache flush will erase data left in the temporary flush directory. Recovery will happen when the HLog is replayed by the new server for the region. (A sketch of this pattern appears at the end of this comment.)

          Without HADOOP-4379, we cannot guarantee that we can recover the most recent HLog file. Although Dhruba is looking at the issue, he would probably accept help from someone else. Getting HADOOP-4379 integrated into Hadoop is the most important thing we can do to ensure data integrity.

          The second most important thing to do is to put cache flushes into a temporary directory.

          That would leave hbasefsck handling incomplete deletes (and perhaps other inconsistencies in the HBase file structure).
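
          A minimal sketch of the write-to-temp-then-rename pattern described above; directory and file names are illustrative, not the actual flush code:

          import java.io.IOException;
          import org.apache.hadoop.fs.FSDataOutputStream;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class SafeFlush {
            static void flush(FileSystem fs, Path regionDir, byte[] snapshot)
                throws IOException {
              Path tmp = new Path(regionDir, ".tmp/" + System.currentTimeMillis());
              FSDataOutputStream out = fs.create(tmp);
              try {
                out.write(snapshot);       // write the whole snapshot to temp
              } finally {
                out.close();
              }
              // The store file becomes visible only after a complete write; a
              // crash mid-flush leaves junk in .tmp for the next flush to clear.
              Path dest = new Path(regionDir, "storefile." + System.currentTimeMillis());
              if (!fs.rename(tmp, dest)) {
                throw new IOException("Could not move " + tmp + " to " + dest);
              }
            }
          }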

          Jim Kellerman added a comment -

          If we cannot get HADOOP-4379 soon, we should investigate using BookKeeper for HLogs (see HADOOP-5189).

          stack added a comment -

          Onur, can you say more about what you are asking?

          Of note, flushes and compactions can now be done from the shell – see under tools. (Tools in the shell might be the place to expose the hbfsck facility.)

          Other note: Ryan just suggested that rather than putting the table into read-only mode, we might want a facility to put it into safe mode.

          Onur GUN added a comment (edited) -

          Sorry for late reply.

          Hi Jim and stack,

          Thanks for the replies. Maybe I am thinking of an exceptional or irrelevant situation, I don't know for sure, but let me try to explain.

          Let's say we have run hbck and it forces a flush, as suggested. Since the base function of hbck is to work on a corrupted state, say we have corrupt mapfiles, storefiles, or regions. I just wonder whether there could be a case where the flush triggers other events, or a series of events, purely because of the corruption; for example, we reach the split size only because of the flush plus the corruption. This probably won't affect the results, but I wonder whether forcing the flush first is the best approach. Maybe we should do the correction and flush later, or something else. I don't know if this is possible or whether I am on the right track; I just need a direction to think in.

          I am planning to do this as my SoC project, by the way. If anybody suggests other ideas, I will work them into my proposal and research them further. I will also email Dhruba and try to help if he needs it; I think I can help test HADOOP-4379 for now.

          Jim Kellerman added a comment -

          hbasefsck was intended to be a tool used while HDFS is running but HBase is offline, so I do not understand how flush enters the picture.

          The idea was for it to walk through the entire on-disk structure of an HBase installation and repair 'stuff that is broken', such as bad entries in the meta and root regions, 'extra files', etc.

          If you have a different idea, please post it. We are always open to new approaches.

          Onur GUN added a comment -

          I understand the steps now. I had thought that HBase was also online, and that made things complicated. Thank you.

          ryan rawson added a comment -

          I had a sick cluster after a severe OOME; the symptom was that there appeared to be more regions than there really were.

          I used this tool to help recover my cluster and get going again.

          Onur GUN added a comment -

          I am trying to finalize my proposal and I welcome any comments on the following.

          I have examined Ryan's patch for deleting bogus entries in .META. In the patch there is a comment that "it is a trivial fix case". I wonder whether there are other kinds of bogus entries that hbck should resolve in a different manner.

          And since we are vulnerable in deletion, flush, and the HLog, I guess hbck has to assume some of these are resolved: if we are fixing .META. from the files, we assume the files are correct.

          As of now, I think what needs to be done is:
          1) Force flush and compaction (for now we can have problems related to incomplete cache flushes and the HLog). After that, put the table into safe mode, since otherwise hbck cannot operate safely; I guess this will be similar to dfsadmin's safemode.
          2) Recover the root region from the meta region(s).
          3) List all regions and compare them with .META., recovering .META. from the regions by adding or deleting .META. entries (the deleting is done by Ryan's patch, I guess).
          4) Check the info files.

          More could be:
          5) An option to take a snapshot before the fix.
          6) A log file that can be used to track problems and needed improvements.

          ryan rawson added a comment -

          There are some other things that can be done:

          • Rebuilding .META. if it is totally ruined.
          • Automating more of the checking that HBaseFsck currently fails on – if you have multiple region directories for the same start key/end key, the tool tells you so, barfs out, and lets you resolve it by hand (moving stuff out of /hbase/table_name).
          • Other byzantine error cases.

          One thing we probably want to avoid is unnecessary reads/writes: flushing is good, but a major compaction is probably not. Compactions to resolve parent/child split relationships are good too.

          I have an outstanding patch that provides a tool called HFileStats which dumps the meta blocks of an HFile (the 0.20 storage format)... the stats contain the start/end keys, and will let us figure out whether an HFile matches its region, or which region it might actually belong to (for rebuilding .META. info); a rough sketch of such a dump follows at the end of this comment.

          Lots of work to be done, and I hope my initial stab at it (spurred by a broken cluster I thereby fixed) will make a good framework for further work.

          Good luck!
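
          For flavor, a hedged sketch of what such a key-range dump could look like against the 0.20 HFile reader API (constructor and method details are from memory and may differ by release):

          import java.io.IOException;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;
          import org.apache.hadoop.hbase.io.hfile.HFile;
          import org.apache.hadoop.hbase.util.Bytes;

          public class HFileKeyRange {
            public static void main(String[] args) throws IOException {
              Configuration conf = new Configuration();
              FileSystem fs = FileSystem.get(conf);
              HFile.Reader r = new HFile.Reader(fs, new Path(args[0]), null, false);
              try {
                r.loadFileInfo(); // pulls in the file-info meta block
                System.out.println("first key: " + Bytes.toStringBinary(r.getFirstKey()));
                System.out.println("last key:  " + Bytes.toStringBinary(r.getLastKey()));
              } finally {
                r.close();
              }
            }
          }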

          Onur GUN added a comment -

          Hi Ryan, I have learned a lot from your patch, will work on it in more detail. Thanks for your recommendations.

          Jean-Daniel Cryans added a comment -

          Onur,

          Working on something that rebuilds META would be very appreciated. For example, last week we had a power failure while META was being compacted and its only store file was lost! Fortunately most of our data is recoverable (segments) and the important stuff was in a backup.

          stack added a comment -

          Other things:

          + Clean up dross in the filesystem. An option on hbfsck would go through the filesystem and remove anything not referenced. For example, the powerset instance of hbase has been migrated multiple times and there is a bunch of junk under /hbase. I could go through manually deleting stuff but I'm afraid I'd remove something needed. Tool could also look at regions and make sure they have a reference in .META. If not, then they have been left over somehow and can be cleaned ("OK to remove region XYZ?").
          + Region names on the filesystem are encoded. An info file that said what the actual name was, along with other attributes of the region, would help hbfsck put things back together again – it would also help debugging a hosed cluster.
          + (Stretch goal) If a data file is corrupt, rewrite it with as much of the original data as can be saved (skipping the bad section). Make smart decisions about the edit sequence id, etc.; if unreadable, infer from neighbors where necessary.
          + (Stretch goal) Have an --info mode where hbfsck dumps out stats on the content of the filesystem

          First cut, I'd imagine, hbfsck would run through the filesystem looking at the content of zk and .META. and try to report anomalies. The next step after that would be effecting repair. Part of the hbfsck development would be figuring out what you need in the filesystem to do things like repair a broken .META. or to figure out what content is dangling/unreferenced.

          See how your tool evolves. As you write it, keep in mind that you might later want to host it inside a MR job – especially if you ever need to read a fat hbase instance with lots of regions. Also consider, as in fsck, that you might have a 'quick' mode (just a thought).

          It should probably run like fsck, asking yes/no when it's effecting repair (the user should be able to pass a flag that says 'yes' to all questions) – a small sketch of that prompt follows at the end of this comment.

          Documentation and clean integration with the shell would, I'd imagine, be key components of any hbfsck tool, since it'll be needed only rarely – but when it is, the user in distress will want to be clear on how it all works.

          Hope this helps.
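
          The yes/no repair prompt mentioned above is simple enough to sketch (names illustrative):

          import java.io.BufferedReader;
          import java.io.IOException;
          import java.io.InputStreamReader;

          public class RepairPrompt {
            // fsck-style confirmation: ask y/n per repair unless -y was given.
            static boolean confirm(String question, boolean yesToAll) throws IOException {
              if (yesToAll) return true;
              System.out.print(question + " [y/n] ");
              BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
              String answer = in.readLine();
              return answer != null && answer.trim().equalsIgnoreCase("y");
            }
          }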

          Onur GUN added a comment -

          I have submitted my proposal, taking into account all the recommendations so far. I will be watching the issue for further comments. Thank you.

          Onur GUN added a comment -

          When we have successfully dropped a table, HBase does not remove the /hbase/mytablename directory in HDFS. hbck should also remove these.

          stack added a comment -

          Moving out of 0.20.0. If the tool shows up in the meantime, well and good; we'll commit it. Trying to prune the number of issues we have filed against 0.20.0.

          Lars George added a comment -

          From http://devblog.streamy.com/2009/08/09/hbase-hackathon-2-day-one/

          • Sequence IDs
            • Change to stamps?
            • Must be unique
            • Fix for merges
          • Start/end keys match
            • Figures out / runs merges
          • Repair references
          • Two modes
            • Quick mode scans META
            • Full mode reads HFiles and verifies fully

          Todd Lipcon added a comment -

          Anyone working on this these days? I'd love to see this in 0.21 - when a table gets out of sync with meta it really would be great to have a tool.

          stack added a comment -

          Flagging as a possible GSoC project.

          stack added a comment -

          Silly little script that takes a tablename, startkey, and endkey and inserts an entry into .META. It does not check the fs. It presumes you are not trying to fix up the first region in the table (it reads the first region in the table to learn the table descriptor). This is a hack to quickly help out someone with a hole in their table.

          stack added a comment -

          Script that runs through .META. looking for holes, i.e. where the former region's endkey does not match the next region's startkey. If you pass --fix, it will search the fs for a candidate region and load it into .META. if it finds one.
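
          The core of the hole check, sketched in Java over entries in .META.-scan order (RegionEntry is a hypothetical holder type, not an HBase class):

          import java.util.Arrays;
          import java.util.List;

          public class MetaHoleCheck {
            static class RegionEntry {
              final byte[] startKey, endKey;
              RegionEntry(byte[] s, byte[] e) { startKey = s; endKey = e; }
            }
            // A hole exists where the previous region's endkey does not equal
            // the next region's startkey. Input must be sorted by start key.
            static void findHoles(List<RegionEntry> sorted) {
              for (int i = 1; i < sorted.size(); i++) {
                if (!Arrays.equals(sorted.get(i - 1).endKey, sorted.get(i).startKey)) {
                  System.out.println("Hole before region #" + i);
                }
              }
            }
          }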

          stack added a comment -

          I added the check_meta.rb script to TRUNK

          stack added a comment -

          This version deals with offlined parents.

          stack added a comment -

          TODO: Add verification that the regionserver in the server field is actually carrying the region listed in .META.

          Jonathan Gray added a comment -

          FYI, there is an existing call in HRegionInterface that can be used:

          public HRegionInfo[] getRegionsAssignment() throws IOException;

          Currently that's used during master failover. Though this may change with the zk/master work underway, I think it will still make sense to retain this call for debugging/fsck.

          While on the subject, there's also another call in HRegionInterface:

          public HRegion [] getOnlineRegionsAsArray();

          This doesn't appear to be used anywhere except in 4 commented-out tests in TestHBaseAdmin. Unless there are objections, I'm going to remove it while doing other cleanup in the master/client/regionserver communication.
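
          A hedged usage sketch of the first call; the plumbing that obtains the HRegionInterface proxy varied by release, so it is elided here:

          import java.io.IOException;
          import org.apache.hadoop.hbase.HRegionInfo;
          import org.apache.hadoop.hbase.ipc.HRegionInterface;

          public class AssignmentDump {
            // Compare what a regionserver says it is carrying against .META.
            static void dump(HRegionInterface regionServer) throws IOException {
              for (HRegionInfo info : regionServer.getRegionsAssignment()) {
                System.out.println("carrying: " + info.getRegionNameAsString());
              }
            }
          }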

          dhruba borthakur added a comment -

          I am in the process of writing a Java utility that uses getOnlineRegionsAsArray (among many other calls to the master and regionservers) and will display a report of any inconsistencies between master/regionserver/HDFS. I will post the patch (along with documentation) very soon.

          dhruba borthakur added a comment -

          I meant that I am using getRegionsAssignment(); I am not using getOnlineRegionsAsArray().

          stack added a comment -

          Committed. I added the bit that hooks it up so it shows when you run bin/hbase (as per Nicolas's suggestion). Let's file new issues for improvements or bugs.

          stack added a comment -

          I made some comments over on review.hbase.org but they didn't make it in here. Go there if interested. Let me also attach Dhruba's patch here in case anyone goes looking for it.

          stack added a comment -

          Here is Dhruba's patch, copied over from review.hbase.org. It's what I committed (my commit also made this script show through when you run bin/hbase... it's the hbck option).

          Ted Yu added a comment -

          HBaseConfiguration.java and HBaseFsck.java, patched for 0.20.5 release.

          stack added a comment -

          @Ted

          You don't need to make the HBaseConfiguration change; in fact you shouldn't. Better to change the HBaseConfiguration.create() call to new HBaseConfiguration() in HBaseFsck (the substitution is sketched at the end of this comment).

          I tried it and it gives me the following issue:

          Number of Tables: 1
          Number of live region servers:1
          Number of dead region servers:0
          Exception in thread "main" java.lang.NullPointerException
                  at org.apache.hadoop.hbase.client.HBaseFsck.processRegionServers(HBaseFsck.java:309)
                  at org.apache.hadoop.hbase.client.HBaseFsck.doWork(HBaseFsck.java:162)
                  at org.apache.hadoop.hbase.client.HBaseFsck.main(HBaseFsck.java:534)
          

          Do you see that?
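
          The substitution suggested above, sketched for 0.20.x (where the HBaseConfiguration.create() factory does not exist yet):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;

          public class ConfCompat {
            public static void main(String[] args) {
              // Trunk code: Configuration conf = HBaseConfiguration.create();
              Configuration conf = new HBaseConfiguration(); // 0.20.x equivalent
              System.out.println(conf.get("hbase.rootdir"));
            }
          }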

          Luke Forehand added a comment -

          I just ran this utility on our corrupt META table and got this exception:

          Exception in thread "main" java.io.IOException: Two entries in META are same REGION => {NAME => 'feedData,20100717 0463ff1d352930dc354f35358d3d11997c2fa050,1281549071340.d905798b9e7dca1c24a09a908bd1081c.', STARTKEY => '20100717 0463ff1d352930dc354f35358d3d11997c2fa050', ENDKEY => '20100717 25beb50790d72e3e2bf1fbebc0656dbcd82cdbd3', ENCODED => d905798b9e7dca1c24a09a908bd1081c, TABLE => {{NAME => 'feedData', FAMILIES => [{NAME => 'core', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '131072', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'tf', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '131072', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'tfidf', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '131072', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
          at org.apache.hadoop.hbase.client.HBaseFsck$1.processRow(HBaseFsck.java:420)
          at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:156)
          at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:68)
          at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:53)
          at com.ni.ods.hadoop.utils.HBaseFsck.getMetaEntries(HBaseFsck.java:435)
          at com.ni.ods.hadoop.utils.HBaseFsck.doWork(HBaseFsck.java:109)
          at com.ni.ods.hadoop.utils.HBaseFsck.main(HBaseFsck.java:522)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:186)


            People

            • Assignee: dhruba borthakur
            • Reporter: Jim Kellerman
            • Votes: 2
            • Watchers: 7
