HBase / HBASE-3013

Tool to verify data in two clusters

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      It would be useful to have a tool to easily compare the data between tables in different clusters, at least to make sure that replication is working correctly. I'm thinking of building that inside CopyTable as an option, à la --verify, that could be run independently, after the copy, or not at all. The fact that we can already pass start/stop times is also useful when you don't want to check whole tables, or want to do incremental verifications, etc.
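      To illustrate the kind of check described above, here is a minimal sketch of a row-by-row comparison restricted to a time window. It is not the patch's code: the in-memory `Table` type and the `slice_by_time` and `verify` helpers are hypothetical stand-ins for scanning two real tables, and the half-open `[start, stop)` window is an assumption made for the example.

```python
import sys
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for a table:
# row key -> {column: (timestamp, value)}
Row = Dict[str, Tuple[int, bytes]]
Table = Dict[str, Row]


def slice_by_time(row: Row, start: int, stop: int) -> Row:
    """Keep only the cells whose timestamp falls in [start, stop)."""
    return {col: cell for col, cell in row.items() if start <= cell[0] < stop}


def verify(source: Table, peer: Table, start: int = 0,
           stop: Optional[int] = None) -> Tuple[int, List[str]]:
    """Compare source rows against the peer within a time window.

    Returns (number of matching rows, sorted keys of mismatching rows).
    """
    if stop is None:
        stop = sys.maxsize
    good = 0
    bad: List[str] = []
    for key in sorted(source):
        expected = slice_by_time(source[key], start, stop)
        if not expected:
            continue  # nothing was written to this row in the window
        actual = slice_by_time(peer.get(key, {}), start, stop)
        if actual == expected:
            good += 1
        else:
            bad.append(key)
    return good, bad


source = {
    "r1": {"cf:a": (10, b"x")},
    "r2": {"cf:a": (20, b"y")},
    "r3": {"cf:a": (30, b"z")},
}
peer = {
    "r1": {"cf:a": (10, b"x")},
    "r2": {"cf:a": (20, b"stale")},  # value differs on the peer
}

print(verify(source, peer))          # full check
print(verify(source, peer, 0, 15))   # incremental check of an early window
```

      Passing a narrow window keeps the check cheap, which is the point of reusing the start/stop times: only rows written in that window are compared.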

      1. HBASE-3013-v2.patch
        19 kB
        Jean-Daniel Cryans

        Activity

        stack added a comment -

        Sorry J-D, I must have had a stale view.

        Jean-Daniel Cryans added a comment -

        @Stack, I resolved it 2 days ago, am I missing something?

        stack added a comment -

        Can we close this now?

        Jean-Daniel Cryans made changes -
          Status: Open [ 1 ] → Resolved [ 5 ]
          Hadoop Flags: set to [Reviewed]
          Resolution: set to Fixed [ 1 ]
        Jean-Daniel Cryans added a comment -

        Committed to trunk.

        Jean-Daniel Cryans made changes -
          Attachment: added HBASE-3013-v2.patch [ 12458294 ]
        Jean-Daniel Cryans added a comment -

        Final patch I'm about to commit.

        HBase Review Board added a comment -

        Message from: stack@duboce.net

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/1111/#review1693
        -----------------------------------------------------------

        Ship it!

        Excellent. Minor items to address on commit. So, is this new fancy tool doc'd in the replication notes? I think it's kinda critical. I would suggest opening a doc issue against 0.90 to mention the presence of this tool in the replication chapter, if only in a sentence.

        /trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
        <http://review.cloudera.org/r/1111/#comment5603>

        Missing '<p>' here?

        /trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
        <http://review.cloudera.org/r/1111/#comment5602>

        Space around the '+'

        /trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
        <http://review.cloudera.org/r/1111/#comment5604>

        Put this in Result?

        • stack
        HBase Review Board added a comment -

        Message from: "Jean-Daniel Cryans" <jdcryans@apache.org>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/1111/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        This new MapReduce job, called VerifyReplication, compares the data between two clusters that are replication-enabled. Its usage is relatively simple when you already use replication, and it even lets you pass the peer id instead of the cluster key for the target cluster.

        This addresses bug HBASE-3013.
        http://issues.apache.org/jira/browse/HBASE-3013
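
        As a rough illustration of the "peer id instead of the cluster key" convenience mentioned in the summary, the sketch below resolves an argument to a target cluster. It assumes the usual HBase cluster key layout of `quorum:clientPort:znodeParent`; the `peers` dictionary and the `parse_cluster_key` and `resolve_target` helpers are hypothetical, and the actual patch presumably looks peers up through ZooKeeper rather than a local map.

```python
from typing import Dict, NamedTuple


class ClusterKey(NamedTuple):
    quorum: str        # comma-separated ZooKeeper hosts
    client_port: int   # ZooKeeper client port
    znode_parent: str  # root znode used by the cluster


def parse_cluster_key(key: str) -> ClusterKey:
    """Split a 'quorum:clientPort:znodeParent' string into its parts."""
    parts = key.split(":")
    if len(parts) != 3:
        raise ValueError(f"not a cluster key: {key!r}")
    quorum, port, znode_parent = parts
    return ClusterKey(quorum, int(port), znode_parent)


def resolve_target(arg: str, peers: Dict[str, str]) -> ClusterKey:
    """Treat arg as a registered peer id if known, else as a cluster key."""
    return parse_cluster_key(peers.get(arg, arg))


# Hypothetical peer registry: peer id -> cluster key.
peers = {"1": "zk1,zk2,zk3:2181:/hbase"}

print(resolve_target("1", peers))
print(resolve_target("zkA,zkB:2181:/hbase-backup", peers))
```

        With such a lookup, anyone who already has replication configured only needs to remember the short peer id.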

        Diffs


        /trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Driver.java 1028470
        /trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java PRE-CREATION
        /trunk/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java 1028470
        /trunk/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java 1028470

        Diff: http://review.cloudera.org/r/1111/diff

        Testing
        -------

        Unit tests (one new test included), and this has been running here for a month.

        Thanks,

        Jean-Daniel

        Jean-Daniel Cryans added a comment -

        No, I have the tool here, I just need to post the patch.

        Jonathan Gray added a comment -

        Punt to 0.92?

        Jean-Daniel Cryans created issue -

          People

          • Assignee: Jean-Daniel Cryans
          • Reporter: Jean-Daniel Cryans
          • Votes: 0
          • Watchers: 1
