HBASE-11715

HBase should provide a tool to compare 2 remote tables.

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: util
    • Labels:
      None

      Description

      As discussed on the mailing list, when a table is copied to another cluster and needs to be validated against the original, only VerifyReplication can be used. However, this can take a very long time, since the data needs to be copied again.

      We should provide an easier and faster way to compare the tables.

      One option is to calculate hashes per key range. The user defines the number of buckets, then we split the table into that many buckets and calculate a hash for each (as the partitioner already does). Only the hashes then need to be compared between clusters. We can also optionally calculate an overall CRC to further reduce the chance of a hash collision going unnoticed.

              People

              • Assignee:
                Unassigned
                Reporter:
                jmspaggi Jean-Marc Spaggiari
              • Votes:
                0
                Watchers:
                15
