HBase / HBASE-17871

scan#setBatch(int) call leads to wrong results in VerifyReplication

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 2.0.0
    • Fix Version/s: 1.4.0, 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The VerifyReplication tool printed contradictory log entries:

      2017-04-03 23:30:50,252 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001001930000
      2017-04-03 23:30:50,280 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001001930000
      2017-04-03 23:30:50,387 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001003850000
      2017-04-03 23:30:50,414 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001003850000
      2017-04-03 23:30:50,480 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a00001005320000
      2017-04-03 23:30:50,508 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001005320000
      

      Here, each bad row was marked as both CONTENT_DIFFERENT_ROWS and ONLY_IN_PEER_TABLE_ROWS.
      This should never happen, so I looked at the code and found a scan.setBatch call.

          @Override
          public void map(ImmutableBytesWritable row, final Result value,
                          Context context)
              throws IOException {
            if (replicatedScanner == null) {
              ...
              final Scan scan = new Scan();
              scan.setBatch(batch);
      

      As stated in HBASE-16376, calling scan#setBatch(int) implicitly allows scan results to be partial.

      Since VerifyReplication assumes that each scanner.next() call returns an entire row,
      partial results break the comparison logic.
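
      The double reporting seen in the log above can be reproduced with a small, self-contained model. This is only a sketch: the class BatchCompareSketch, its Row type, and the scan/verify helpers are hypothetical names, and the merge loop is a simplified approximation of how a row-by-row comparison behaves, not the actual VerifyReplication code. The peer side is "scanned" with a batch limit, so a wide row comes back in several pieces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not HBase code: shows how partial rows confuse a merge-style comparison.
class BatchCompareSketch {
    // One value returned by a single next() call: a row key plus some of its cells.
    static final class Row {
        final String key;
        final List<String> cells;
        Row(String key, List<String> cells) { this.key = key; this.cells = cells; }
    }

    // Imitates a scan. batch <= 0 returns whole rows; otherwise each row is cut
    // into pieces of at most `batch` cells, like a scan with a batch limit.
    static List<Row> scan(List<Row> table, int batch) {
        List<Row> out = new ArrayList<>();
        for (Row r : table) {
            if (batch <= 0 || r.cells.size() <= batch) { out.add(r); continue; }
            for (int i = 0; i < r.cells.size(); i += batch) {
                int end = Math.min(i + batch, r.cells.size());
                out.add(new Row(r.key, new ArrayList<>(r.cells.subList(i, end))));
            }
        }
        return out;
    }

    // Simplified merge comparison that assumes every element is a complete row.
    static List<String> verify(List<Row> source, List<Row> peer) {
        List<String> log = new ArrayList<>();
        int p = 0;
        for (Row s : source) {
            while (true) {
                if (p >= peer.size()) { log.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key); break; }
                Row c = peer.get(p);
                int cmp = s.key.compareTo(c.key);
                if (cmp == 0) {
                    if (!s.cells.equals(c.cells)) log.add("CONTENT_DIFFERENT_ROWS " + s.key);
                    p++;
                    break;
                } else if (cmp < 0) {
                    log.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key);
                    break;
                } else {
                    log.add("ONLY_IN_PEER_TABLE_ROWS " + c.key);
                    p++;
                }
            }
        }
        // Anything left on the peer side has no counterpart in the source.
        for (; p < peer.size(); p++) log.add("ONLY_IN_PEER_TABLE_ROWS " + peer.get(p).key);
        return log;
    }

    // Two identical tables are modelled by scanning the same data twice.
    static List<Row> sampleTable() {
        return Arrays.asList(
            new Row("row1", Arrays.asList("c1", "c2", "c3")),
            new Row("row2", Arrays.asList("c1")));
    }

    public static void main(String[] args) {
        List<Row> table = sampleTable();
        // Peer scanned with batch = 2: row1 arrives as two partial results.
        System.out.println(verify(table, scan(table, 2)));
        // prints [CONTENT_DIFFERENT_ROWS row1, ONLY_IN_PEER_TABLE_ROWS row1]
        // Peer scanned without batching: whole rows, no false mismatches.
        System.out.println(verify(table, scan(table, 0)));
        // prints []
    }
}
```

      In this model the first piece of row1 compares unequal to the full source row, and the leftover piece is then mistaken for a row that exists only in the peer table, which matches the pattern of paired log lines above.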

      We should avoid the setBatch call here.
      Thanks to RPC chunking (explained in this blog post: https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1),
      I think removing it is safe and acceptable.

      Attachments

        1. HBASE-17871.master.004.patch
          4 kB
          Tomu Tsuruhara
        2. HBASE-17871.master.003.patch
          4 kB
          Tomu Tsuruhara
        3. HBASE-17871.master.003.patch
          4 kB
          Phil Yang
        4. HBASE-17871.master.002.patch
          4 kB
          Tomu Tsuruhara
        5. HBASE-17871.master.001.patch
          2 kB
          Tomu Tsuruhara
        6. beforethepatch.png
          113 kB
          Tomu Tsuruhara
        7. after.png
          90 kB
          Tomu Tsuruhara

          People

            Assignee: Tomu Tsuruhara
            Reporter: Tomu Tsuruhara