  1. HBase
  2. HBASE-16015

Usability - VerifyReplication performance is too slow


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Usability
    • Labels:
      None

      Description

      VerifyReplication is very slow on a geo-replicated cluster. Digging into the code, I found that the input scanner caching for requests to the target cluster defaults to 1.
      This default should be raised to a more suitable value, or the option should at least be exposed in the usage output:
      -Dhbase.mapreduce.scan.cachedrows=100

      TableInputFormat.java
      public static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";
      
      VerifyReplication.java
      Configuration conf = context.getConfiguration();
      final Scan scan = new Scan();
      scan.setCaching(conf.getInt(TableInputFormat.SCAN_CACHEDROWS, 1));
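
      To illustrate why a caching value of 1 hurts on a high-latency link, here is a rough back-of-the-envelope sketch. It assumes each scanner RPC returns at most `caching` rows and ignores result-size limits and the extra call that detects the end of the scan. The class name, method name, and row count are hypothetical and are not part of HBase.

```java
public class ScanRpcEstimate {

    // Rough model (an assumption, not HBase code): one scanner RPC
    // fetches at most `caching` rows, so the RPC count is ceil(rows / caching).
    static long estimateRpcs(long rows, int caching) {
        if (rows < 0) {
            throw new IllegalArgumentException("rows must be >= 0");
        }
        if (caching < 1) {
            throw new IllegalArgumentException("caching must be >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical number of rows to verify
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.printf("caching=%d -> about %d scanner RPCs%n",
                    caching, estimateRpcs(rows, caching));
        }
    }
}
```

      Under this model, every round trip to the remote cluster pays the full cross-datacenter latency, so raising the caching value from 1 to 100 cuts the number of round trips by roughly a factor of 100.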
      

      If there is agreement, I will add the following lines to the printUsage method:

      VerifyReplication.java
      System.err.println("For performance consider the following option, Input scanner caching for source to target cluster request\n"
                  + "-Dhbase.mapreduce.scan.cachedrows=100");
      

            People

            • Assignee:
              Unassigned
              Reporter:
              karthikShva123@gmail.com Karthik Palanisamy
            • Votes:
              1
              Watchers:
              3
