Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-18846

Accommodate the hbase-indexer/lily/SEP consumer deploy-type

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.0.0-alpha-4, 2.0.0
    • None
    • None
    • Reviewed
    • Hide
      Makes it so hbase-indexer/lily can move off dependence on internal APIs and instead move to public APIs.

      Adds being able to disable near-all HRegionServer services. This along with an existing plugin mechanism which allows configuring the RegionServer to host an alternate Connection implementation, makes it so we can put up a cluster of hollowed-out HRegionServers purposed to pose as a Replication Sink for a source HBase Cluster (Users do not need to figure our RPC, our PB encodings, build a distributed service, etc.). In the alternate supplied Connection implementation, hbase-indexer would install its own code to catch the Replication.

      Below and attached are sample hbase-server.xml files and alternate Connection implementations. To start up an HRegionServer as a sink, first make sure there is a ZooKeeper ensemble we can talk to. If none, just start one:
      {code}
      ./bin/hbase-daemon.sh start zookeeper
      {code}

      To start up a single RegionServer, put in place the below sample hbase-site.xml and a derviative of the below IndexerConnection on the CLASSPATH, and then start the RegionServer:
      {code}
      ./bin/hbase-daemon.sh start org.apache.hadoop.hbase.regionserver.HRegionServer
      {code}
      Stdout and Stderr will go into files under configured logs directory. Browse to localhost:16030 to find webui (unless disabled).

      DETAILS

      This patch adds configuration to disable RegionServer internal Services, Managers, Caches, etc., starting up.

      By default a RegionServer starts up an Admin and Client Service. To disable either or both, use the below booleans:
      {code}
      hbase.regionserver.admin.service
      hbase.regionserver.client.service
      {code}

      Both default true.

      To make a HRegionServer startup and stay up without expecting to communicate with a master, set the below boolean to false:

      {code}
      hbase.masterless
      {code]
      Default is false.

      h3. Sample hbase-site.xml that disables internal HRegionServer Services
      Below is an example hbase-site.xml that turns off most Services and that then installs an alternate Connection implementation, one that is nulled out in all regards except in being able to return a "Table" that can catch a Replication Stream in its {code}batch(List<? extends Row> actions, Object[] results){code} method. i.e. what the hbase-indexer wants. I also add the example alternate Connection implementation below (both of these files are also attached to this issue). Expects there to be an up and running zookeeper ensemble.

      {code}
      <configuration>
        <!-- This file is an example for hbase-indexer. It shuts down
             facility in the regionserver and interjects a special
             Connection implementation which is how hbase-indexer will
             receive the replication stream from source hbase cluster.
             See the class referenced in the config.

             Most of the config in here is booleans set to off and
             setting values to zero so services doon't start. Some of
             the flags are new via this patch.
      -->
        <!--Need this for the RegionServer to come up standalone-->
        <property>
          <name>hbase.cluster.distributed</name>
          <value>true</value>
        </property>

        <!--This is what you implement, a Connection that returns a Table that
             overrides the batch call. It is at this point you do your indexer inserts.
          -->
        <property>
          <name>hbase.client.connection.impl</name>
          <value>org.apache.hadoop.hbase.client.IndexerConnection</value>
          <description>A customs connection implementation just so we can interject our
            own Table class, one that has an override for the batch call which receives
            the replication stream edits; i.e. it is called by the replication sink
            #replicateEntries method.</description>
        </property>

        <!--Set hbase.regionserver.info.port to -1 for no webui-->

        <!--Below are configs to shut down unused services in hregionserver-->
        <property>
          <name>hbase.regionserver.admin.service</name>
          <value>false</value>
          <description>Do NOT stand up an Admin Service Interface on RPC</description>
        </property>
        <property>
          <name>hbase.regionserver.client.service</name>
          <value>false</value>
          <description>Do NOT stand up a client-facing Service on RPC</description>
        </property>
        <property>
          <name>hbase.wal.provider</name>
          <value>org.apache.hadoop.hbase.wal.DisabledWALProvider</value>
          <description>Set WAL service to be the null WAL</description>
        </property>
        <property>
          <name>hbase.regionserver.workers</name>
          <value>false</value>
          <description>Turn off all background workers, log splitters, executors, etc.</description>
        </property>
        <property>
          <name>hfile.block.cache.size</name>
          <value>0.0001</value>
          <description>Turn off block cache completely</description>
        </property>
        <property>
          <name>hbase.mob.file.cache.size</name>
          <value>0</value>
          <description>Disable MOB cache.</description>
        </property>
        <property>
          <name>hbase.masterless</name>
          <value>true</value>
          <description>Do not expect Master in cluster.</description>
        </property>
        <property>
          <name>hbase.regionserver.metahandler.count</name>
          <value>1</value>
          <description>How many priority handlers to run; we probably need none.
          Default is 20 which is too much on a server like this.</description>
        </property>
        <property>
          <name>hbase.regionserver.replication.handler.count</name>
          <value>1</value>
          <description>How many replication handlers to run; we probably need none.
          Default is 3 which is too much on a server like this.</description>
        </property>
        <property>
          <name>hbase.regionserver.handler.count</name>
          <value>10</value>
          <description>How many default handlers to run; tie to # of CPUs.
          Default is 30 which is too much on a server like this.</description>
        </property>
        <property>
          <name>hbase.ipc.server.read.threadpool.size</name>
          <value>3</value>
          <description>How many Listener request reaaders to run; tie to a portion # of CPUs (1/4?).
          Default is 10 which is too much on a server like this.</description>
        </property>
      </configuration>
      {code}

      h2. Sample Connection Implementation
      Has call-out for where an hbase-indexer would insert its capture code.
      {code}
      package org.apache.hadoop.hbase.client;

      import com.google.protobuf.Descriptors;
      import com.google.protobuf.Message;
      import com.google.protobuf.Service;
      import com.google.protobuf.ServiceException;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.hbase.CompareOperator;
      import org.apache.hadoop.hbase.HTableDescriptor;
      import org.apache.hadoop.hbase.TableName;
      import org.apache.hadoop.hbase.client.coprocessor.Batch;
      import org.apache.hadoop.hbase.filter.CompareFilter;
      import org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel;
      import org.apache.hadoop.hbase.security.User;

      import java.io.IOException;
      import java.util.List;
      import java.util.Map;
      import java.util.concurrent.ExecutorService;


      /**
       * Sample class for hbase-indexer.
       * DO NOT COMMIT TO HBASE CODEBASE!!!
       * Overrides Connection just so we can return a Table that has the
       * method that the replication sink calls, i.e. Table#batch.
       * It is at this point that the hbase-indexer catches the replication
       * stream so it can insert into the lucene index.
       */
      public class IndexerConnection implements Connection {
        private final Configuration conf;
        private final User user;
        private final ExecutorService pool;
        private volatile boolean closed = false;

        public IndexerConnection(Configuration conf, ExecutorService pool, User user) throws IOException {
          this.conf = conf;
          this.user = user;
          this.pool = pool;
        }

        @Override
        public void abort(String why, Throwable e) {}

        @Override
        public boolean isAborted() {
          return false;
        }

        @Override
        public Configuration getConfiguration() {
          return this.conf;
        }

        @Override
        public BufferedMutator getBufferedMutator(TableName tableName) throws IOException {
          return null;
        }

        @Override
        public BufferedMutator getBufferedMutator(BufferedMutatorParams params) throws IOException {
          return null;
        }

        @Override
        public RegionLocator getRegionLocator(TableName tableName) throws IOException {
          return null;
        }

        @Override
        public Admin getAdmin() throws IOException {
          return null;
        }

        @Override
        public void close() throws IOException {
          if (!this.closed) this.closed = true;
        }

        @Override
        public boolean isClosed() {
          return this.closed;
        }

        @Override
        public TableBuilder getTableBuilder(final TableName tn, ExecutorService pool) {
          if (isClosed()) {
            throw new RuntimeException("IndexerConnection is closed.");
          }
          final Configuration passedInConfiguration = getConfiguration();
          return new TableBuilder() {
            @Override
            public TableBuilder setOperationTimeout(int timeout) {
              return null;
            }

            @Override
            public TableBuilder setRpcTimeout(int timeout) {
              return null;
            }

            @Override
            public TableBuilder setReadRpcTimeout(int timeout) {
              return null;
            }

            @Override
            public TableBuilder setWriteRpcTimeout(int timeout) {
              return null;
            }

            @Override
            public Table build() {
              return new Table() {
                private final Configuration conf = passedInConfiguration;
                private final TableName tableName = tn;

                @Override
                public TableName getName() {
                  return this.tableName;
                }

                @Override
                public Configuration getConfiguration() {
                  return this.conf;
                }

                @Override
                public void batch(List<? extends Row> actions, Object[] results)
                throws IOException, InterruptedException {
                  // Implementation goes here.
                }

                @Override
                public HTableDescriptor getTableDescriptor() throws IOException {
                  return null;
                }

                @Override
                public TableDescriptor getDescriptor() throws IOException {
                  return null;
                }

                @Override
                public boolean exists(Get get) throws IOException {
                  return false;
                }

                @Override
                public boolean[] existsAll(List<Get> gets) throws IOException {
                  return new boolean[0];
                }

                @Override
                public <R> void batchCallback(List<? extends Row> actions, Object[] results, Batch.Callback<R> callback) throws IOException, InterruptedException {

                }

                @Override
                public Result get(Get get) throws IOException {
                  return null;
                }

                @Override
                public Result[] get(List<Get> gets) throws IOException {
                  return new Result[0];
                }

                @Override
                public ResultScanner getScanner(Scan scan) throws IOException {
                  return null;
                }

                @Override
                public ResultScanner getScanner(byte[] family) throws IOException {
                  return null;
                }

                @Override
                public ResultScanner getScanner(byte[] family, byte[] qualifier) throws IOException {
                  return null;
                }

                @Override
                public void put(Put put) throws IOException {

                }

                @Override
                public void put(List<Put> puts) throws IOException {

                }

                @Override
                public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, byte[] value, Put put) throws IOException {
                  return false;
                }

                @Override
                public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, Put put) throws IOException {
                  return false;
                }

                @Override
                public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, Put put) throws IOException {
                  return false;
                }

                @Override
                public void delete(Delete delete) throws IOException {

                }

                @Override
                public void delete(List<Delete> deletes) throws IOException {

                }

                @Override
                public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, byte[] value, Delete delete) throws IOException {
                  return false;
                }

                @Override
                public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, Delete delete) throws IOException {
                  return false;
                }

                @Override
                public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, Delete delete) throws IOException {
                  return false;
                }

                @Override
                public void mutateRow(RowMutations rm) throws IOException {

                }

                @Override
                public Result append(Append append) throws IOException {
                  return null;
                }

                @Override
                public Result increment(Increment increment) throws IOException {
                  return null;
                }

                @Override
                public long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount) throws IOException {
                  return 0;
                }

                @Override
                public long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount, Durability durability) throws IOException {
                  return 0;
                }

                @Override
                public void close() throws IOException {

                }

                @Override
                public CoprocessorRpcChannel coprocessorService(byte[] row) {
                  return null;
                }

                @Override
                public <T extends Service, R> Map<byte[], R> coprocessorService(Class<T> service, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable) throws ServiceException, Throwable {
                  return null;
                }

                @Override
                public <T extends Service, R> void coprocessorService(Class<T> service, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable, Batch.Callback<R> callback) throws ServiceException, Throwable {

                }

                @Override
                public <R extends Message> Map<byte[], R> batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message request, byte[] startKey, byte[] endKey, R responsePrototype) throws ServiceException, Throwable {
                  return null;
                }

                @Override
                public <R extends Message> void batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message request, byte[] startKey, byte[] endKey, R responsePrototype, Batch.Callback<R> callback) throws ServiceException, Throwable {

                }

                @Override
                public boolean checkAndMutate(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, RowMutations mutation) throws IOException {
                  return false;
                }

                @Override
                public boolean checkAndMutate(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, RowMutations mutation) throws IOException {
                  return false;
                }

                @Override
                public void setOperationTimeout(int operationTimeout) {

                }

                @Override
                public int getOperationTimeout() {
                  return 0;
                }

                @Override
                public int getRpcTimeout() {
                  return 0;
                }

                @Override
                public void setRpcTimeout(int rpcTimeout) {

                }

                @Override
                public int getReadRpcTimeout() {
                  return 0;
                }

                @Override
                public void setReadRpcTimeout(int readRpcTimeout) {

                }

                @Override
                public int getWriteRpcTimeout() {
                  return 0;
                }

                @Override
                public void setWriteRpcTimeout(int writeRpcTimeout) {

                }
              };
            }
          };
        }
      }
      {code}
      Show
      Makes it so hbase-indexer/lily can move off dependence on internal APIs and instead move to public APIs. Adds being able to disable near-all HRegionServer services. This along with an existing plugin mechanism which allows configuring the RegionServer to host an alternate Connection implementation, makes it so we can put up a cluster of hollowed-out HRegionServers purposed to pose as a Replication Sink for a source HBase Cluster (Users do not need to figure our RPC, our PB encodings, build a distributed service, etc.). In the alternate supplied Connection implementation, hbase-indexer would install its own code to catch the Replication. Below and attached are sample hbase-server.xml files and alternate Connection implementations. To start up an HRegionServer as a sink, first make sure there is a ZooKeeper ensemble we can talk to. If none, just start one: {code} ./bin/hbase-daemon.sh start zookeeper {code} To start up a single RegionServer, put in place the below sample hbase-site.xml and a derviative of the below IndexerConnection on the CLASSPATH, and then start the RegionServer: {code} ./bin/hbase-daemon.sh start org.apache.hadoop.hbase.regionserver.HRegionServer {code} Stdout and Stderr will go into files under configured logs directory. Browse to localhost:16030 to find webui (unless disabled). DETAILS This patch adds configuration to disable RegionServer internal Services, Managers, Caches, etc., starting up. By default a RegionServer starts up an Admin and Client Service. To disable either or both, use the below booleans: {code} hbase.regionserver.admin.service hbase.regionserver.client.service {code} Both default true. To make a HRegionServer startup and stay up without expecting to communicate with a master, set the below boolean to false: {code} hbase.masterless {code] Default is false. h3. Sample hbase-site.xml that disables internal HRegionServer Services Below is an example hbase-site.xml that turns off most Services and that then installs an alternate Connection implementation, one that is nulled out in all regards except in being able to return a "Table" that can catch a Replication Stream in its {code}batch(List<? extends Row> actions, Object[] results){code} method. i.e. what the hbase-indexer wants. I also add the example alternate Connection implementation below (both of these files are also attached to this issue). Expects there to be an up and running zookeeper ensemble. {code} <configuration>   <!-- This file is an example for hbase-indexer. It shuts down        facility in the regionserver and interjects a special        Connection implementation which is how hbase-indexer will        receive the replication stream from source hbase cluster.        See the class referenced in the config.        Most of the config in here is booleans set to off and        setting values to zero so services doon't start. Some of        the flags are new via this patch. -->   <!--Need this for the RegionServer to come up standalone-->   <property>     <name>hbase.cluster.distributed</name>     <value>true</value>   </property>   <!--This is what you implement, a Connection that returns a Table that        overrides the batch call. It is at this point you do your indexer inserts.     -->   <property>     <name>hbase.client.connection.impl</name>     <value>org.apache.hadoop.hbase.client.IndexerConnection</value>     <description>A customs connection implementation just so we can interject our       own Table class, one that has an override for the batch call which receives       the replication stream edits; i.e. it is called by the replication sink       #replicateEntries method.</description>   </property>   <!--Set hbase.regionserver.info.port to -1 for no webui-->   <!--Below are configs to shut down unused services in hregionserver-->   <property>     <name>hbase.regionserver.admin.service</name>     <value>false</value>     <description>Do NOT stand up an Admin Service Interface on RPC</description>   </property>   <property>     <name>hbase.regionserver.client.service</name>     <value>false</value>     <description>Do NOT stand up a client-facing Service on RPC</description>   </property>   <property>     <name>hbase.wal.provider</name>     <value>org.apache.hadoop.hbase.wal.DisabledWALProvider</value>     <description>Set WAL service to be the null WAL</description>   </property>   <property>     <name>hbase.regionserver.workers</name>     <value>false</value>     <description>Turn off all background workers, log splitters, executors, etc.</description>   </property>   <property>     <name>hfile.block.cache.size</name>     <value>0.0001</value>     <description>Turn off block cache completely</description>   </property>   <property>     <name>hbase.mob.file.cache.size</name>     <value>0</value>     <description>Disable MOB cache.</description>   </property>   <property>     <name>hbase.masterless</name>     <value>true</value>     <description>Do not expect Master in cluster.</description>   </property>   <property>     <name>hbase.regionserver.metahandler.count</name>     <value>1</value>     <description>How many priority handlers to run; we probably need none.     Default is 20 which is too much on a server like this.</description>   </property>   <property>     <name>hbase.regionserver.replication.handler.count</name>     <value>1</value>     <description>How many replication handlers to run; we probably need none.     Default is 3 which is too much on a server like this.</description>   </property>   <property>     <name>hbase.regionserver.handler.count</name>     <value>10</value>     <description>How many default handlers to run; tie to # of CPUs.     Default is 30 which is too much on a server like this.</description>   </property>   <property>     <name>hbase.ipc.server.read.threadpool.size</name>     <value>3</value>     <description>How many Listener request reaaders to run; tie to a portion # of CPUs (1/4?).     Default is 10 which is too much on a server like this.</description>   </property> </configuration> {code} h2. Sample Connection Implementation Has call-out for where an hbase-indexer would insert its capture code. {code} package org.apache.hadoop.hbase.client; import com.google.protobuf.Descriptors; import com.google.protobuf.Message; import com.google.protobuf.Service; import com.google.protobuf.ServiceException; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.CompareOperator; import org.apache.hadoop.hbase.HTableDescriptor; import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.client.coprocessor.Batch; import org.apache.hadoop.hbase.filter.CompareFilter; import org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel; import org.apache.hadoop.hbase.security.User; import java.io.IOException; import java.util.List; import java.util.Map; import java.util.concurrent.ExecutorService; /**  * Sample class for hbase-indexer.  * DO NOT COMMIT TO HBASE CODEBASE!!!  * Overrides Connection just so we can return a Table that has the  * method that the replication sink calls, i.e. Table#batch.  * It is at this point that the hbase-indexer catches the replication  * stream so it can insert into the lucene index.  */ public class IndexerConnection implements Connection {   private final Configuration conf;   private final User user;   private final ExecutorService pool;   private volatile boolean closed = false;   public IndexerConnection(Configuration conf, ExecutorService pool, User user) throws IOException {     this.conf = conf;     this.user = user;     this.pool = pool;   }   @Override   public void abort(String why, Throwable e) {}   @Override   public boolean isAborted() {     return false;   }   @Override   public Configuration getConfiguration() {     return this.conf;   }   @Override   public BufferedMutator getBufferedMutator(TableName tableName) throws IOException {     return null;   }   @Override   public BufferedMutator getBufferedMutator(BufferedMutatorParams params) throws IOException {     return null;   }   @Override   public RegionLocator getRegionLocator(TableName tableName) throws IOException {     return null;   }   @Override   public Admin getAdmin() throws IOException {     return null;   }   @Override   public void close() throws IOException {     if (!this.closed) this.closed = true;   }   @Override   public boolean isClosed() {     return this.closed;   }   @Override   public TableBuilder getTableBuilder(final TableName tn, ExecutorService pool) {     if (isClosed()) {       throw new RuntimeException("IndexerConnection is closed.");     }     final Configuration passedInConfiguration = getConfiguration();     return new TableBuilder() {       @Override       public TableBuilder setOperationTimeout(int timeout) {         return null;       }       @Override       public TableBuilder setRpcTimeout(int timeout) {         return null;       }       @Override       public TableBuilder setReadRpcTimeout(int timeout) {         return null;       }       @Override       public TableBuilder setWriteRpcTimeout(int timeout) {         return null;       }       @Override       public Table build() {         return new Table() {           private final Configuration conf = passedInConfiguration;           private final TableName tableName = tn;           @Override           public TableName getName() {             return this.tableName;           }           @Override           public Configuration getConfiguration() {             return this.conf;           }           @Override           public void batch(List<? extends Row> actions, Object[] results)           throws IOException, InterruptedException {             // Implementation goes here.           }           @Override           public HTableDescriptor getTableDescriptor() throws IOException {             return null;           }           @Override           public TableDescriptor getDescriptor() throws IOException {             return null;           }           @Override           public boolean exists(Get get) throws IOException {             return false;           }           @Override           public boolean[] existsAll(List<Get> gets) throws IOException {             return new boolean[0];           }           @Override           public <R> void batchCallback(List<? extends Row> actions, Object[] results, Batch.Callback<R> callback) throws IOException, InterruptedException {           }           @Override           public Result get(Get get) throws IOException {             return null;           }           @Override           public Result[] get(List<Get> gets) throws IOException {             return new Result[0];           }           @Override           public ResultScanner getScanner(Scan scan) throws IOException {             return null;           }           @Override           public ResultScanner getScanner(byte[] family) throws IOException {             return null;           }           @Override           public ResultScanner getScanner(byte[] family, byte[] qualifier) throws IOException {             return null;           }           @Override           public void put(Put put) throws IOException {           }           @Override           public void put(List<Put> puts) throws IOException {           }           @Override           public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, byte[] value, Put put) throws IOException {             return false;           }           @Override           public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, Put put) throws IOException {             return false;           }           @Override           public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, Put put) throws IOException {             return false;           }           @Override           public void delete(Delete delete) throws IOException {           }           @Override           public void delete(List<Delete> deletes) throws IOException {           }           @Override           public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, byte[] value, Delete delete) throws IOException {             return false;           }           @Override           public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, Delete delete) throws IOException {             return false;           }           @Override           public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, Delete delete) throws IOException {             return false;           }           @Override           public void mutateRow(RowMutations rm) throws IOException {           }           @Override           public Result append(Append append) throws IOException {             return null;           }           @Override           public Result increment(Increment increment) throws IOException {             return null;           }           @Override           public long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount) throws IOException {             return 0;           }           @Override           public long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount, Durability durability) throws IOException {             return 0;           }           @Override           public void close() throws IOException {           }           @Override           public CoprocessorRpcChannel coprocessorService(byte[] row) {             return null;           }           @Override           public <T extends Service, R> Map<byte[], R> coprocessorService(Class<T> service, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable) throws ServiceException, Throwable {             return null;           }           @Override           public <T extends Service, R> void coprocessorService(Class<T> service, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable, Batch.Callback<R> callback) throws ServiceException, Throwable {           }           @Override           public <R extends Message> Map<byte[], R> batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message request, byte[] startKey, byte[] endKey, R responsePrototype) throws ServiceException, Throwable {             return null;           }           @Override           public <R extends Message> void batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message request, byte[] startKey, byte[] endKey, R responsePrototype, Batch.Callback<R> callback) throws ServiceException, Throwable {           }           @Override           public boolean checkAndMutate(byte[] row, byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value, RowMutations mutation) throws IOException {             return false;           }           @Override           public boolean checkAndMutate(byte[] row, byte[] family, byte[] qualifier, CompareOperator op, byte[] value, RowMutations mutation) throws IOException {             return false;           }           @Override           public void setOperationTimeout(int operationTimeout) {           }           @Override           public int getOperationTimeout() {             return 0;           }           @Override           public int getRpcTimeout() {             return 0;           }           @Override           public void setRpcTimeout(int rpcTimeout) {           }           @Override           public int getReadRpcTimeout() {             return 0;           }           @Override           public void setReadRpcTimeout(int readRpcTimeout) {           }           @Override           public int getWriteRpcTimeout() {             return 0;           }           @Override           public void setWriteRpcTimeout(int writeRpcTimeout) {           }         };       }     };   } } {code}

    Description

      This is a follow-on from HBASE-10504, Define a Replication Interface. There we defined a new, flexible replication endpoint for others to implement but it did little to help the case of the lily hbase-indexer. This issue takes up the case of the hbase-indexer.

      The hbase-indexer poses to hbase as a 'fake' peer cluster (For why hbase-indexer is implemented so, the advantage to having the indexing done in a separate process set that can be independently scaled, can participate in the same security realm, etc., see discussion in HBASE-10504). The hbase-indexer will start up a cut-down "RegionServer" processes that are just an instance of hbase RpcServer hosting an AdminProtos Service. They make themselves 'appear' to the Replication Source by hoisting up an ephemeral znode 'registering' as a RegionServer. The source cluster then streams WALEdits to the Admin Protos method:

       public ReplicateWALEntryResponse replicateWALEntry(final RpcController controller,
            final ReplicateWALEntryRequest request) throws ServiceException {
      

      The hbase-indexer relies on other hbase internals like Server so it can get a ZooKeeperWatcher instance and know the 'name' to use for this cut-down server.

      Thoughts on how to proceed include:

      • Better formalize its current digestion of hbase internals; make it so rpcserver is allowed to be used by others, etc. This would be hard to do given they use basics like Server, Protobuf serdes for WAL types, and AdminProtos Service. Any change in this wide API breaks (again) hbase-indexer. We have made a 'channel' for Coprocessor Endpoints so they continue to work though they use 'internal' types. They can use protos in hbase-protocol. hbase-protocol protos are in a limbo currently where they are sort-of 'public'; a TODO. Perhaps the hbase-indexer could do similar relying on the hbase-protocol (pb2.5) content and we could do something to reveal rpcserver and zk for hbase-indexer safe use.
      • Start an actual RegionServer only have it register the AdminProtos Service only – not ClientProtos and the Service that does Master interaction, etc. [I checked, this is not as easy to do as I at first thought -- St.Ack] Then have the hbase-indexer implement an AdminCoprocessor to override the replicateWALEntry method (the Admin CP implementation may need work). This would narrow the hbase-indexer exposure to that of the Admin Coprocessor Interface
      • Over in HBASE-10504, enis suggested "... if we want to provide isolation for the replication services in hbase, we can have a simple host as another daemon which hosts the ReplicationEndpoint implementation. RS's will use a built-in RE to send the edits to this layer, and the host will delegate it to the RE implementation. The flow would be something like: RS --> RE inside RS --> Host daemon for RE --> Actual RE implementation --> third party system..."

      Other crazy notions occur including the setup of an Admin Interface Coprocessor Endpoint. A new ReplicationEndpoint would feed the replication stream to the remote cluster via the CPEP registered channel.

      But time is short. Hopefully we can figure something that will work in 2.0 timeframe w/o too much code movement.

      Attachments

        1. HBASE-18846.master.007.patch
          72 kB
          Michael Stack
        2. HBASE-18846.master.007.patch
          72 kB
          Michael Stack
        3. HBASE-18846.master.006.patch
          61 kB
          Michael Stack
        4. HBASE-18846.master.005.patch
          61 kB
          Michael Stack
        5. HBASE-18846.master.004.patch
          66 kB
          Michael Stack
        6. IndexerConnection.java
          10 kB
          Michael Stack
        7. hbase-site.xml
          4 kB
          Michael Stack
        8. HBASE-18846.master.003.patch
          75 kB
          Michael Stack
        9. HBASE-18846.master.002.patch
          82 kB
          Michael Stack
        10. HBASE-18846.master.001.patch
          63 kB
          Michael Stack
        11. javadoc.txt
          3 kB
          Michael Stack

        Issue Links

          Activity

            People

              stack Michael Stack
              stack Michael Stack
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: