HBase / HBASE-14261

Enhance Chaos Monkey framework by adding zookeeper and datanode fault injections.


    Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      This change augments the existing chaos monkey framework with actions for restarting the underlying ZooKeeper quorum and HDFS nodes of a distributed HBase cluster. One assumption made while creating the ZooKeeper actions is that the ZooKeeper ensemble is an independent external service that is not managed by the HBase cluster. For these actions to work as expected, the following parameters need to be configured appropriately.

      {code}
      <property>
        <name>hbase.it.clustermanager.hadoop.home</name>
        <value>$HADOOP_HOME</value>
      </property>
      <property>
        <name>hbase.it.clustermanager.zookeeper.home</name>
        <value>$ZOOKEEPER_HOME</value>
      </property>
      <property>
        <name>hbase.it.clustermanager.hbase.user</name>
        <value>hbase</value>
      </property>
      <property>
        <name>hbase.it.clustermanager.hadoop.hdfs.user</name>
        <value>hdfs</value>
      </property>
      <property>
        <name>hbase.it.clustermanager.zookeeper.user</name>
        <value>zookeeper</value>
      </property>
      {code}

      The service-user configurations are newly introduced because in prod/test environments each service is managed by a different user. Once the above parameters are configured properly, the new actions can be used as needed. An example invocation of these new actions is:

      {{./hbase org.apache.hadoop.hbase.IntegrationTestAcidGuarantees -m serverAndDependenciesKilling}}
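
Before launching a run like the one above, it may help to confirm that all five properties are actually present in the configuration file. The following is only a minimal sketch: the property names come from this release note, while the function name `check_chaos_props` and the idea of passing the configuration file as an argument are illustrative assumptions, not part of the patch. It also only checks that each `<name>` element exists, not that its value is sensible.

```shell
#!/bin/sh
# Sketch: report which chaos-monkey cluster-manager properties are missing
# from a configuration file. The function name and the file argument are
# hypothetical; only the property names are taken from the release note.
check_chaos_props() {
  conf_file="$1"
  missing=0
  for prop in \
      hbase.it.clustermanager.hadoop.home \
      hbase.it.clustermanager.zookeeper.home \
      hbase.it.clustermanager.hbase.user \
      hbase.it.clustermanager.hadoop.hdfs.user \
      hbase.it.clustermanager.zookeeper.user
  do
    # -F matches the name literally, so the dots are not treated as wildcards.
    if ! grep -qF "<name>${prop}</name>" "$conf_file"; then
      echo "missing: ${prop}"
      missing=1
    fi
  done
  # Exit status 0 when every property was found, 1 otherwise.
  return "$missing"
}
```

Assuming the properties live in a file such as `conf/hbase-site.xml` (the location is an assumption here), the check could gate the test run, for example `check_chaos_props conf/hbase-site.xml && ./hbase org.apache.hadoop.hbase.IntegrationTestAcidGuarantees -m serverAndDependenciesKilling`.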




      Description

      One of the shortcomings of the existing ChaosMonkey framework is the lack of fault injections for HBase dependencies such as ZooKeeper and HDFS. This patch attempts to partially solve this problem by adding DataNode and ZooKeeper node fault injections.

        Attachments

        1. HBASE-14261.branch-1_v2.patch
          37 kB
          Srikanth Srungarapu
        2. HBASE-14261.patch
          39 kB
          Srikanth Srungarapu
        3. HBASE-14261-0.98-addendum.patch
          7 kB
          Andrew Kyle Purtell
        4. HBASE-14261-addendum.patch
          0.7 kB
          Srikanth Srungarapu
        5. HBASE-14261-branch-1_v3.patch
          38 kB
          Srikanth Srungarapu
        6. HBASE-14261-branch-1_v4.patch
          39 kB
          Srikanth Srungarapu
        7. HBASE-14261-branch-1.patch
          28 kB
          Srikanth Srungarapu


            People

            • Assignee:
              srikanth235 Srikanth Srungarapu
              Reporter:
              srikanth235 Srikanth Srungarapu
            • Votes:
              0
              Watchers:
              10
