Details
Umbrella
Status: Open
Major
Resolution: Unresolved
None
None
beginner
Description
Our Chaos menu is "drawing room polite" given the variety of failures available out in the wild world of deploys.
Other possible items to add (could do as subtasks of this umbrella) taken from a recent interesting read on how TiDB does its chaos:
- Send SIGSTOP to hang or SIGCONT to resume the process.
- Use `renice` to adjust the process priority or use `setpriority` for the threads of the process.
- Max out the CPU.
- Use `iptables` or `tc` to drop, reject, or delay network packets.
- Use `tc` to reorder network packets and use a proxy to reorder the gRPC requests.
- Use `iperf` to take all network throughput.
- Use `libfuse` to mount a file system and do the I/O fault injection.
- Link `libfiu` to do the I/O fault injection.
- Use `rm -rf` to forcibly remove all data.
- Use `echo 0 > file` to damage a file.
- Copy a huge file to create the `NoSpace` problem.
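As a rough illustration of two of the items above (the SIGSTOP/SIGCONT hang and the `echo 0 > file` damage), here is a minimal Python sketch. The function names `pause_process` and `damage_file` are made up for this example and are not part of any existing chaos tooling; the demo only touches a throwaway child process and a temp file, and assumes a POSIX host.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time


def pause_process(pid: int, seconds: float) -> None:
    """Hang a process with SIGSTOP, then resume it with SIGCONT.

    Hypothetical helper: a real chaos action would look up the server
    pid on a remote node instead of taking it as an argument.
    """
    os.kill(pid, signal.SIGSTOP)      # process is no longer scheduled
    try:
        time.sleep(seconds)           # window in which peers see a hung node
    finally:
        os.kill(pid, signal.SIGCONT)  # always resume, even if interrupted


def damage_file(path: str) -> None:
    """Rough equivalent of `echo 0 > file`: truncate, then write "0"."""
    with open(path, "w") as f:
        f.write("0\n")


def demo() -> None:
    """Exercise both helpers against disposable targets only."""
    # Throwaway child standing in for a server process.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        pause_process(child.pid, 0.5)
        print("child still alive after resume:", child.poll() is None)
    finally:
        child.kill()
        child.wait()

    # Temp file standing in for a data file.
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write("important data\n" * 10)
    damage_file(tmp.name)
    with open(tmp.name) as f:
        print("file contents after damage:", repr(f.read()))
    os.remove(tmp.name)


if __name__ == "__main__":
    demo()
```

The `try`/`finally` around the sleep is deliberate: a chaos action that is interrupted halfway should not leave the target stopped forever. The network, CPU, and disk-full items need root or external tools (`tc`, `iptables`, `iperf`), so they are left out of this sketch.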
The article includes other interesting possibilities: exploiting the kernel's fault-injection mechanism or scripting SystemTap to mess with nodes. It also describes how they automate their chaos-making.