I don't want to seem argumentative, because I really don't know whether using this cache for the WAL is a good idea. But I can think of some issues:
- Hopefully, in your clusters, recovery is an unusual operation
- The WAL has to be written to disk to survive power loss, which makes it a bad candidate for RAM-only storage
- Others have purposely turned off caching of WAL data to free memory for other things, since that data is rarely read at all
We already know we can improve recovery time by reducing the maximum WAL size, parallelizing the read/sort, and computing a better leaseRecovery timeout. I would strongly suggest a more in-depth look into recovery before even experimenting with HDFS caching.