I disagree. It is an explicit decision not to have the ZKFC act as a service supervisor, because that adds a lot of complexity. Plenty of solutions for service management already exist: we assume that the user is already using something like puppet, daemontools, supervisord, cron, etc., to make sure the daemon restarts eventually.
I did not find a reference to an external monitoring tool in the HA design docs, so apologies there. If the scanning interval of the external tool is significant, it might still make sense for the FC to restart the NN directly. With one of the NN processes down, the cluster is functioning in a degraded state, and the longer it takes to restart the standby NN process, the longer the recovery time is going to be.