I could make it the default, but I would like to hear the opinions of people who are running Hadoop clusters. Also, performance numbers could vary a lot based on the operating system and filesystem (CentOS, Red Hat, Windows, ext4, xfs, etc.), so it would be difficult to pick the right default based solely on performance. On the other hand, if the entire community thinks it is better to have a default that prevents data loss at all costs, then this could be the default. If the debate on either side is fierce, then I would like to get this in first and then open another JIRA to debate the default settings.
We are definitely going to deploy this first on our "archival" cluster. This is a cluster that is used purely to back up and restore data from MySQL databases.