Using Dataset for the initial load when inferring the schema has advantages. Please refer
It seems the JSON case was supposed to be fixed at the same time but was missed, according to https://github.com/apache/spark/pull/15813:
"A similar problem also affects the JSON file format and this patch originally fixed that as well, but I've decided to split that change into a separate patch so as not to conflict with changes in another JSON PR."
Also, because the current implementation does not use FileScanRDD, some functionality is affected. This problem is described in
SPARK-19885 (although that issue covers the CSV case).