Description
The issue is that the input_file_name() function returns an empty string instead of the file path when a data source uses NewHadoopRDD. File paths are currently populated only for FileScanRDD and HadoopRDD.
To be clear, this does not affect Spark's internal data sources, because none of them currently use NewHadoopRDD.
However, several external data sources do use it. For example:
spark-redshift - here
spark-xml - here
Currently, using this function with such a data source produces the output below (every row is empty):
+-----------------+
|input_file_name()|
+-----------------+
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
+-----------------+
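
The behavior can probably be reproduced without any external data source by building a DataFrame on top of an RDD created with SparkContext.newAPIHadoopFile, which is backed by NewHadoopRDD. The following is only a minimal sketch: the object name, the application name, and the input path /tmp/example.txt are placeholders that do not come from this report, and it assumes a local Spark 2.x build that has the issue.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.input_file_name

// Hypothetical reproduction; the object name and path are illustrative only.
object InputFileNameRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("input-file-name-repro")
      .getOrCreate()
    import spark.implicits._

    // Placeholder path: any existing text file should do.
    val path = "/tmp/example.txt"

    // newAPIHadoopFile uses the new Hadoop MapReduce API,
    // so the resulting RDD is backed by NewHadoopRDD.
    val lines = spark.sparkContext
      .newAPIHadoopFile[LongWritable, Text, TextInputFormat](path)
      .map { case (_, text) => text.toString }

    // On an affected build, the input_file_name() column is expected to be empty.
    lines.toDF("value").select(input_file_name()).show()

    spark.stop()
  }
}
```

Swapping newAPIHadoopFile for SparkContext.textFile, which goes through HadoopRDD, should give a side-by-side comparison with the supported code path.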