[SPARK-16044] input_file_name() returns empty strings in data sources based on NewHadoopRDD. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0
Fix Version/s: 1.6.3, 2.0.0
Component/s: SQL
Labels: None

Description

The input_file_name() function returns empty strings instead of file paths when a data source uses NewHadoopRDD. Currently it is supported only for FileScanRDD and HadoopRDD.

To be clear, this does not affect Spark's internal data sources, because none of them currently use NewHadoopRDD.

However, several external data sources do use it. For example,

spark-redshift
spark-xml

Currently, calling this function on such a data source produces the output below:

+-----------------+
|input_file_name()|
+-----------------+
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
+-----------------+
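
The behaviour above can be illustrated with a toy model in plain Python. This is only a sketch, not Spark code: it assumes that input_file_name() reads a per-thread value which the scanning RDD is expected to populate, and every name in it (_holder, scan_recording_path, scan_without_recording, the sample path) is hypothetical. A reader that records the path yields it with each row, while a reader that never records it yields empty strings, as in the table above.

```python
import threading

# Hypothetical stand-in for a per-thread holder read by input_file_name().
_holder = threading.local()


def set_input_file_name(path):
    _holder.path = path


def unset_input_file_name():
    _holder.path = ""


def input_file_name():
    # Falls back to an empty string when no reader recorded a path on this thread.
    return getattr(_holder, "path", "")


def scan_recording_path(path, records):
    """Models a reader that records the current file path before yielding rows."""
    set_input_file_name(path)
    try:
        for record in records:
            yield record, input_file_name()
    finally:
        # Reset so a later scan on the same thread does not see a stale path.
        unset_input_file_name()


def scan_without_recording(path, records):
    """Models a reader that never records the path (the behaviour reported here)."""
    for record in records:
        yield record, input_file_name()


rows = ["a", "b"]
recorded = list(scan_recording_path("/data/part-0.xml", rows))
unrecorded = list(scan_without_recording("/data/part-0.xml", rows))
print(recorded)
print(unrecorded)
```

Under that assumption, a fix along these lines would have the NewHadoopRDD-based reader record the path in the same way the supported readers do.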

Issue Links

links to

[Github] Pull Request #13759 (HyukjinKwon)

[Github] Pull Request #13806 (HyukjinKwon)

People

Assignee: Hyukjin Kwon

Reporter: Hyukjin Kwon

Votes: 0

Watchers: 2

Dates

Created: 18/Jun/16 06:56

Updated: 12/Dec/22 17:50

Resolved: 21/Jun/16 04:55