[BEAM-12741] Read multiple files keeping track of file names (Python) - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: P3
Resolution: Duplicate
Affects Version/s: 2.31.0
Fix Version/s: Missing
Component/s: io-py-files
Labels:
- io
- python
- text

Language:
- Python

Description

When reading lines from text files matched by multiple file patterns, it is sometimes useful to keep track of the file name each line came from. Example: read tab-delimited files and map their lines to column headers that come from separate files.

It would be nice to have a ReadAllFromTextWithFilename transform that modifies the ReadAllFromText transform in the same way that ReadFromTextWithFilename modifies the ReadFromText transform, so that it produces tuples of file names paired with text lines.
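Until such a transform is available, the requested behavior can be approximated with the existing `apache_beam.io.fileio` transforms. The following is only a minimal sketch under stated assumptions, not the implementation proposed in this ticket: the helper names `lines_with_filename` and `read_all_with_filename` are hypothetical, and each file is read fully into memory, which is acceptable only for small files.

```python
def lines_with_filename(readable_file):
    """Yield (file path, line) pairs for one matched file.

    `readable_file` is expected to behave like a fileio.ReadableFile:
    it exposes `metadata.path` and `read_utf8()`.
    """
    path = readable_file.metadata.path
    # Assumption: the whole file fits in memory. Note that splitlines()
    # treats a few uncommon separators as line breaks as well.
    for line in readable_file.read_utf8().splitlines():
        yield path, line


def read_all_with_filename(pipeline, patterns):
    """Hypothetical helper: expand file patterns into (path, line) tuples."""
    # Imported here so the pure-Python helper above is usable without Beam.
    import apache_beam as beam
    from apache_beam.io import fileio

    return (
        pipeline
        | "CreatePatterns" >> beam.Create(patterns)
        | "MatchAll" >> fileio.MatchAll()
        | "ReadMatches" >> fileio.ReadMatches()
        | "LinesWithFilename" >> beam.FlatMap(lines_with_filename)
    )
```

The resulting tuples are keyed by file path, so the lines can then be joined with per-file column headers, as in the tab-delimited use case described above.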

Issue Links

- duplicates: BEAM-12665 Add option to return filename from ReadAll transforms (Resolved)
- is related to: BEAM-6167 Create a Class to read content of a file keeping track of the file path (python) (Resolved)
- links to: GitHub Pull Request #15315

People

Assignee: Unassigned

Reporter: Eugene Nikolaiev

Votes: 0

Watchers: 0

Dates

Created: 11/Aug/21 12:51

Updated: 13/Jan/22 21:49

Resolved: 11/Aug/21 18:29

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 1h 50m