Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
This follows from the discussion on ARROW-15260. If we run the following code in R, we might expect the filter on the filename column to be pushed down, so that only the relevant files are read:
filter = Expression$create( "match_substring", Expression$field_ref("__filename"), options = list(pattern = "cyl=8") )
As mentioned by westonpace:
"You might think we would get the hint and only read files matching that pattern. This is not the case. We will read the entire dataset and apply the "cyl=8" filter in memory.
If we want to pushdown filters on the filename column we will need to add some special logic."
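To make the requested behaviour concrete, here is a minimal sketch in base R of what pruning on the filename could look like: the substring predicate is evaluated against the file paths alone, and only the matching files are read. This is not Arrow's implementation and it does not use the arrow package. The CSV layout, the `root` directory, and the names `all_files`, `pruned_files`, and `result` are assumptions made purely for illustration.

```r
# Hypothetical layout: a hive-style partitioned dataset stored as CSV files,
# one directory per value of `cyl` (built from the built-in mtcars data).
root <- file.path(tempdir(), "mtcars_by_cyl")
unlink(root, recursive = TRUE)
for (part in split(mtcars, mtcars$cyl)) {
  part_dir <- file.path(root, paste0("cyl=", part$cyl[1]))
  dir.create(part_dir, recursive = TRUE)
  write.csv(part, file.path(part_dir, "part-0.csv"), row.names = FALSE)
}

# Step 1: list every file in the dataset. No data is read at this point.
all_files <- list.files(root, recursive = TRUE, full.names = TRUE)

# Step 2: evaluate the filename predicate against the paths only.
# fixed = TRUE gives a plain substring match rather than a regex match.
pruned_files <- all_files[grepl("cyl=8", all_files, fixed = TRUE)]

# Step 3: read only the files that survived the pruning step.
result <- do.call(rbind, lapply(pruned_files, read.csv))
```

The key design point is that step 2 needs nothing but the list of paths, so files that cannot match are never opened. Until pushdown on the filename column exists, a similar manual pruning of the file list before opening the dataset may serve as a workaround.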
Issue Links
- is related to ARROW-15260 [R] open_dataset - add file_name as column (Resolved)