Apache Arrow / ARROW-12945

[C++][Dataset] Refactor InMemoryDataset to inherit FragmentDataset


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++
    • Labels: None

    Description

      InMemoryDataset could inherit from FragmentDataset. More generally, it would be beneficial for every dataset to hold a vector of its fragments; this would allow subtree pruning to be applied to any dataset during predicate pushdown.
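To make the pruning idea concrete, here is a minimal, self-contained sketch. It does not use the Arrow API: `Node`, `EqualsPredicate`, `Partition`, and `CollectFragments` are hypothetical names invented for illustration, and guarantees are reduced to a single `column == value` pair. It only shows why having fragments grouped under shared guarantees lets a filter skip a whole subtree with one check.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types; these are NOT the Arrow dataset classes.
// A node groups fragments that share a guarantee such as `year == 2019`.
struct Node {
  bool has_guarantee = false;          // the root typically has none
  std::string column;                  // guaranteed column name
  int value = 0;                       // guaranteed value for that column
  std::vector<Node> children;          // more specific sub-partitions
  std::vector<std::string> fragments;  // fragment names stored at this node
};

// The simplest pushdown filter: `column == value`.
struct EqualsPredicate {
  std::string column;
  int value;
};

// Helper (hypothetical) that builds a node carrying one guarantee.
inline Node Partition(std::string column, int value) {
  Node node;
  node.has_guarantee = true;
  node.column = std::move(column);
  node.value = value;
  return node;
}

// Depth-first walk that abandons a subtree as soon as its guarantee
// contradicts the predicate. `visited` counts the nodes actually inspected.
inline void CollectFragments(const Node& node, const EqualsPredicate& predicate,
                             std::vector<std::string>* out, int* visited) {
  assert(out != nullptr && visited != nullptr);
  ++*visited;
  if (node.has_guarantee && node.column == predicate.column &&
      node.value != predicate.value) {
    return;  // no row below this node can match: prune the whole subtree
  }
  out->insert(out->end(), node.fragments.begin(), node.fragments.end());
  for (const Node& child : node.children) {
    CollectFragments(child, predicate, out, visited);
  }
}
```

With a tree partitioned by year and then by month, a filter on one year never descends into the month nodes of any other year, which is the saving a flat per-fragment check cannot provide.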

      See also ARROW-8065, which had an unmet goal of making Dataset a concrete class that contains fragments (essentially FragmentDataset), with subclasses simply adding guarantees about those fragments (for example, FileSystemDataset contains only FileFragments).

      See also ARROW-12891 (add support for subtree pruning to FragmentDataset)

      NB: This will require either promoting FragmentDataset to a public class or demoting InMemoryDataset to an internal class (with public factories).
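The shape described above (a concrete Dataset that owns its fragments, subclasses that only add guarantees, and public factories in place of public classes) might look roughly like the following. This is a sketch under stated assumptions, not Arrow code: the classes reuse Arrow's names for readability but are simplified stand-ins, and `Kind()`, `Make`, and `MakeInMemoryDataset` are hypothetical signatures chosen for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; NOT the Arrow classes of the same names.
struct Fragment {
  virtual ~Fragment() = default;
  virtual std::string Kind() const = 0;
};

struct InMemoryFragment : Fragment {
  std::string Kind() const override { return "in-memory"; }
};

struct FileFragment : Fragment {
  explicit FileFragment(std::string path) : path(std::move(path)) {}
  std::string Kind() const override { return "file"; }
  std::string path;
};

using FragmentVector = std::vector<std::shared_ptr<Fragment>>;

// Concrete base class: every dataset is just a vector of fragments.
class Dataset {
 public:
  explicit Dataset(FragmentVector fragments) : fragments_(std::move(fragments)) {}
  virtual ~Dataset() = default;
  const FragmentVector& fragments() const { return fragments_; }

 private:
  FragmentVector fragments_;
};

// Subclass that adds one guarantee: every fragment is a FileFragment.
// The guarantee is enforced in the factory; the constructor stays private.
class FileSystemDataset : public Dataset {
 public:
  // Returns nullptr if any fragment is not a FileFragment.
  static std::shared_ptr<FileSystemDataset> Make(FragmentVector fragments) {
    for (const auto& fragment : fragments) {
      if (std::dynamic_pointer_cast<FileFragment>(fragment) == nullptr) {
        return nullptr;
      }
    }
    return std::shared_ptr<FileSystemDataset>(
        new FileSystemDataset(std::move(fragments)));
  }

 private:
  explicit FileSystemDataset(FragmentVector fragments)
      : Dataset(std::move(fragments)) {}
};

// Public factory for in-memory data. No dedicated public class is needed,
// because the result is an ordinary Dataset over in-memory fragments.
inline std::shared_ptr<Dataset> MakeInMemoryDataset(int num_fragments) {
  assert(num_fragments >= 0);
  FragmentVector fragments;
  for (int i = 0; i < num_fragments; ++i) {
    fragments.push_back(std::make_shared<InMemoryFragment>());
  }
  return std::make_shared<Dataset>(std::move(fragments));
}
```

In this arrangement, anything written against `Dataset::fragments()`, such as pruning, works for every dataset kind, while `FileSystemDataset::Make` is the only place that needs to know about the file-only restriction.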

          People

            Assignee: Unassigned
            Reporter: Ben Kietzman (bkietz)
            Votes: 0
            Watchers: 4
