Apache Arrow / ARROW-12231

[C++][Dataset] Separate datasets backed by readers from InMemoryDataset

Details

    Description

      Follow-up from ARROW-10882 (https://github.com/apache/arrow/pull/9802):

      • Backing an InMemoryDataset with a reader is misleading: a reader can only be consumed once, and its data is not actually held in memory. Let's split that out into a separate class.
      • Dataset scanning can then use an I/O thread for the new class. (Note that for Python, we'll need to be careful to release the GIL before any such operations so that the I/O thread can acquire it to call into the underlying Python reader/file object.)
      • Longer-term, we should integrate with Python's async support.
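
      The split proposed in the first bullet could look roughly like the sketch below. It is a standalone illustration, not Arrow's actual API: Batch, BatchReader, VectorReader, and OneShotDataset are hypothetical names chosen for this example, and the InMemoryDataset shown here is a toy stand-in for the real class. The sketch drains the reader on the calling thread; the I/O thread and GIL handling from the second bullet are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Stand-in for a record batch (assumption: real code would use arrow::RecordBatch).
using Batch = std::vector<int>;

// Hypothetical minimal reader interface: yields batches until exhausted.
class BatchReader {
 public:
  virtual ~BatchReader() = default;
  virtual std::optional<Batch> Next() = 0;
};

// Toy reader over a fixed list of batches, used only to drive the example.
class VectorReader : public BatchReader {
 public:
  explicit VectorReader(std::vector<Batch> batches)
      : batches_(std::move(batches)) {}

  std::optional<Batch> Next() override {
    if (pos_ >= batches_.size()) return std::nullopt;
    return batches_[pos_++];
  }

 private:
  std::vector<Batch> batches_;
  std::size_t pos_ = 0;
};

// Toy in-memory dataset: the data is fully materialized, so it can be
// scanned any number of times.
class InMemoryDataset {
 public:
  explicit InMemoryDataset(std::vector<Batch> batches)
      : batches_(std::move(batches)) {}

  std::vector<Batch> Scan() const { return batches_; }

 private:
  std::vector<Batch> batches_;
};

// Hypothetical reader-backed dataset: the first scan consumes the reader,
// so a second scan is reported as an error instead of silently returning
// no data.
class OneShotDataset {
 public:
  explicit OneShotDataset(std::unique_ptr<BatchReader> reader)
      : reader_(std::move(reader)) {}

  std::vector<Batch> Scan() {
    if (!reader_) {
      throw std::runtime_error("OneShotDataset was already scanned");
    }
    // Take ownership so that later calls see a null reader_.
    std::unique_ptr<BatchReader> reader = std::move(reader_);
    std::vector<Batch> out;
    // A real implementation could run this loop on an I/O thread.
    while (auto batch = reader->Next()) {
      out.push_back(std::move(*batch));
    }
    return out;
  }

 private:
  std::unique_ptr<BatchReader> reader_;
};
```

      With these toy classes, InMemoryDataset::Scan can be called repeatedly and returns the same batches each time, while a second call to OneShotDataset::Scan throws. Making the single-use behavior part of a separate type is the point of the split: callers can tell from the class alone whether a dataset is safe to rescan.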

            People

              lidavidm David Li
              westonpace Weston Pace

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 20m