Apache Arrow / ARROW-10788

[C++] Make S3 recursive walks parallel

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: C++

    Description

      Doing a recursive S3 directory walk using GetFileInfo(Selector) currently lists all encountered directories serially, waiting for the results of one directory listing (or portion thereof) before launching the next one. Instead, we should use the Async APIs provided by the AWS SDK to parallelize HTTP requests as much as possible.
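
      A minimal sketch of the difference between serial and parallel listing, assuming an in-memory stand-in for the bucket. It does not use the AWS SDK or Arrow's S3 filesystem code: `FakeBucket`, `ListDir` and `ParallelWalk` are hypothetical names introduced for illustration, and `std::async` stands in for the SDK's Async calls. Each level of the tree is listed concurrently, so no listing waits for an unrelated sibling to finish.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bucket: maps a directory path to the names of
// its immediate children. Names ending in '/' denote subdirectories.
using FakeBucket = std::map<std::string, std::vector<std::string>>;

// Simulates one blocking "list this directory" request.
std::vector<std::string> ListDir(const FakeBucket& bucket, const std::string& dir) {
  auto it = bucket.find(dir);
  return it == bucket.end() ? std::vector<std::string>{} : it->second;
}

// Walks the tree level by level. All listings of one level are launched
// before any result is awaited, so they can run concurrently.
std::vector<std::string> ParallelWalk(const FakeBucket& bucket, const std::string& root) {
  std::vector<std::string> results;
  std::vector<std::string> pending{root};
  while (!pending.empty()) {
    std::vector<std::future<std::vector<std::string>>> futures;
    for (const auto& dir : pending) {
      // `dir` is copied into the task; `bucket` outlives every task because
      // all futures are consumed before this function returns.
      futures.push_back(std::async(std::launch::async,
                                   [&bucket, dir] { return ListDir(bucket, dir); }));
    }
    assert(futures.size() == pending.size());
    std::vector<std::string> next;
    for (std::size_t i = 0; i < futures.size(); ++i) {
      for (const auto& child : futures[i].get()) {
        std::string path = pending[i] + child;
        if (!child.empty() && child.back() == '/') next.push_back(path);
        results.push_back(std::move(path));
      }
    }
    pending = std::move(next);
  }
  // Completion order is not meaningful, so return a deterministic ordering.
  std::sort(results.begin(), results.end());
  return results;
}
```

      Calling `ParallelWalk(bucket, "")` on a `FakeBucket` whose keys are directory paths returns the sorted full paths of every file and subdirectory found. A real implementation would also need to bound the number of in-flight requests and handle paginated listings, which this sketch leaves out.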

            People

              Assignee: Antoine Pitrou (apitrou)
              Reporter: Antoine Pitrou (apitrou)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m