  1. Hadoop Common
  2. HADOOP-18103 High performance vectored read API in Hadoop
  3. HADOOP-18347

Restrict vectoredIO threadpool to reduce memory pressure


Details

    Description

      https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L964-L967

      Currently, S3AInputStream fetches all the ranges of a vectored read using an unbounded thread pool. This does not cause memory pressure with standard benchmarks such as TPC-DS. However, when a large number of ranges is requested against large files, the memory usage of the task could spike. Limiting the thread pool size could reduce that memory usage.
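      To illustrate the idea of limiting the pool, here is a minimal, self-contained sketch. It is not the S3AInputStream code: the class BoundedRangeReads, the Range type, readRange, readAll and the maxThreads parameter are all hypothetical names chosen for this example, and the remote read is simulated with a sleep. It only shows that a fixed-size pool caps how many range buffers are being filled at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from hadoop-aws.
class BoundedRangeReads {

    /** Hypothetical stand-in for a file range: an offset and a length. */
    static final class Range {
        final long offset;
        final int length;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Counters used to observe how many reads run concurrently.
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger peakInFlight = new AtomicInteger();

    /** Simulates one ranged read: allocates the buffer, then waits. */
    static byte[] readRange(Range r) throws InterruptedException {
        int now = inFlight.incrementAndGet();
        peakInFlight.accumulateAndGet(now, Math::max);
        try {
            byte[] buf = new byte[r.length];
            Thread.sleep(5); // stand-in for remote read latency
            return buf;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    /** Reads all ranges with at most maxThreads reads active at once. */
    static long readAll(List<Range> ranges, int maxThreads) throws Exception {
        // A fixed-size pool queues extra tasks instead of starting new threads.
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (Range r : ranges) {
                futures.add(pool.submit(() -> readRange(r)));
            }
            long total = 0;
            for (Future<byte[]> f : futures) {
                total += f.get().length;
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < 64; i++) {
            ranges.add(new Range(i * 4096L, 4096));
        }
        long total = readAll(ranges, 4);
        System.out.println("bytes read: " + total
                + ", peak concurrent reads: " + peakInFlight.get());
    }
}
```

      Note that this sketch only bounds the reads that are actively in progress. Buffers of completed reads stay referenced by their futures until the caller consumes them, so a real fix would probably also need to consider how quickly results are drained.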

            People

              mthakur Mukund Thakur
              rajesh.balamohan Rajesh Balamohan
