Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.5
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      We can improve the statistics collected in the S3A committer and saved to its JSON output.

      Key ones:

      1. Number of task manifests read, and the duration of the loads
      2. Size of each manifest
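
A minimal sketch of how these two statistics could be gathered, using only the JDK. The class name `ManifestLoadStats`, its methods, and the JSON key names are hypothetical and chosen for illustration; they are not part of the S3A committer API, and the real committer may well record these values through its existing statistics machinery instead.

```java
import java.util.LongSummaryStatistics;

// Hypothetical helper for illustration only; not an S3A committer class.
final class ManifestLoadStats {
  // One sample per manifest loaded: its size and how long the load took.
  private final LongSummaryStatistics sizesBytes = new LongSummaryStatistics();
  private final LongSummaryStatistics loadNanos = new LongSummaryStatistics();

  // Record a single manifest load; safe to call from several loader threads.
  synchronized void record(long sizeBytes, long durationNanos) {
    sizesBytes.accept(sizeBytes);
    loadNanos.accept(durationNanos);
  }

  synchronized long manifestsRead() {
    return sizesBytes.getCount();
  }

  synchronized long totalBytes() {
    return sizesBytes.getSum();
  }

  // Returns Long.MIN_VALUE if nothing has been recorded yet.
  synchronized long maxBytes() {
    return sizesBytes.getMax();
  }

  synchronized long totalLoadMillis() {
    return loadNanos.getSum() / 1_000_000L;
  }

  // Assumed key names; the committer's actual JSON schema may differ.
  synchronized String toJson() {
    return "{\"manifests_read\": " + manifestsRead()
        + ", \"manifest_bytes_total\": " + totalBytes()
        + ", \"manifest_bytes_max\": " + maxBytes()
        + ", \"manifest_load_millis_total\": " + totalLoadMillis() + "}";
  }
}
```

A loader would wrap each read with `System.nanoTime()` and call `record(size, elapsed)`. The sketch keeps only aggregates; if the size of every individual manifest is wanted in the JSON, a list of sizes would have to be kept as well.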

      I think we would also benefit from making the commit thread pools large but shared across all jobs (i.e. a demand-created thread pool in the S3A filesystem). That would allow a pool size of, say, 500 while still supporting many jobs actively committing at the same time (a busy Spark driver).
      Finally: should the file commit pool be larger than the pool of manifest readers? I think it could be, but the ratio should be fairly low.
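
The shared-pool idea could be sketched roughly as follows with plain `java.util.concurrent`. Everything here is an assumption made for illustration: `SharedCommitPool`, `JobView`, and the limits are hypothetical names and values, not existing S3A classes or configuration options. One large pool is created on first demand, and each job submits through its own view that caps how many of its tasks occupy the pool at once.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not an S3A class.
final class SharedCommitPool {
  private static ThreadPoolExecutor shared;

  // Create the pool on first use; later calls return the same instance
  // (the maxThreads argument is ignored once the pool exists).
  static synchronized ExecutorService shared(int maxThreads) {
    if (shared == null) {
      ThreadPoolExecutor pool = new ThreadPoolExecutor(
          maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
          new LinkedBlockingQueue<>(),
          r -> {
            Thread t = new Thread(r, "shared-commit");
            t.setDaemon(true);
            return t;
          });
      // Threads start only as work arrives and exit after 60s idle.
      pool.allowCoreThreadTimeOut(true);
      shared = pool;
    }
    return shared;
  }

  // Per-job view limiting how many of the job's tasks are queued or running.
  static final class JobView {
    private final ExecutorService pool;
    private final Semaphore permits;

    JobView(ExecutorService pool, int maxActive) {
      this.pool = pool;
      this.permits = new Semaphore(maxActive);
    }

    // Blocks the caller while the job is already at its limit.
    <T> Future<T> submit(Callable<T> task) throws InterruptedException {
      permits.acquire();
      try {
        return pool.submit(() -> {
          try {
            return task.call();
          } finally {
            permits.release();
          }
        });
      } catch (RejectedExecutionException e) {
        permits.release();
        throw e;
      }
    }
  }
}
```

Under this sketch, a job could hold two views over the same pool, one for manifest reading and one for file commits, with the commit view given a somewhat larger limit; that is one way to express a low ratio between the two without creating separate pools per job.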

            People

              Assignee: Unassigned
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 2
