  1. Hadoop Common
  2. HADOOP-18855 VectorIO API tuning/stabilization
  3. HADOOP-19229

Vector IO on cloud storage: what is a good minimum seek size?

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.4.1
    • Fix Version/s: None
    • Component/s: fs/s3

    Description

      Vector IO has a maximum size for coalesced ranges, but it also needs a maximum gap between ranges to justify a merge. Right now a read can merge two 8-byte ranges that have a 1 MB gap between them, and that is wasteful: almost every byte fetched is discarded.
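
To make the problem concrete, here is a minimal sketch of range coalescing with both limits applied. It is an illustration only, not Hadoop's implementation: the function `coalesce` and the parameters `max_gap` and `max_size` are hypothetical names chosen for this example, and ranges are assumed to be `(offset, length)` pairs.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length); hypothetical representation


def coalesce(ranges: List[Range], max_gap: int, max_size: int):
    """Merge ranges when the gap to the previous merged span is <= max_gap
    and the resulting span stays <= max_size.

    Returns a list of (start, end, members) tuples, end exclusive.
    """
    merged = []
    for offset, length in sorted(ranges):
        end = offset + length
        if merged:
            m_start, m_end, members = merged[-1]
            gap = offset - m_end            # bytes that would be read and discarded
            new_end = max(m_end, end)
            if gap <= max_gap and new_end - m_start <= max_size:
                merged[-1] = (m_start, new_end, members + [(offset, length)])
                continue
        merged.append((offset, end, [(offset, length)]))
    return merged


MB = 1024 * 1024
two_small = [(0, 8), (8 + MB, 8)]   # two 8-byte ranges, 1 MB apart

# With only a size limit in effect (gap limit as large as the gap itself),
# the two ranges collapse into one read of just over 1 MB.
print(coalesce(two_small, max_gap=MB, max_size=2 * MB))

# With a small gap limit (4 KB here, an arbitrary value), they stay separate.
print(coalesce(two_small, max_gap=4096, max_size=2 * MB))
```

What the right gap threshold is for a given object store is exactly the open question of this issue; the 4 KB value above is a placeholder, not a recommendation.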

      We could also consider an "efficiency" metric that looks at the ratio of bytes read to bytes discarded. Not sure what we'd do with it, but we could track it as an IOStat.
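
One possible shape for such a metric, sketched under assumptions: the function `read_efficiency` is a hypothetical helper, ranges are non-overlapping `(offset, length)` pairs served by a single merged read, and the "useful fraction" (requested bytes over bytes fetched) is used alongside the raw counts because a plain read/discarded ratio is undefined when nothing is discarded.

```python
from typing import List, Tuple


def read_efficiency(ranges: List[Tuple[int, int]]) -> Tuple[int, int, float]:
    """For one merged read covering all ranges, return
    (requested_bytes, discarded_bytes, useful_fraction).

    Assumes non-empty, non-overlapping (offset, length) ranges.
    """
    start = min(offset for offset, _ in ranges)
    end = max(offset + length for offset, length in ranges)
    fetched = end - start                       # bytes the merged read pulls in
    requested = sum(length for _, length in ranges)
    discarded = fetched - requested             # gap bytes thrown away
    return requested, discarded, requested / fetched


MB = 1024 * 1024
requested, discarded, fraction = read_efficiency([(0, 8), (8 + MB, 8)])
print(requested, discarded, f"{fraction:.6%}")
```

For the two 8-byte ranges with a 1 MB gap, this reports 16 requested bytes against 1,048,576 discarded, a useful fraction of roughly 0.0015%.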

            People

              stevel@apache.org Steve Loughran