  1. Hadoop YARN
  2. YARN-10444

Use openFile() with sequential IO for localizing files

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3.0
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels:
      None
    • Target Version/s:

      Description

      HADOOP-16202 adds standard options for declaring the read/seek
      policy when reading a file. These should be set to sequential IO
      when localizing resources, so that even if the default/cluster settings
      for a filesystem are optimized for random IO, artifact downloads
      are still read at the maximum possible speed (one big GET to the EOF).

      Most of this happens in hadoop-common, but some tuning of FSDownload
      can assist:

      • The tar/jar download must also be sequential.
      • If the FileStatus is passed around, it can be supplied
        in the open request to skip checks when loading the file.

      Together, these changes can save 3 HEAD requests per resource, and the
      sequential IO avoids splitting the big read into separate block GETs.
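
      A minimal sketch of what such a read could look like, assuming the
      openFile() builder API in hadoop-common and the
      fs.option.openfile.read.policy option name standardized by HADOOP-16202.
      The class and method names (SequentialLocalizer, copy) are hypothetical
      and are not taken from FSDownload. The policy is set with opt() rather
      than must(), so a filesystem that does not recognize it is expected to
      ignore it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ExecutionException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

/** Hypothetical helper for illustration; not the actual FSDownload code. */
public class SequentialLocalizer {

  /** Read policy option name (assumed from HADOOP-16202). */
  static final String READ_POLICY = "fs.option.openfile.read.policy";

  /**
   * Copy a remote resource to a local stream in one sequential read.
   *
   * @param fs     filesystem hosting the resource
   * @param status FileStatus already obtained by the caller
   * @param out    destination stream (not closed here)
   * @return number of bytes copied
   */
  public static long copy(FileSystem fs, FileStatus status, OutputStream out)
      throws IOException, InterruptedException, ExecutionException {
    try (FSDataInputStream in = fs.openFile(status.getPath())
        // Optional hint: read from start to end, no random seeks.
        .opt(READ_POLICY, "sequential")
        // Hand over the known status so the store may skip its own probes.
        .withFileStatus(status)
        .build()   // CompletableFuture<FSDataInputStream>
        .get()) {  // block until the stream is open
      byte[] buffer = new byte[64 * 1024];
      long total = 0;
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
        total += n;
      }
      return total;
    }
  }
}
```

      The FileStatus passed in should be the one returned for the same path
      by the same filesystem; some connectors may reject a status whose path
      does not match the file being opened.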

              People

              • Assignee:
                stevel@apache.org Steve Loughran
                Reporter:
                stevel@apache.org Steve Loughran
              • Votes:
                0
                Watchers:
                2

                Dates

                • Created:
                  Updated: