  1. Hadoop Common
  2. HADOOP-17469

IOStatistics Phase II

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1
    • Fix Version/s: None
    • Component/s: fs, fs/azure, fs/s3
    • Labels: None

    Description

      Continue IOStatistics development, with the goals of:

      • Easy adoption in applications
      • Better instrumentation in the Hadoop codebase (distcp?)
      • More stats in the abfs and s3a connectors

      A key requirement is a thread-level context for statistics, so that app code doesn't have to explicitly ask for the stats for each worker thread. Instead, filesystem components update the context stats as well as the thread stats (when?), and apps can then pick them up.
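A minimal sketch of what such a thread-level context could look like, assuming a `ThreadLocal` holder. The class `StatsContextSketch`, its methods, and the stand-in `fakeRead()` are hypothetical names used only for illustration; they are not the Hadoop IOStatistics API, and the statistic keys are illustrative.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-thread statistics context; not the Hadoop API. */
final class StatsContextSketch {

  /** Counters keyed by statistic name; LongAdder keeps increments lock-free. */
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Each thread lazily gets its own context unless one is set explicitly. */
  private static final ThreadLocal<StatsContextSketch> ACTIVE =
      ThreadLocal.withInitial(StatsContextSketch::new);

  /** The context that components on this thread should update. */
  static StatsContextSketch current() {
    return ACTIVE.get();
  }

  /** Replace the calling thread's context, e.g. at the start of a task. */
  static void setCurrent(StatsContextSketch context) {
    ACTIVE.set(context);
  }

  void increment(String key, long delta) {
    counters.computeIfAbsent(key, k -> new LongAdder()).add(delta);
  }

  /** Point-in-time copy of the counters, sorted by key. */
  Map<String, Long> snapshot() {
    Map<String, Long> copy = new TreeMap<>();
    counters.forEach((key, value) -> copy.put(key, value.sum()));
    return copy;
  }

  /** Stand-in for a stream read(): it updates the caller's context itself. */
  static int fakeRead(byte[] buffer) {
    StatsContextSketch context = current();
    context.increment("stream_read_bytes", buffer.length);
    context.increment("stream_read_operations", 1);
    return buffer.length;
  }

  public static void main(String[] args) {
    fakeRead(new byte[128]);
    fakeRead(new byte[64]);
    // The application never passed a collector in; it just reads the context.
    System.out.println(current().snapshot());
  }
}
```

The application code only calls `current().snapshot()`; nothing has to be threaded through method signatures. The map lookup inside `increment()` is itself the kind of per-call cost this issue wants to minimise, so a real implementation would probably cache counter references rather than look them up on every read.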

      • Need to manage performance by minimising inefficient lookups, lock acquisition, etc. on what should be memory-only ops such as read() and write().
      • For duration tracking, cut down on calls to System.currentTimeMillis() so that only one is made per operation.
      • Need to propagate the context into worker threads.
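One way the propagation requirement could be met is to capture the submitting thread's context when a task is created and install it in the worker for the duration of the task. The sketch below assumes that approach; `ContextPropagationSketch`, `Context`, `withCallerContext()` and `runOnWorkers()` are hypothetical names, not part of Hadoop.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of carrying a statistics context into pool threads. */
final class ContextPropagationSketch {

  /** Minimal context: a single lock-free operation counter. */
  static final class Context {
    final LongAdder operations = new LongAdder();
  }

  private static final ThreadLocal<Context> ACTIVE =
      ThreadLocal.withInitial(Context::new);

  static Context current() {
    return ACTIVE.get();
  }

  /** Capture the submitting thread's context and install it in the worker. */
  static Runnable withCallerContext(Runnable task) {
    final Context captured = ACTIVE.get();  // evaluated on the submitting thread
    return () -> {
      Context previous = ACTIVE.get();      // evaluated on the worker thread
      ACTIVE.set(captured);
      try {
        task.run();
      } finally {
        ACTIVE.set(previous);               // pooled threads must not keep it
      }
    };
  }

  /** Run tasks on a pool; return the operation count the caller's context saw. */
  static long runOnWorkers(int tasks) {
    Context caller = current();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.submit(withCallerContext(() -> current().operations.increment()));
      }
      pool.shutdown();
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } finally {
      pool.shutdownNow();
    }
    return caller.operations.sum();
  }

  public static void main(String[] args) {
    System.out.println("operations seen by caller: " + runOnWorkers(4));
  }
}
```

Without the wrapper, each pool thread would update its own lazily created context and the caller would see nothing. Restoring the previous value in the `finally` block matters because pool threads are reused across unrelated tasks.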

      Target uses

      I have a WiP Parquet branch too, to see what can be done there. It shows why the thread context is needed: it's unworkable to build up your own stats snapshot. Even if you collect it for listX and stream reads, it doesn't include FS operations (e.g. rename()), and you need to rework all your methods to pass the stats collector around.

            People

              Assignee: Mehakmeet Singh (mehakmeetSingh)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 6

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 18h