  1. Hadoop Common
  2. HADOOP-16829 Über-jira: S3A Hadoop 3.3.1 features
  3. HADOOP-17318

S3A committer to support concurrent jobs with same app attempt ID & dest dir

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.1
    • Component/s: fs/s3

    Description

      Magic committer block uploads were reported to fail because the pending upload ID was unknown. Likely cause: the upload had been aborted by another job.

      1. Make it possible to turn off cleanup of pending uploads in the magic committer.
      2. Log more about uploads being deleted in the committers.
      3. Add the upload ID to the S3ABlockOutputStream errors.

      There are other concurrency issues on closer inspection; see SPARK-33230:

      • The magic committer uses the app attempt ID as the path under __magic; if two jobs have the same ID, their paths will conflict.
      • The staging committer's local temp dir also uses the app attempt ID.
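
      A minimal sketch of the path conflict described above. It is illustrative only and is not the S3A committer's path-building code: the helper `magicJobDir`, the destination `s3a://bucket/table`, and the ID `app-attempt-0000` are all hypothetical.

```java
import java.util.UUID;

/** Illustrative only: not the S3A committer's real path-building code. */
final class MagicPathSketch {

    /** Hypothetical helper: a job's working directory under the destination's __magic directory. */
    static String magicJobDir(String destDir, String jobId) {
        return destDir + "/__magic/" + jobId;
    }

    public static void main(String[] args) {
        String dest = "s3a://bucket/table";      // hypothetical destination directory
        String attemptId = "app-attempt-0000";   // hypothetical ID shared by two jobs

        // Two jobs keyed by the same app attempt ID resolve to the same directory,
        // so cleanup by one job can remove the other's pending uploads.
        String first = magicJobDir(dest, attemptId);
        String second = magicJobDir(dest, attemptId);
        System.out.println("same attempt ID collides: " + first.equals(second));

        // Keying each job by its own UUID gives each job a separate directory.
        String jobA = magicJobDir(dest, UUID.randomUUID().toString());
        String jobB = magicJobDir(dest, UUID.randomUUID().toString());
        System.out.println("per-job UUID collides: " + jobA.equals(jobB));
    }
}
```

      The same reasoning applies to the staging committer's local temp dir: any directory derived only from the app attempt ID is shared by every job that reports that ID.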

      The fix is to use a job UUID. For Spark, it will be picked up from the SPARK-33230 changes, with an option to self-generate one in job setup for older Spark builds running on Hadoop 3.3.1+. Otherwise the committer falls back to the app attempt ID, unless that fallback has been disabled.

      MR: configure to use the app attempt ID.
      Spark: configure to fail job setup if the app attempt ID is the source of the job UUID.
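
      The resolution order described above could look roughly like the sketch below. This is not the committer's implementation: the configuration keys (`example.*`), the class `JobUuidSketch`, and the method `resolveJobId` are hypothetical names chosen for illustration, and a plain `Map` stands in for the job configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Illustrative only: hypothetical keys and names, not the S3A committer's API. */
final class JobUuidSketch {

    // Hypothetical configuration keys, not real Hadoop or Spark option names.
    static final String SPARK_JOB_UUID = "example.spark.job.uuid";
    static final String GENERATE_UUID = "example.committer.generate.uuid";
    static final String REQUIRE_UUID = "example.committer.require.uuid";

    /** Pick the job ID: supplied UUID, then self-generated UUID, then app attempt ID. */
    static String resolveJobId(Map<String, String> conf, String appAttemptId) {
        // 1. Prefer a UUID passed down by the engine (the SPARK-33230 route).
        String supplied = conf.get(SPARK_JOB_UUID);
        if (supplied != null && !supplied.isEmpty()) {
            return supplied;
        }
        // 2. Optionally self-generate one in job setup (for older Spark builds).
        if (Boolean.parseBoolean(conf.getOrDefault(GENERATE_UUID, "false"))) {
            return UUID.randomUUID().toString();
        }
        // 3. Fail job setup if falling back to the app attempt ID is disabled.
        if (Boolean.parseBoolean(conf.getOrDefault(REQUIRE_UUID, "false"))) {
            throw new IllegalStateException(
                "No job UUID supplied and app attempt ID fallback is disabled");
        }
        // 4. Otherwise fall back to the app attempt ID (the MR setting).
        return appAttemptId;
    }

    public static void main(String[] args) {
        // MR-style configuration: nothing set, so the app attempt ID is used.
        Map<String, String> mrConf = new HashMap<>();
        System.out.println(resolveJobId(mrConf, "attempt_01"));

        // Spark-style configuration: a UUID is required and one is supplied.
        Map<String, String> sparkConf = new HashMap<>();
        sparkConf.put(REQUIRE_UUID, "true");
        sparkConf.put(SPARK_JOB_UUID, UUID.randomUUID().toString());
        System.out.println(resolveJobId(sparkConf, "attempt_01"));
    }
}
```

      In this sketch, the MR setting corresponds to leaving every key unset, and the Spark setting corresponds to enabling the hypothetical require flag so that job setup fails rather than silently reusing an app attempt ID that other jobs may share.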

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 8h 10m