Apache Tez / TEZ-4448

Cannot submit Tez job when DAG size exceeds `ipc.maximum.data.length` and S3A is the filesystem


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2, 0.10.2
    • Fix Version/s: 0.9.3, 0.10.3
    • None

    Description

      Submitting a Tez DAG whose serialized size exceeds ipc.maximum.data.length in an EMR/S3 environment fails with a FileNotFoundException.

      Stacktrace
      Caused by: java.io.FileNotFoundException: No such file or directory: s3://my-s3-bucket/my-job/.staging-a223d0fa-d315-4d84-9594-529c6ba34066/.tez/application_1655350046147_595990/tez-dag.pb1
              at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2310)
              at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2204)
              at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2143)
              at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:888)
              at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:676)
              at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
              ... 22 more
      
      Reproducing
      • Set ipc.maximum.data.length to 64*1024 (64 KiB) instead of the default value of 64*1024*1024 (64 MiB).
      • Attempt to submit a Tez job on EMR with S3 as the filesystem.
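
      To make the sizes in the steps above concrete, here is a minimal Java sketch of the threshold comparison. It is not Tez code: the class name, the `exceedsLimit` helper, and the 200 KiB example plan size are hypothetical, and the strict `>` comparison is an assumption about how the limit is applied. Only the property name and the two limit values come from this issue.

```java
public class IpcLimitSketch {
    // Property name as given in the issue; in a real job it would be set
    // on the Hadoop configuration rather than held as a constant here.
    static final String IPC_MAX_DATA_LENGTH = "ipc.maximum.data.length";

    // Limit values from the reproduction steps, in bytes.
    static final int REPRO_LIMIT = 64 * 1024;          // 65,536 bytes (64 KiB)
    static final int DEFAULT_LIMIT = 64 * 1024 * 1024; // 67,108,864 bytes (64 MiB)

    // Hypothetical helper: true when a serialized DAG is larger than the limit.
    // The strict comparison is an assumption, not confirmed by the issue.
    static boolean exceedsLimit(long serializedDagBytes, int limitBytes) {
        return serializedDagBytes > limitBytes;
    }

    public static void main(String[] args) {
        long exampleDagBytes = 200L * 1024; // hypothetical 200 KiB serialized plan

        System.out.println(IPC_MAX_DATA_LENGTH + "=" + REPRO_LIMIT
                + " exceeded: " + exceedsLimit(exampleDagBytes, REPRO_LIMIT));
        System.out.println(IPC_MAX_DATA_LENGTH + "=" + DEFAULT_LIMIT
                + " exceeded: " + exceedsLimit(exampleDagBytes, DEFAULT_LIMIT));
    }
}
```

      With the lowered limit, even a modest DAG plan crosses the threshold, which is why the reduced value makes the failure easy to trigger.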


          People

            Assignee: Karel Kolman (karel.kolman)
            Reporter: Karel Kolman (karel.kolman)
            Votes: 0
            Watchers: 3


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 0.5h
