  1. Flink
  2. FLINK-5839 Flink Security problem collection
  3. FLINK-6020

Blob Server cannot handle multiple job submissions (with the same content) in parallel

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0, 1.4.0
    • Labels:
      None

      Description

      In yarn-cluster mode, if the same job is submitted multiple times in parallel, the tasks run into class loading problems and lease occupation problems.

      The cause is that the blob server stores each user jar under a name generated from the sha1sum of its contents: it first writes a temp file and then moves it to its final location. For recovery it also puts the jar on HDFS under the same file name.
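
      The storage scheme described above can be sketched roughly as follows. This is a minimal illustration and not Flink's actual BlobServer code: the class ContentAddressedStore, its methods, and the "temp-" prefix are hypothetical names, and only the local file system is covered (no HDFS copy). It shows why two uploads of the same jar end up targeting the same final file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; this is not Flink's BlobServer implementation.
final class ContentAddressedStore {

    /** Returns the hex-encoded SHA-1 of the content, used as the blob's file name. */
    static String sha1Hex(byte[] content) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(content)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Writes the content to a temp file, then moves it to storageDir/<sha1>. */
    static Path put(Path storageDir, byte[] content)
            throws IOException, NoSuchAlgorithmException {
        Path finalFile = storageDir.resolve(sha1Hex(content));
        Path tempFile = Files.createTempFile(storageDir, "temp-", ".blob");
        Files.write(tempFile, content);
        // Identical content always resolves to the same finalFile, so
        // concurrent uploads of one jar all move onto this single path.
        Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
        return finalFile;
    }
}
```

      Because the final name depends only on the content, nothing in this sketch separates two clients that upload the same jar at the same moment.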

      As a result, when multiple clients submit the same job with the same jar at the same time, the local jar files in the blob server and the corresponding files on HDFS are handled by multiple threads (BlobServerConnection) concurrently, and these threads interfere with each other.

      It would be better to have a way to handle this. Two ideas come to mind:
      1. lock the write operation, or
      2. use some unique identifier as the file name instead of (or in addition to) the sha1sum of the file contents.
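
      The two ideas could look roughly like the sketch below. It is only an illustration under stated assumptions: SafeBlobWrites, putLocked, and putUnique are hypothetical names rather than Flink APIs, the lock only coordinates threads inside a single JVM, and the HDFS copy is not handled here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the two proposals; names are illustrative, not Flink APIs.
final class SafeBlobWrites {

    // One lock object per content key, shared by all uploading threads.
    private static final ConcurrentHashMap<String, Object> LOCKS = new ConcurrentHashMap<>();

    /** Idea 1: serialize writers of the same key; later writers reuse the existing file. */
    static Path putLocked(Path storageDir, String key, byte[] content) throws IOException {
        Object lock = LOCKS.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            Path finalFile = storageDir.resolve(key);
            if (!Files.exists(finalFile)) {
                Path tempFile = Files.createTempFile(storageDir, "temp-", ".blob");
                Files.write(tempFile, content);
                Files.move(tempFile, finalFile, StandardCopyOption.ATOMIC_MOVE);
            }
            return finalFile;
        }
    }

    /** Idea 2: append a unique identifier so that no two uploads share a file name. */
    static Path putUnique(Path storageDir, String key, byte[] content) throws IOException {
        Path finalFile = storageDir.resolve(key + "-" + UUID.randomUUID());
        Files.write(finalFile, content);
        return finalFile;
    }
}
```

      The trade-off in this sketch: locking keeps a single copy per jar but makes concurrent submitters wait on each other, and the lock map is never cleaned up; unique names avoid contention but store the same content once per upload.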

              People

              • Assignee:
                till.rohrmann Till Rohrmann
                Reporter:
                WangTao Tao Wang
              • Votes:
                0
                Watchers:
                4