Currently, the JobClient uploads all user code jars to the JobManager whenever a job execution is triggered. Every jar is uploaded regardless of whether it already exists on the JobManager. This is especially painful when one has many small jobs that share the same large user code jars and wants to execute them in quick succession.
In order to avoid unnecessary file transfers, I propose to check before uploading whether a file has already been transferred to the JobManager. A file is uploaded only if it is not yet present there. This should improve the deployment speed of jobs that depend on the same user code jars.
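
A minimal sketch of the proposed check, to make the idea concrete. Everything in it is an assumption for illustration: the class JarUploadSketch, the in-memory InMemoryJarStore standing in for the JobManager's jar storage, and the choice of a SHA-256 content hash as the key for deciding whether a jar "already exists". The proposal itself does not prescribe how existence is determined, and none of these names are existing Flink APIs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class JarUploadSketch {

    /** Hypothetical stand-in for the JobManager's jar storage, keyed by content hash. */
    static class InMemoryJarStore {
        private final Map<String, byte[]> jars = new HashMap<>();

        boolean contains(String key) {
            return jars.containsKey(key);
        }

        void put(String key, byte[] content) {
            jars.put(key, content.clone());
        }
    }

    /** Computes a hex-encoded SHA-256 digest of the jar content (assumed key scheme). */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Uploads the jar only if the store does not already hold identical content.
     * Returns true if a transfer took place, false if it was skipped.
     */
    static boolean uploadIfAbsent(InMemoryJarStore store, byte[] jarContent) {
        String key = sha256Hex(jarContent);
        if (store.contains(key)) {
            return false; // already on the "JobManager", nothing to transfer
        }
        store.put(key, jarContent);
        return true;
    }

    public static void main(String[] args) {
        InMemoryJarStore store = new InMemoryJarStore();
        byte[] jar = "pretend these are jar bytes".getBytes(StandardCharsets.UTF_8);

        System.out.println(uploadIfAbsent(store, jar)); // first job: jar is transferred
        System.out.println(uploadIfAbsent(store, jar)); // second job: transfer is skipped
    }
}
```

In a real client/server setup the existence check would itself be a small request to the JobManager, so only the key travels over the wire when the jar is already present. Keying by content rather than by file name also means a rebuilt jar with the same name is uploaded again instead of being mistaken for the old one.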