The "magic" number of 4 is the default number of job init threads (mapred.jobinit.threads). You have to submit 4 (or precisely mapred.jobinit.threads) or more jobs as the jobtracker user at the same time to make sure the job init thread are initialized as the system user so they can access the mapred.system.dir (for security reasons, it must be 700). Otherwise, some of the job init threads will be initialized as whatever user who first submits a job. This can lead to seemingly more bizarre behavior: some time it works (the job is initialized by one of the system threads) and sometime it doesn't (the job is initialized by one of the user threads). Once you know the root cause, it's pretty trivial to come up with a patch. The default fifo scheduler and capacity scheduler do not have this bug.