"Output directory already exists" error is seen in gridmix when gridmix.output.directory is not defined. When gridmix.output.directory is not defined, then gridmix uses inputDir/gridmix/ as output path for gridmix run. Because gridmix is creating outputPath(in this case, inputDir/gridmix/) at the begining, the output path to generate-data-mapreduce-job(i.e. inputDir) already exists and becomes error from mapreduce.
gridmix has to create this outputPath explicitly in either case (whether the user specifies the path through gridmix.output.directory or gridmix itself falls back to inputDir/gridmix/), even though the output paths of mapreduce jobs are created automatically (like mkdir -p). This is because gridmix needs to set 777 permissions on outputPath so that different users can create the output directories of their own mapreduce jobs within this gridmix run.
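The create-then-open-permissions step described above can be sketched on a local POSIX filesystem as follows. This is only an analogue using java.nio.file, not gridmix's actual code; the class and method names (SharedOutputDirSketch, createWorldWritable, demo) are hypothetical. On HDFS the corresponding calls would presumably be FileSystem.mkdirs followed by FileSystem.setPermission.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class SharedOutputDirSketch {

    // Hypothetical helper: create dir (and any missing parents, like mkdir -p),
    // then set 777 explicitly so the result does not depend on the process umask.
    static Path createWorldWritable(Path dir) {
        try {
            Files.createDirectories(dir);
            Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwxrwxrwx");
            Files.setPosixFilePermissions(dir, perms);
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates <tempdir>/gridmix and returns its permission string.
    static String demo() {
        try {
            Path base = Files.createTempDirectory("gridmix-sketch");
            Path out = createWorldWritable(base.resolve("gridmix"));
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(out));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note the side effect that matters for this bug: creating the directory also creates every missing parent, so the parent exists from that point on.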
The same problem is also seen when gridmix.output.directory is defined as a relative path. In this case too, gridmix tries to create the relative path under ioPath/, which leads to the same issue.
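The cases described in this report can be summarised with a small sketch. It assumes the resolution rule stated above (an undefined or relative gridmix.output.directory ends up under ioPath, an absolute one is used as given) and uses java.nio.file.Path purely for path arithmetic on a POSIX-style path. The class and method names (GridmixOutputPathSketch, resolveOutputDir, forcesIoPathToExist) are hypothetical and are not taken from the gridmix source.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class GridmixOutputPathSketch {

    // Hypothetical helper mirroring the behaviour described in this report.
    // configured == null models gridmix.output.directory not being defined.
    static Path resolveOutputDir(Path ioPath, String configured) {
        if (configured == null || configured.isEmpty()) {
            return ioPath.resolve("gridmix");
        }
        Path p = Paths.get(configured);
        // A relative value is placed under ioPath; an absolute one is used as-is.
        return p.isAbsolute() ? p : ioPath.resolve(p);
    }

    // True when creating outputDir up front would also create ioPath,
    // i.e. when the generate-data job would find its output path already present.
    static boolean forcesIoPathToExist(Path ioPath, Path outputDir) {
        return outputDir.normalize().startsWith(ioPath.normalize());
    }

    public static void main(String[] args) {
        Path ioPath = Paths.get("/user/alice/io"); // example value only
        String[] configuredValues = {null, "out", "/tmp/gridmix-out"};
        for (String configured : configuredValues) {
            Path out = resolveOutputDir(ioPath, configured);
            System.out.println(configured + " -> " + out
                    + " (pre-creates ioPath: " + forcesIoPathToExist(ioPath, out) + ")");
        }
    }
}
```

Under these assumptions, only an absolute output directory outside ioPath avoids pre-creating ioPath.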