Thanks for reviewing.
1. I will add these parameters to the config XML file.
2. hive.auto.convert.join currently defaults to false, so none of the existing test cases will be affected.
3. I am also thinking about putting the backup task directly into the task, which is the simplest way to implement this. My only concern is that serializing and deserializing the task will take more time.
4. I will remove the print statement.
5. The same as point 3.
6. I will fix it; it was caused by an svn synchronization problem.
7. Right now the backup task is generated at execution time, which is why it is not easy to make it work with the explain task. If we put the backup task directly into the task, we can solve this problem, because the backup task would then be set at compile time instead of execution time. The only cost is the task serialization time.
8. We need to reuse the code of MapJoinProcessor, which uses the join tree and row resolver to generate the new map join operator. Each time we generate a new map join operator, we therefore need a deep copy of the join tree and the op context, which is why several classes need to be Serializable.
9. I generated the output for these test cases by first setting hive.auto.convert.join = false and then resetting the flag to true, so that I could compare the two runs and check whether the results are correct.
Since the join results are now correct, I can add explain to the test cases.
10. I will fix the conditional task to make it more generic.
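
To make points 3, 7, and 8 concrete, here is a minimal sketch of the idea, not the actual patch. It assumes a hypothetical SketchTask class standing in for Hive's task class, with a hypothetical backupTask field that is set before the plan is serialized, and a hypothetical deepCopy helper that copies an object graph through a Java serialization round trip. The timing in main only illustrates where the extra de/serialization cost would show up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a Hive task; this is NOT the real Hive Task class.
class SketchTask implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    // Hypothetical field: the fallback task, attached when the plan is built
    // so that it is serialized together with its parent task.
    SketchTask backupTask;

    SketchTask(String name) {
        this.name = name;
    }
}

class BackupTaskSketch {
    // Deep copy through a serialization round trip. Every object reachable
    // from 'original' must be Serializable, otherwise the copy fails.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deep copy failed", e);
        }
    }

    public static void main(String[] args) {
        SketchTask mapJoin = new SketchTask("map-join");
        mapJoin.backupTask = new SketchTask("common-join");

        // Time one round trip to get a rough feel for the serialization cost.
        long start = System.nanoTime();
        SketchTask copy = deepCopy(mapJoin);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;

        System.out.println(copy.name + " -> backup " + copy.backupTask.name);
        System.out.println("independent copy: " + (copy.backupTask != mapJoin.backupTask));
        System.out.println("round trip took about " + elapsedMicros + " us");
    }
}
```

The same round trip is what forces the join tree and op context classes to be Serializable: a single non-serializable field anywhere in the graph makes writeObject throw.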