Hi Min Zhou,
Thank you for sharing those nice articles. I'm very sorry for the late response; I was busy preparing the 0.8.0 release and with my day job. I have also spent the past month studying the background material you listed.
I'd like to discuss some of the questions and suggestions you raised. First of all, I agree with your suggestion to port Sparrow to Tajo. We now need a query scheduler that properly supports multiple users and multiple concurrently running queries. As you mentioned, Sparrow fits these requirements and is also low latency.
You also raised concerns about the way we use YARN, and you proposed two approaches. One of them is to use YARN as the higher-layer resource manager and a Sparrow-like scheduler as the lower-layer scheduler. I'd like to give +1 to this suggestion.
One of the problems you were concerned about was that this approach would cause a radical change. In particular, there are two types of schedulers, and the many dependent modules make this work very hard.
So, I suggest that we comment out TajoYarnResourceManager and all the scheduler code related to YARN. Then we can focus solely on the Sparrow-like scheduler as the standalone scheduler. Later, we can restore the YARN scheduler. I think this approach will make our work much faster.
What do you think about my proposal? Once we reach some agreement, we can discuss how to divide this work into several stages. I'm looking forward to your response.