1) Do you need to increase the VERSION number in HiveServer?
Good point. I've changed it in the next patch.
2) Is it better to put setupSessionIO() in execute()? If it is already there, should we remove the one in the constructor? And clean up the Driver at the end of execute()?
Session IO cannot be cleaned up at the end of execute(). The data is copied back to the client by the fetch* functions, so the client has to do the cleanup. Also, session IO is better set up in the constructors, because out and err can be used by any function (not only execute()). The execute() function only does cleanup work.
3) The len and pos local variables in cleanTmpFile are not used.
4) Maybe not related to this JIRA: the SessionState in Hive is a thread-local object. Is it guaranteed that the HiveServerHandler is also thread-local, so that there is a 1-1 match?
HiveServer constructs a new HiveServerHandler for each worker thread, so each remote CLI connection gets its own HiveServerHandler, which in turn creates a thread-local SessionState. I've manually tested 100 parallel runs of the remote CLI and they all ran fine.
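To illustrate the 1-1 match described in the answer to 4), here is a minimal, self-contained sketch. It is not the actual Hive code: Session and Handler are hypothetical stand-ins for SessionState and HiveServerHandler, and it assumes that each handler is constructed on the worker thread that will use it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalSessionSketch {

    // Hypothetical stand-in for SessionState: each instance gets a unique id.
    static final class Session {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
    }

    // One slot per thread; a value set by one thread is invisible to the others.
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    // Hypothetical stand-in for HiveServerHandler.
    static final class Handler {
        Handler() {
            // Assumption: the constructor runs on the worker thread,
            // so the session is bound to that thread.
            CURRENT.set(new Session());
        }

        int sessionId() {
            // Any method called on the same thread sees the same session.
            return CURRENT.get().id;
        }
    }

    // Starts `workers` threads, each creating its own Handler, and
    // returns the number of distinct sessions that were observed.
    static int distinctSessions(int workers) throws InterruptedException {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                Handler handler = new Handler();
                seen.add(handler.sessionId());
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(distinctSessions(100));
    }
}
```

With one handler per thread, every thread ends up with its own session, which is the property the 100 parallel runs exercised. The sketch would not hold if a handler were built on one thread and used on another, so that assumption is worth verifying against the server's threading model.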