To support multiple data centers (each with its own DFS and MR cluster) in Hive, it is desirable to extend the Hive database concept to be data center aware.
Currently a Hive database is a logical concept and has no DFS or MR cluster info associated with it. A database has a location property indicating the default warehouse directory, but the user cannot specify or change it. To make a database data center aware, the following info needs to be maintained:
1) data warehouse root location, which is the default HDFS location for newly created tables (default=hive.metastore.warehouse.dir)
2) scratch dir, which is the HDFS location where MR intermediate files are created (default=hive.exec.scratchdir)
3) MR job tracker URI that jobs should be submitted to (default=mapred.job.tracker)
4) Hadoop (bin) dir (default=$HADOOP_HOME/bin/hadoop)
These parameters should be saved in database.parameters as (key, value) pairs, and they override the jobconf parameters. If a database, such as the default database, has no such parameter, the value is taken from hive-default.xml or hive-site.xml, as it is now.
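The override rule described above could look roughly like the sketch below. It is only an illustration of the proposed lookup order, not Hive code: the function name resolve_setting, the dictionaries, and the host names are hypothetical, and the jobconf is modeled as a plain dictionary standing in for values loaded from hive-default.xml or hive-site.xml.

```python
def resolve_setting(key, db_params, job_conf):
    """Return the database-level value for key if present, else the jobconf value.

    db_params stands in for database.parameters; job_conf stands in for the
    settings loaded from hive-default.xml / hive-site.xml (both hypothetical).
    """
    if key in db_params:
        return db_params[key]
    return job_conf.get(key)


# Site-wide setting, as it might come from hive-site.xml (made-up host name).
job_conf = {"mapred.job.tracker": "jt-default:8021"}

# A database bound to a second data center overrides the job tracker URI.
dc2_params = {"mapred.job.tracker": "jt-dc2:8021"}

# The default database carries no parameters, so it falls back to the jobconf.
default_params = {}

print(resolve_setting("mapred.job.tracker", dc2_params, job_conf))
print(resolve_setting("mapred.job.tracker", default_params, job_conf))
```

With this lookup order, existing databases that carry no parameters keep their current behavior, and only databases that explicitly set a key are redirected to another cluster.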