Hive / HIVE-1820

Make Hive database data center aware

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: Metastore
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      To support multiple data centers (each with its own DFS and MR cluster) in Hive, the Hive database should be extended to be data center aware.

      Currently a Hive database is a logical concept with no DFS or MR cluster information associated with it. A database has a location property indicating its default warehouse directory, but users cannot specify or change it. To make a database data center aware, the following information needs to be maintained:

      1) the data warehouse root location, which is the default HDFS location for newly created tables (default=hive.metastore.warehouse.dir)
      2) the scratch dir, which is the HDFS location where MR intermediate files are created (default=hive.exec.scratch.dir)
      3) the MR job tracker URI that jobs should be submitted to (default=mapred.job.tracker)
      4) the hadoop (bin) dir ($HADOOP_HOME/bin/hadoop)

      These parameters should be saved in database.parameters as (key, value) pairs, and they override the jobconf parameters. If a database (for example, the default database) has no such parameter set, the value is taken from hive-default.xml or hive-site.xml, as it is now.
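
      The override rule described here can be sketched as a two-level lookup: use the value in the database's parameters if present, otherwise fall back to the jobconf value. This is only an illustration of the proposed precedence, not the code in the attached patch. The class DatabaseConfResolver, the method resolve, and the host names are hypothetical; only the property keys come from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Hive: illustrates the proposed precedence. */
final class DatabaseConfResolver {

    /**
     * Returns the database-level value for the key if one is set,
     * otherwise the jobconf value (null if neither defines the key).
     */
    static String resolve(Map<String, String> dbParams,
                          Map<String, String> jobConf,
                          String key) {
        String dbValue = dbParams.get(key);
        return dbValue != null ? dbValue : jobConf.get(key);
    }

    public static void main(String[] args) {
        // Stand-in for values loaded from hive-default.xml / hive-site.xml.
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("hive.exec.scratch.dir", "/tmp/hive");
        jobConf.put("mapred.job.tracker", "jt-default:8021");

        // Stand-in for database.parameters of a database in another data center.
        Map<String, String> dbParams = new HashMap<>();
        dbParams.put("mapred.job.tracker", "jt-dc2:8021");

        // Database-level setting wins: prints "jt-dc2:8021".
        System.out.println(resolve(dbParams, jobConf, "mapred.job.tracker"));
        // No database-level setting, so the jobconf value is used: prints "/tmp/hive".
        System.out.println(resolve(dbParams, jobConf, "hive.exec.scratch.dir"));
    }
}
```

      With an empty parameter map, every lookup falls through to the jobconf, which matches the current behavior for the default database.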

      Attachments

        1. HIVE-1820.patch
          2 kB
          Ning Zhang

          People

            Assignee: Ning Zhang (nzhang)
            Reporter: Ning Zhang (nzhang)
            Votes: 0
            Watchers: 3
