- Server Installation from Source
-
- Prerequisites
-
- - machine to build the installation tar on
- - machine on which the server can be installed — this should have
- access to the Hadoop cluster in question, and be accessible from
- the machines you launch jobs from
- - an RDBMS — we recommend MySQL and provide instructions for it
- - Hadoop cluster
- - Unix user that the server will run as, and, if you are running your
- cluster in secure mode, an associated Kerberos service principal and keytabs.
- - Apache Ant 1.8 or greater. Version 1.7.x is not supported.
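-
- The Ant version requirement above can be checked before starting a build. The
- following is a minimal sketch, assuming a POSIX shell and assuming that
- ant -version prints a banner containing the word "version" followed by a dotted
- version number. The function names are hypothetical and are not part of
- HCatalog or Ant.
-
```shell
#!/bin/sh
# Hypothetical helpers, not part of HCatalog or Ant.

# Return 0 if a dotted version string is 1.8 or greater, non-zero otherwise.
# Only the major and minor components are compared.
ant_version_ok() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 8 ]; }
}

# Extract the dotted version number from a banner line.
# Assumed banner shape: "... version 1.8.2 compiled on ...".
ant_version_from_banner() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}
```
-
- One possible use is
- ant_version_ok "$(ant_version_from_banner "$(ant_home/bin/ant -version)")",
- which succeeds only when the installed Ant is new enough.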
-
-
- Throughout these instructions, a word in italics marks a placeholder that
- you should replace with a locally appropriate value, such as a hostname
- or password.
-
- Building a tarball
-
- If you downloaded HCatalog from Apache or another site as a source release,
- you will need to first build a tarball to install. You can tell if you have
- a source release by looking at the name of the object you downloaded. If
- it is named hcatalog-src-0.5.0-incubating.tar.gz (notice the
- src in the name) then you have a source release.
-
- If you do not already have Apache Ant installed on your machine, you
- will need to obtain it. You can get it from the
- Apache Ant website. Once you download it, you will need to unpack it
- somewhere on your machine. The directory where you unpack it will be referred
- to as ant_home in this document.
-
- To produce a binary tarball from the downloaded source tarball, execute the following steps:
- tar xzf hcatalog-src-0.5.0-incubating.tar.gz
- cd hcatalog-src-0.5.0-incubating
- ant_home/bin/ant package
- The tarball for installation should now be at
- build/hcatalog-0.5.0-incubating.tar.gz
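-
- The naming rule described above (a source release has src in its name) can be
- expressed as a small check. This is a sketch only, assuming a POSIX shell; the
- function names are hypothetical, and the mapping from the source name to the
- binary name is inferred from the single example in this section.
-
```shell
#!/bin/sh
# Hypothetical helpers, not part of HCatalog.

# Return 0 if the file name looks like a source release, that is, it contains
# "-src-" as in hcatalog-src-0.5.0-incubating.tar.gz.
is_source_release() {
    case "$(basename "$1")" in
        *-src-*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print the binary tarball name expected under build/ for a source tarball,
# assuming the only difference between the two names is the "-src" part.
binary_tarball_name() {
    basename "$1" | sed 's/-src-/-/'
}
```
-
- For example, a wrapper script could call is_source_release on the downloaded
- file and run the three build commands only when it succeeds.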
-
- Database Setup
-
- If you do not already have Hive installed with MySQL, the following will
- walk you through how to do so. If you have already set this up, you can skip
- this step.
-
- Select a machine to install the database on. This need not be the same
- machine as the Thrift server, which we will set up later. For large
- clusters we recommend that they not be the same machine. For the
- purposes of these instructions we will refer to this machine as
- hivedb.acme.com.
-
- Install MySQL server on hivedb.acme.com. You can obtain
- packages for MySQL from MySQL's
- download site. We have developed and tested with versions 5.1.46
- and 5.1.48. We suggest you use these versions or later.
- Once you have MySQL up and running, use the mysql command line
- tool to add the hive user and hivemetastoredb
- database. You will need to pick a password for your hive
- user, and replace dbpassword in the following commands with it.
-
- mysql -u root
- mysql> CREATE USER 'hive'@'hivedb.acme.com' IDENTIFIED BY 'dbpassword';
- mysql> CREATE DATABASE hivemetastoredb DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;
- mysql> GRANT ALL PRIVILEGES ON hivemetastoredb.* TO 'hive'@'hivedb.acme.com' WITH GRANT OPTION;
- mysql> flush privileges;
- mysql> quit;
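-
- If you prefer to run the statements above non-interactively, they can be
- generated with the host name and password filled in. The following is a
- sketch, assuming a POSIX shell; the function name is hypothetical, and no
- escaping is attempted, so the password must not contain a single quote.
-
```shell
#!/bin/sh
# Hypothetical helper, not part of Hive or HCatalog.

# Print the database setup statements from this section.
# Arguments: $1 = database host name, $2 = password for the hive user.
# No quoting or escaping is done on the arguments.
hive_db_setup_sql() {
    cat <<EOF
CREATE USER 'hive'@'$1' IDENTIFIED BY '$2';
CREATE DATABASE hivemetastoredb DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;
GRANT ALL PRIVILEGES ON hivemetastoredb.* TO 'hive'@'$1' WITH GRANT OPTION;
FLUSH PRIVILEGES;
EOF
}
```
-
- The output can then be piped to the client, for example
- hive_db_setup_sql hivedb.acme.com dbpassword | mysql -u root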
-
- Use the database installation script found in the Hive package to create the
- database. hive_home in the line below refers to the directory
- where you have installed Hive. If you are using Hive rpms, then this will
- be /usr/lib/hive.
-
- mysql -u hive -D hivemetastoredb -hhivedb.acme.com -p < hive_home/scripts/metastore/upgrade/mysql/hive-schema-0.10.0.mysql.sql
-
- Thrift Server Setup
-
- If you do not already have Hive running a metastore server using Thrift,
- you can use the following instructions to set up and run one. You may skip
- this step if you are already using a Hive metastore server.
-
- Select a machine to install your Thrift server on. For smaller and test
- installations this can be the same machine as the database. For the
- purposes of these instructions we will refer to this machine as
- hcatsvr.acme.com.
-
- If you have not already done so, install Hive (for example, version 0.10.0) on this machine. You
- can use the
- binary distributions
- provided by Hive or rpms available from
- Apache Bigtop. If you use
- the Apache Hive binary distribution, select a directory, henceforth
- referred to as hive_home, and untar the distribution there.
- If you use the rpms, hive_home will be
- /usr/lib/hive.
-
- Install the MySQL Java connector libraries on hcatsvr.acme.com.
- You can obtain these from
- MySQL's
- download site.
-
- Select a user to run the Thrift server as. This user should not be a
- human user, and must be able to act as a proxy for other users. We suggest
- the name "hive" for the user. Throughout the rest of this documentation
- we will refer to this user as hive. If necessary, add the user to
- hcatsvr.acme.com.
-
- Select a root directory for your installation of HCatalog. This
- directory must be owned by the hive user. We recommend
- /usr/local/hive. If necessary, create the directory. You will
- need to be the hive user for the operations described in the remainder
- of this Thrift Server Setup section.
-
- Copy the HCatalog installation tarball into a temporary directory, and untar
- it. Then change directories into the new distribution and run the HCatalog
- server installation script. You will need the directory you chose as root,
- the directory where you installed the MySQL Java connector libraries
- (referred to in the command below as dbroot), your hadoop_home (the
- directory where you have Hadoop installed), and the port number you want
- HCatalog to listen on (portnum in the command below).
-
- tar zxf hcatalog-0.5.0-incubating.tar.gz
- cd hcatalog-0.5.0-incubating
- share/hcatalog/scripts/hcat_server_install.sh -r root -d dbroot -h hadoop_home -p portnum
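-
- Before running the installation script it can help to validate the four
- values and print the resulting command line for review. The sketch below
- assumes a POSIX shell; the function names are hypothetical, and only the
- -r, -d, -h and -p options shown above are used.
-
```shell
#!/bin/sh
# Hypothetical helpers, not part of HCatalog.

# Return 0 if the argument is a valid TCP port number (1-65535).
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Print (but do not run) the install command for the given values.
# Arguments: $1 = root, $2 = dbroot, $3 = hadoop_home, $4 = portnum.
hcat_install_command() {
    valid_port "$4" || { echo "invalid port: $4" >&2; return 1; }
    echo "share/hcatalog/scripts/hcat_server_install.sh -r $1 -d $2 -h $3 -p $4"
}
```
-
- Printing the command rather than executing it lets you confirm the values
- as the hive user before anything is installed.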
-
- Now you need to edit your hive_home/conf/hive-site.xml file.
- If there is no such file in the Hive conf directory, copy
- hcat_home/etc/hcatalog/proto-hive-site.xml to hive_home/conf/hive-site.xml.
- Open this file in your favorite text editor. The following table shows the
- values you need to configure.
-
-
-
- | Parameter | Value to Set it to |
- | hive.metastore.local | false |
- | javax.jdo.option.ConnectionURL | jdbc:mysql://hostname/hivemetastoredb?createDatabaseIfNotExist=true where hostname is the name of the machine you installed MySQL on. |
- | javax.jdo.option.ConnectionDriverName | com.mysql.jdbc.Driver |
- | javax.jdo.option.ConnectionUserName | hive |
- | javax.jdo.option.ConnectionPassword | dbpassword value you used in setting up the MySQL server above. |
- | hive.semantic.analyzer.factory.impl | org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory |
- | hive.metastore.warehouse.dir | The directory can be a URI or an absolute file path. If it is an absolute file path, it will be resolved to a URI by the metastore: if default hdfs was specified in core-site.xml, path resolves to HDFS location; otherwise, path is resolved as local file: URI. This setting becomes effective when creating new tables (it takes precedence over default DBS.DB_LOCATION_URI at the time of table creation). You only need to set this if you have not yet configured Hive to run on your system. |
- | hive.metastore.uris | thrift://hostname:portnum where hostname is the name of the machine hosting the Thrift server, and portnum is the port number used above in the installation script. |
- | hive.metastore.execute.setugi | true |
- | hive.metastore.sasl.enabled | Set to true if you are using Kerberos security with your Hadoop cluster, false otherwise. |
- | hive.metastore.kerberos.keytab.file | The path to the Kerberos keytab file containing the metastore Thrift server's service principal. Only required if you set hive.metastore.sasl.enabled above to true. |
- | hive.metastore.kerberos.principal | The service principal for the metastore Thrift server. You can reference your host as _HOST and it will be replaced with your actual hostname. Only required if you set hive.metastore.sasl.enabled above to true. |
-
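- As an illustration of how the table maps onto hive-site.xml, the sketch
- below prints the settings for a cluster without Kerberos, using the standard
- property/name/value layout of Hadoop configuration files. It assumes a POSIX
- shell; the function names are hypothetical, values are written without XML
- escaping, and hive.metastore.warehouse.dir is left out because it is only
- needed when Hive has not been configured yet.
-
```shell
#!/bin/sh
# Hypothetical helpers, not part of Hive or HCatalog.

# Print one <property> element. Values are written as-is, so they must not
# contain the characters &, < or >.
hive_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print the non-Kerberos settings from the table.
# Arguments: $1 = MySQL host, $2 = dbpassword, $3 = Thrift server host, $4 = portnum.
hcat_hive_site() {
    echo '<configuration>'
    hive_property hive.metastore.local false
    hive_property javax.jdo.option.ConnectionURL "jdbc:mysql://$1/hivemetastoredb?createDatabaseIfNotExist=true"
    hive_property javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
    hive_property javax.jdo.option.ConnectionUserName hive
    hive_property javax.jdo.option.ConnectionPassword "$2"
    hive_property hive.semantic.analyzer.factory.impl org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory
    hive_property hive.metastore.uris "thrift://$3:$4"
    hive_property hive.metastore.execute.setugi true
    hive_property hive.metastore.sasl.enabled false
    echo '</configuration>'
}
```
-
- Writing the output to a scratch file and merging it into an existing
- hive-site.xml by hand avoids overwriting settings you already rely on.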
-
-
- You can now proceed to starting the server.
-
-
-