Create a new HCatalog table like an existing one
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:existingtable/like/:newtable

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :existingtable | The existing table name | Required | None |
| :newtable | The new table name | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists | Optional | false |
| location | The HDFS path | Optional | None |

| Name | Description |
|---|---|
| table | The new table name |
| database | The database name |
Curl Command
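The example command is missing here; the following is a minimal sketch. The server, user, and table names are placeholders, and the empty JSON body (accepting all defaults) is an assumption. The command is echoed rather than executed so the sketch stays self-contained.

```shell
# Sketch only: server, user.name, and table names are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/test_table/like/test_table_2?user.name=ctdean"
# PUT creates the new table; an empty JSON body accepts the defaults.
echo curl -s -X PUT -H 'Content-type: application/json' -d '{}' "$URL"
```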
JSON Output

Returns a list of the response types supported by Templeton.

http://www.myserver.com/templeton/:version

| Name | Description | Required? | Default |
|---|---|---|---|
| :version | The Templeton version number. (Currently this must be "v1") | Required | None |

| Name | Description |
|---|---|
| responseTypes | A list of all supported response types |
Curl Command
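A minimal sketch of the request, with the server name as a placeholder:

```shell
# Sketch only: the server name is hypothetical; the command is echoed, not run.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1"
echo curl -s "$URL"
```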
JSON Output

JSON Output (error)

Check the status of a job and get related job information given its job ID. Substitute ":jobid" with the job ID received when the job was created.

http://www.myserver.com/templeton/v1/queue/:jobid

| Name | Description | Required? | Default |
|---|---|---|---|
| :jobid | The job ID to check. This is the ID received when the job was created. | Required | None |

| Name | Description |
|---|---|
| status | A JSON object containing the job status information. See the Hadoop documentation (Class JobStatus) for more information. |
| profile | A JSON object containing the job profile information. See the Hadoop documentation (Class JobProfile) for more information. |
| id | The job ID. |
| parentId | The parent job ID. |
| percentComplete | The job completion percentage, for example "75% complete". |
| exitValue | The job's exit value. |
| user | User name of the job creator. |
| callback | The callback URL, if any. |
| completed | A string representing completed status, for example "done". |
Curl Command
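A minimal sketch of the request. The server, user, and job ID are placeholders; the command is echoed rather than executed.

```shell
# Sketch only: the job ID below is hypothetical.
SERVER="http://www.myserver.com"
JOBID="job_201112212038_0003"
URL="$SERVER/templeton/v1/queue/$JOBID?user.name=ctdean"
echo curl -s "$URL"
```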
JSON Output

List all the partitions in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

| Name | Description |
|---|---|
| partitions | A list of partition names and partition values |
| database | The database name |
| table | The table name |
Curl Command
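A minimal sketch of the request, with placeholder server, user, and table names; the command is echoed rather than executed.

```shell
# Sketch only: database and table names are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/my_table/partition?user.name=ctdean"
echo curl -s "$URL"
```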
JSON Output

Runs a Hive query or set of commands.

http://www.myserver.com/templeton/v1/hive

| Name | Description | Required? | Default |
|---|---|---|---|
| execute | String containing an entire, short Hive program to run. | One of either "execute" or "file" is required | None |
| file | HDFS file name of a Hive program to run. | One of either "execute" or "file" is required | None |
| define | Set a Hive configuration variable using the syntax define=NAME=VALUE. | Optional | None |
| statusdir | A directory where Templeton will write the status of the Hive job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |

| Name | Description |
|---|---|
| id | A string containing the job ID, similar to "job_201110132141_0001". |
| info | A JSON object containing the information returned when the job was queued. See the Hadoop documentation (Class TaskController) for more information. |
Curl Command
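A minimal sketch of the request. The query, statusdir, and user are invented for illustration; `+` inside the query stands for a URL-encoded space. The command is echoed rather than executed.

```shell
# Sketch only: the query and statusdir are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/hive"
# POST the program as form data; statusdir collects stdout/stderr in HDFS.
echo curl -s -d user.name=ctdean \
     -d execute='select+*+from+pokes;' \
     -d statusdir=pokes.output \
     "$URL"
```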
JSON Output

Results

Delete (drop) a partition in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name='value' list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |
| ifExists | Hive returns an error if the partition specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

| Name | Description |
|---|---|
| partition | The partition name |
| table | The table name |
| database | The database name |
Curl Command
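A minimal sketch of the request, echoed rather than executed. Names are placeholders; note the URL-encoded quotes (%27) around the partition value, as the parameter table requires.

```shell
# Sketch only: database, table, and partition values are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/my_table/partition/country=%27algeria%27?user.name=ctdean"
echo curl -s -X DELETE "$URL"
```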
JSON Output

Add a single property on an HCatalog table. This will also reset an existing property.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| value | The property value | Required | None |

| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| property | The property name |
Curl Command
JSON Output

Performs an HCatalog DDL command. The command is executed immediately upon request. Responses are limited to 1MB. For requests which may return longer results, consider using the Hive resource as an alternative.

http://www.myserver.com/templeton/v1/ddl

| Name | Description | Required? | Default |
|---|---|---|---|
| exec | The HCatalog DDL string to execute | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table. The format is "rwxrw-r-x". | Optional | None |

| Name | Description |
|---|---|
| stdout | A string containing the output HCatalog sent to standard out (possibly empty). |
| stderr | A string containing the output HCatalog sent to standard error (possibly empty). |
| exitcode | The exit code HCatalog returned. |
Curl Command
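A minimal sketch of the request, echoed rather than executed; the "show tables" command and user are invented for illustration.

```shell
# Sketch only: server and user are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl?user.name=ctdean"
# POST the DDL string in the "exec" form field.
echo curl -s -d 'exec=show tables;' "$URL"
```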
JSON Output

JSON Output (error)

List the tables in an HCatalog database.

http://www.myserver.com/templeton/v1/ddl/database/:db/table

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| like | List only tables whose names match the specified pattern | Optional | "*" (List all tables) |

| Name | Description |
|---|---|
| tables | A list of table names |
| database | The database name |
Curl Command
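A minimal sketch of the request, echoed rather than executed; the database name and user are placeholders.

```shell
# Sketch only: add a "like" query parameter to narrow the listing.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table?user.name=ctdean"
echo curl -s "$URL"
```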
JSON Output

JSON Output (error)

Describe a single partition in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name='value' list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |

| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| partition | The partition name |
| partitioned | True if the table is partitioned |
| location | Location of the table |
| outputFormat | Output format |
| columns | List of column names, types, and comments |
| owner | The owner's user name |
| partitionColumns | List of the partition columns |
| inputFormat | Input format |
Curl Command
JSON Output

Delete a database.

http://www.myserver.com/templeton/v1/ddl/database/:db

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| ifExists | Hive returns an error if the database specified does not exist, unless ifExists is set to true. | Optional | false |
| option | Parameter set to either "restrict" or "cascade". Restrict will remove the schema if all the tables are empty. Cascade removes everything, including data and definitions. | Optional | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

| Name | Description |
|---|---|
| database | The database name |
Curl Command
JSON Output

JSON Output (error)

Create a database.

http://www.myserver.com/templeton/v1/ddl/database/:db

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The database location | Optional | None |
| comment | A comment for the database, like a description | Optional | None |
| properties | The database properties | Optional | None |

| Name | Description |
|---|---|
| database | The database name |
Curl Command
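A minimal sketch of the request, echoed rather than executed. The database name, comment, and location are invented, and carrying the optional parameters in a JSON body (rather than the query string) is an assumption here.

```shell
# Sketch only: names, comment, and location are hypothetical; the JSON body
# format is an assumption.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/newdb?user.name=ctdean"
echo curl -s -X PUT -H 'Content-type: application/json' \
     -d '{"comment":"Hello there","location":"hdfs:///user/hive/my_warehouse"}' \
     "$URL"
```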
JSON Output

Delete (drop) an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| ifExists | Hive 0.70 and later returns an error if the table specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

| Name | Description |
|---|---|
| table | The table name |
| database | The database name |
Curl Command
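A minimal sketch of the request, echoed rather than executed; server, user, and table names are placeholders.

```shell
# Sketch only: database and table names are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/test_table?user.name=ctdean"
echo curl -s -X DELETE "$URL"
```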
JSON Output

| Resource | Description |
|---|---|
| :version | Returns a list of supported response types. |
| status | Returns the Templeton server status. |
| version | Returns a list of supported versions and the current version. |
| ddl | Performs an HCatalog DDL command. |
| ddl/database | List HCatalog databases. |
| ddl/database/:db (GET) | Describe an HCatalog database. |
| ddl/database/:db (PUT) | Create an HCatalog database. |
| ddl/database/:db (DELETE) | Delete (drop) an HCatalog database. |
| ddl/database/:db/table | List the tables in an HCatalog database. |
| ddl/database/:db/table/:table (GET) | Describe an HCatalog table. |
| ddl/database/:db/table/:table (PUT) | Create a new HCatalog table. |
| ddl/database/:db/table/:table (POST) | Rename an HCatalog table. |
| ddl/database/:db/table/:table (DELETE) | Delete (drop) an HCatalog table. |
| ddl/database/:db/table/:existingtable/like/:newtable (PUT) | Create a new HCatalog table like an existing one. |
| ddl/database/:db/table/:table/partition | List all partitions in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (GET) | Describe a single partition in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (PUT) | Create a partition in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (DELETE) | Delete (drop) a partition in an HCatalog table. |
| ddl/database/:db/table/:table/column | List the columns in an HCatalog table. |
| ddl/database/:db/table/:table/column/:column (GET) | Describe a single column in an HCatalog table. |
| ddl/database/:db/table/:table/column/:column (PUT) | Create a column in an HCatalog table. |
| ddl/database/:db/table/:table/property (GET) | List table properties. |
| ddl/database/:db/table/:table/property/:property (GET) | Return the value of a single table property. |
| ddl/database/:db/table/:table/property/:property (PUT) | Set a table property. |
| mapreduce/streaming | Creates and queues Hadoop streaming MapReduce jobs. |
| mapreduce/jar | Creates and queues standard Hadoop MapReduce jobs. |
| pig | Creates and queues Pig jobs. |
| hive | Runs Hive queries and commands. |
| queue | Returns a list of all job IDs registered for the user. |
| queue/:jobid (GET) | Returns the status of a job given its ID. |
| queue/:jobid (DELETE) | Kill a job given its ID. |
Create and queue a Hadoop streaming MapReduce job.

http://www.myserver.com/templeton/v1/mapreduce/streaming

| Name | Description | Required? | Default |
|---|---|---|---|
| input | Location of the input data in Hadoop. | Required | None |
| output | Location in which to store the output data. If not specified, Templeton will store the output in a location that can be discovered using the queue resource. | Optional | See description |
| mapper | Location of the mapper program in Hadoop. | Required | None |
| reducer | Location of the reducer program in Hadoop. | Required | None |
| file | Add an HDFS file to the distributed cache. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE | Optional | None |
| cmdenv | Set an environment variable using the syntax cmdenv=NAME=VALUE | Optional | None |
| arg | Set a program argument. | Optional | None |
| statusdir | A directory where Templeton will write the status of the MapReduce job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |

| Name | Description |
|---|---|
| id | A string containing the job ID, similar to "job_201110132141_0001". |
| info | A JSON object containing the information returned when the job was queued. See the Hadoop documentation (Class TaskController) for more information. |
Code and Data Setup
Curl Command
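A minimal sketch of the request, echoed rather than executed. The input/output paths and the cat/wc programs are invented for illustration.

```shell
# Sketch only: paths, programs, and user are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/mapreduce/streaming"
echo curl -s -d user.name=ctdean \
     -d input=mydata -d output=mycounts \
     -d mapper=/bin/cat -d reducer='/usr/bin/wc -w' \
     "$URL"
```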
JSON Output

Results

Rename an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The existing (old) table name | Required | None |
| rename | The new table name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

| Name | Description |
|---|---|
| table | The new table name |
| database | The database name |
Curl Command
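A minimal sketch of the request, echoed rather than executed; table names and user are placeholders.

```shell
# Sketch only: POST with the "rename" parameter gives the table its new name.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/test_table?user.name=ctdean"
echo curl -s -X POST -d rename=test_table_2 "$URL"
```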
JSON Output

JSON Output (error)

Return a list of all job IDs registered to the user.

http://www.myserver.com/templeton/v1/queue

Only the standard parameters are accepted.

| Name | Description |
|---|---|
| ids | A list of all job IDs registered to the user. |
Curl Command
JSON Output

Create a column in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| type | The type of column to add, like "string" or "int" | Required | None |
| comment | The column comment, like a description | Optional | None |

| Name | Description |
|---|---|
| column | The column name |
| table | The table name |
| database | The database name |
Curl Command
JSON Output

| Resource | Description |
|---|---|
| ddl | Performs an HCatalog DDL command. |
| ddl/database | List HCatalog databases. |
| ddl/database/:db (GET) | Describe an HCatalog database. |
| ddl/database/:db (PUT) | Create an HCatalog database. |
| ddl/database/:db (DELETE) | Delete (drop) an HCatalog database. |
| ddl/database/:db/table | List the tables in an HCatalog database. |
| ddl/database/:db/table/:table (GET) | Describe an HCatalog table. |
| ddl/database/:db/table/:table (PUT) | Create a new HCatalog table. |
| ddl/database/:db/table/:table (POST) | Rename an HCatalog table. |
| ddl/database/:db/table/:table (DELETE) | Delete (drop) an HCatalog table. |
| ddl/database/:db/table/:existingtable/like/:newtable (PUT) | Create a new HCatalog table like an existing one. |
| ddl/database/:db/table/:table/partition | List all partitions in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (GET) | Describe a single partition in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (PUT) | Create a partition in an HCatalog table. |
| ddl/database/:db/table/:table/partition/:partition (DELETE) | Delete (drop) a partition in an HCatalog table. |
| ddl/database/:db/table/:table/column | List the columns in an HCatalog table. |
| ddl/database/:db/table/:table/column/:column (GET) | Describe a single column in an HCatalog table. |
| ddl/database/:db/table/:table/column/:column (PUT) | Create a column in an HCatalog table. |
| ddl/database/:db/table/:table/property (GET) | List table properties. |
| ddl/database/:db/table/:table/property/:property (GET) | Return the value of a single table property. |
| ddl/database/:db/table/:table/property/:property (PUT) | Set a table property. |
Return the value of a single table property.
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |

| Name | Description |
|---|---|
| property | The requested property's name: value pair |
| database | The database name |
| table | The table name |
Curl Command
JSON Output

JSON Output (error)

Returns the current status of the Templeton server. Useful for heartbeat monitoring.

http://www.myserver.com/templeton/v1/status

Only the standard parameters are accepted.

| Name | Description |
|---|---|
| status | "ok" if the Templeton server was contacted. |
| version | String containing the version number, similar to "v1". |
Curl Command
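A minimal sketch of the request, echoed rather than executed; the server name is a placeholder.

```shell
# Sketch only: a simple GET suitable for heartbeat monitoring.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/status"
echo curl -s "$URL"
```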
JSON Output

Describe a database. (Note: this resource has a "format=extended" parameter; however, the output structure does not change if it is used.)

http://www.myserver.com/templeton/v1/ddl/database/:db

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |

| Name | Description |
|---|---|
| location | The database location |
| params | The database parameters |
| comment | The database comment |
| database | The database name |
Curl Command
JSON Output

JSON Output (error)

Describe an HCatalog table. Normally returns a simple list of columns (using "desc table"), but the extended format will show more information (using "show table extended like").

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table?format=extended

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| format | Set "format=extended" to see additional information (using "show table extended like") | Optional | Not extended |

| Name | Description |
|---|---|
| columns | A list of column names and types |
| database | The database name |
| table | The table name |
| partitioned (extended only) | True if the table is partitioned |
| location (extended only) | Location of the table |
| outputFormat (extended only) | Output format |
| owner (extended only) | The owner's user name |
| partitionColumns (extended only) | List of the partition columns |
| inputFormat (extended only) | Input format |
Curl Command (simple)
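A minimal sketch of the simple request, echoed rather than executed; names are placeholders. Appending `format=extended` to the query string gives the extended form.

```shell
# Sketch only: database and table names are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/my_table?user.name=ctdean"
echo curl -s "$URL"
```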
JSON Output (simple)

Curl Command (extended)

JSON Output (extended)

JSON Output (error)

List all the properties of an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

| Name | Description |
|---|---|
| properties | A list of the table's properties in name: value pairs |
| database | The database name |
| table | The table name |
Curl Command
JSON Output

To install and start the Templeton server:

- Set the TEMPLETON_HOME environment variable to the base of the HCatalog REST server installation. This will usually be HCATALOG_HOME/webhcat. This is used to find the Templeton configuration.
- Review templeton-site.xml as required. Ensure that site specific component installation locations are accurate, especially the Hadoop configuration path. Configuration variables that use a filesystem path try to have reasonable defaults, but it's always safe to specify a full and complete path.
- Verify that the hcat executable is in the PATH.
- Build Templeton by running ant jar from the top level HCatalog directory.
- Start the server with bin/templeton_server.sh start.
- Stop the server with bin/templeton_server.sh stop.
- Run the end-to-end tests with ant e2e.

The server requires some files be accessible on the Hadoop distributed cache. For example, to avoid installing Pig and Hive everywhere on the cluster, the server gathers a version of Pig or Hive from the Hadoop distributed cache whenever those resources are invoked. After placing the following components into HDFS, please update the site configuration as required for each.

- Copy hadoop-streaming.jar into HDFS, substituting your path to the jar as needed.
- A version of the ugi fix jar is included in the source tree (templeton/src/hadoop_temp_fix/ugi.jar) and should be placed into HDFS, as reflected in the current default configuration.

The location of these files in the cache, and the location of the installations inside the archives, can be specified using the following Templeton configuration variables. (See the Configuration documentation for more information on changing Templeton configuration parameters.)
| Name | Default | Description |
|---|---|---|
| templeton.pig.archive | hdfs:///user/templeton/pig-0.9.2.tar.gz | The path to the Pig archive. |
| templeton.pig.path | pig-0.9.2.tar.gz/pig-0.9.2/bin/pig | The path to the Pig executable. |
| templeton.hive.archive | hdfs:///user/templeton/hcatalog-0.3.0.tar.gz | The path to the Hive archive. |
| templeton.hive.path | hcatalog-0.3.0.tar.gz/hcatalog-0.3.0/bin/hive | The path to the Hive executable. |
| templeton.streaming.jar | hdfs:///user/templeton/hadoop-streaming.jar | The path to the Hadoop streaming jar file. |
| templeton.override.jars | hdfs:///user/templeton/ugi.jar | Jars to add to the HADOOP_CLASSPATH for all MapReduce jobs. These jars must exist on HDFS. |
Permission must be given for the user running the Templeton executable to run jobs for other users. That is, the Templeton server will impersonate users on the Hadoop cluster.

Create (or assign) a Unix user who will run the Templeton server. Call this USER. See the Secure Cluster section below for choosing a user on a Kerberos cluster.

Modify the Hadoop core-site.xml file and set these properties:

| Variable | Value |
|---|---|
| hadoop.proxyuser.USER.groups | A comma separated list of the Unix groups whose users will be impersonated. |
| hadoop.proxyuser.USER.hosts | A comma separated list of the hosts that will run the hcat and JobTracker servers. |
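As a concrete illustration of the two properties above, the core-site.xml entries might look like the following sketch. The user name "templeton", the group list, and the host name are placeholders; substitute your own.

```xml
<!-- Hypothetical values: replace the proxy user, groups, and hosts with yours. -->
<property>
  <name>hadoop.proxyuser.templeton.groups</name>
  <value>users,hadoop</value>
</property>
<property>
  <name>hadoop.proxyuser.templeton.hosts</name>
  <value>templeton-host.example.com</value>
</property>
```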
To run Templeton on a secure cluster, follow the Permissions instructions above, but create a Kerberos principal for the Templeton server with the name USER/host@realm.

Also, set the Templeton configuration variables templeton.kerberos.principal and templeton.kerberos.keytab.
The following example, extracted from the HCatalog documentation, shows how people might use HCatalog along with various other Hadoop tools to move data from the grid into a database and ultimately analyze it.

Without Templeton there are three main steps to completing the task.

First, Joe in data acquisition uses distcp to get data onto the grid.

Second, Sally in data processing uses Pig to cleanse and prepare the data. Oozie will be notified by HCatalog that data is available and can then start the Pig job.

Third, Robert in client management uses Hive to analyze his clients' results.

With Templeton all these steps can be easily performed programmatically upon receipt of the initial data. Sally and Robert can still maintain their own scripts and simply push them into HDFS to be accessed when required by Templeton.
+ +Describe a single column in an HCatalog table.
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |

| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| column | A JSON object containing the column name, type, and comment (if any) |
Curl Command
JSON Output

Create a partition in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name='value' list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The location for partition creation | Required | None |
| ifNotExists | If true, you will not receive an error if the partition already exists. | Optional | false |

| Name | Description |
|---|---|
| partition | The partition name |
| table | The table name |
| database | The database name |
Curl Command
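A minimal sketch of the request, echoed rather than executed. Names and the location value are invented, and carrying the required "location" in a JSON body is an assumption here.

```shell
# Sketch only: names and location are hypothetical; the JSON body format
# is an assumption.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database/default/table/my_table/partition/country=%27algeria%27?user.name=ctdean"
echo curl -s -X PUT -H 'Content-type: application/json' -d '{"location":"loc_a"}' "$URL"
```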
JSON Output

Creates and queues a standard Hadoop MapReduce job.

http://www.myserver.com/templeton/v1/mapreduce/jar

| Name | Description | Required? | Default |
|---|---|---|---|
| jar | Name of the jar file for MapReduce to use. | Required | None |
| class | Name of the class for MapReduce to use. | Required | None |
| libjars | Comma separated jar files to include in the classpath. | Optional | None |
| files | Comma separated files to be copied to the MapReduce cluster. | Optional | None |
| arg | Set a program argument. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE | Optional | None |
| statusdir | A directory where Templeton will write the status of the MapReduce job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |

| Name | Description |
|---|---|
| id | A string containing the job ID, similar to "job_201110132141_0001". |
| info | A JSON object containing the information returned when the job was queued. See the Hadoop documentation (Class TaskController) for more information. |
Code and Data Setup
Curl Command
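A minimal sketch of the request, echoed rather than executed. The jar name, class, and arguments are invented, loosely mirroring the classic wordcount example.

```shell
# Sketch only: jar, class, args, and user are hypothetical.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/mapreduce/jar"
echo curl -s -d user.name=ctdean \
     -d jar=wordcount.jar -d class=org.myorg.WordCount \
     -d arg=wordcount/input -d arg=wordcount/output \
     "$URL"
```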
JSON Output

List the databases in HCatalog.

http://www.myserver.com/templeton/v1/ddl/database

| Name | Description | Required? | Default |
|---|---|---|---|
| like | List only databases whose names match the specified pattern | Optional | "*" (List all) |

| Name | Description |
|---|---|
| databases | A list of database names |
Curl Command
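A minimal sketch of the request, echoed rather than executed; server and user are placeholders.

```shell
# Sketch only: add a "like" query parameter to narrow the listing.
SERVER="http://www.myserver.com"
URL="$SERVER/templeton/v1/ddl/database?user.name=ctdean"
echo curl -s "$URL"
```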
JSON Output

The configuration for Templeton merges the normal Hadoop configuration with the Templeton specific variables. Because Templeton is designed to connect services that are not normally connected, the configuration is more complex than might be desirable.

The Templeton specific configuration is split into two layers: templeton-default.xml, which ships with Templeton and holds the default values, and templeton-site.xml, which holds the site specific overrides. The configuration files are loaded in this order, with later files overriding earlier ones.

Note: the Templeton server will require a restart after any change to the configuration.

To find the configuration files, Templeton first attempts to load a file from the CLASSPATH and then looks in the directory specified in the TEMPLETON_HOME environment variable.

Configuration files may access the special environment variable env for all environment variables. For example, the Pig executable could be specified using ${env.PIG_HOME}/bin/pig.

Configuration variables that use a filesystem path try to have reasonable defaults. However, it's always safe to specify the full and complete path if there is any uncertainty.

Note: The location of the log files created by Templeton and some other properties of the logging system are set in the templeton-log4j.properties file.
+ +| Name | Default | Description |
|---|---|---|
| templeton.port | +50111 |
+ The HTTP port for the main server. | +
| templeton.hadoop.config.dir | +$(env.HADOOP_CONFIG_DIR) |
+ The path to the Hadoop configuration. | +
| templeton.jar | ${env.TEMPLETON_HOME}/templeton/templeton-0.1.0-dev.jar | The path to the Templeton jar file. |
| templeton.libjars | ${env.TEMPLETON_HOME}/lib/zookeeper-3.3.4.jar | Jars to add to the classpath. |
| templeton.override.jars | hdfs:///user/templeton/ugi.jar | Jars to add to the HADOOP_CLASSPATH for all MapReduce jobs. These jars must exist on HDFS. |
| templeton.override.enabled | true | Enable the override path in templeton.override.jars. |
| templeton.streaming.jar | hdfs:///user/templeton/hadoop-streaming.jar | The HDFS path to the Hadoop streaming jar file. |
| templeton.hadoop | ${env.HADOOP_PREFIX}/bin/hadoop | The path to the Hadoop executable. |
| templeton.pig.archive | hdfs:///user/templeton/pig-0.9.2.tar.gz | The path to the Pig archive. |
| templeton.pig.path | pig-0.9.2.tar.gz/pig-0.9.2/bin/pig | The path to the Pig executable. |
| templeton.hcat | ${env.HCAT_PREFIX}/bin/hcat | The path to the HCatalog executable. |
| templeton.hive.archive | hdfs:///user/templeton/hcatalog-0.3.0.tar.gz | The path to the Hive archive. |
| templeton.hive.path | hcatalog-0.3.0.tar.gz/hcatalog-0.3.0/bin/hive | The path to the Hive executable. |
| templeton.hive.properties | hive.metastore.local=false, hive.metastore.uris=thrift://localhost:9933, hive.metastore.sasl.enabled=false | Properties to set when running Hive. |
| templeton.exec.encoding | UTF-8 | The encoding of the stdout and stderr data. |
| templeton.exec.timeout | 10000 | How long, in milliseconds, a program is allowed to run on the Templeton box. |
| templeton.exec.max-procs | 16 | The maximum number of processes allowed to run at once. |
| templeton.exec.max-output-bytes | 1048576 | The maximum number of bytes from stdout or stderr stored in RAM. |
| templeton.exec.envs | HADOOP_PREFIX,HADOOP_HOME,JAVA_HOME | The environment variables passed through to exec. |
| templeton.zookeeper.hosts | 127.0.0.1:2181 | ZooKeeper servers, as comma-separated host:port pairs. |
| templeton.zookeeper.session-timeout | 30000 | ZooKeeper session timeout in milliseconds. |
| templeton.callback.retry.interval | 10000 | How long to wait between callback retry attempts, in milliseconds. |
| templeton.callback.retry.attempts | 5 | How many times to retry the callback. |
| templeton.storage.class | org.apache.hcatalog.templeton.tool.ZooKeeperStorage | The class to use as storage. |
| templeton.storage.root | /templeton-hadoop | The path to the directory to use for storage. |
| templeton.hdfs.cleanup.interval | 43200000 | The maximum delay between a thread's cleanup checks. |
| templeton.hdfs.cleanup.maxage | 604800000 | The maximum age of a Templeton job. |
| templeton.zookeeper.cleanup.interval | 43200000 | The maximum delay between a thread's cleanup checks. |
| templeton.zookeeper.cleanup.maxage | 604800000 | The maximum age of a Templeton job. |
| templeton.kerberos.secret | A random value | The secret used to sign the HTTP cookie value. The default value is a random value. Unless multiple Templeton instances need to share the secret, the random value is adequate. |
| templeton.kerberos.principal | None | The Kerberos principal used by the server. As stated by the Kerberos SPNEGO specification, it should be USER/${HOSTNAME}@{REALM}. It has no default value. |
| templeton.kerberos.keytab | None | The keytab file containing the credentials for the Kerberos principal. |
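These properties follow the standard Hadoop configuration format. A minimal sketch of an override file is shown below; the file name templeton-site.xml and the values are illustrative assumptions, not taken from this document:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point Templeton at the ZooKeeper quorum used for job state storage
       (hosts are placeholders). -->
  <property>
    <name>templeton.zookeeper.hosts</name>
    <value>zk1.example.com:2181,zk2.example.com:2181</value>
  </property>
  <!-- Allow programs to run for up to 20 seconds on the Templeton box. -->
  <property>
    <name>templeton.exec.timeout</name>
    <value>20000</value>
  </property>
</configuration>
```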
Kill a job given its job ID. Substitute ":jobid" with the job ID received when the job was created.

http://www.myserver.com/templeton/v1/queue/:jobid
| Name | Description | Required? | Default |
|---|---|---|---|
| :jobid | The job ID to delete. This is the ID received when the job was created. | Required | None |
| Name | Description |
|---|---|
| status | A JSON object containing the job status information. See the Hadoop documentation (Class JobStatus) for more information. |
| profile | A JSON object containing the job profile information. See the Hadoop documentation (Class JobProfile) for more information. |
| id | The job ID. |
| parentId | The parent job ID. |
| percentComplete | The job completion percentage, for example "75% complete". |
| exitValue | The job's exit value. |
| user | User name of the job creator. |
| callback | The callback URL, if any. |
| completed | A string representing completed status, for example "done". |
Curl Command

JSON Output

Note: The job is not immediately deleted, so the information returned may not reflect deletion, as in our example. Use GET queue/:jobid to monitor the job and confirm that it is eventually deleted.
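The kill request can be sketched with curl. The server, job ID, and user name below are placeholders, not values taken from this document:

```shell
# Placeholder values -- substitute your server, job ID, and user name.
SERVER="http://www.myserver.com/templeton/v1"
JOBID="job_201111111311_0009"
USERNAME="ctdean"

# The actual request is a DELETE on the queue resource and needs a
# running Templeton server:
#   curl -s -X DELETE "$SERVER/queue/$JOBID?user.name=$USERNAME"
echo "DELETE $SERVER/queue/$JOBID?user.name=$USERNAME"
```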
Returns a list of supported versions and the current version.

http://www.myserver.com/templeton/v1/version

Only the standard parameters are accepted.

| Name | Description |
|---|---|
| supportedVersions | A list of all supported versions. |
| version | The current version. |

Curl Command

JSON Output
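A version request is a plain GET; a sketch with placeholder server and user values:

```shell
# Placeholder server name and user.
SERVER="http://www.myserver.com/templeton/v1"
USERNAME="ctdean"

# GET is curl's default method, so no -X flag is needed:
#   curl -s "$SERVER/version?user.name=$USERNAME"
echo "GET $SERVER/version?user.name=$USERNAME"
```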
Create a new HCatalog table. For more information, please refer to the Hive documentation.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The new table name | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists | Optional | false |
| comment | Comment for the table | Optional | None |
| columns | A list of column descriptions, including name, type, and an optional comment | Optional | None |
| partitionedBy | A list of column descriptions used to partition the table. Like the columns parameter, this is a list of name, type, and comment fields | Optional | None |
| clusteredBy | An object describing how to cluster the table, including the parameters columnNames, sortedBy, and numberOfBuckets. The sortedBy parameter includes the parameters columnName and order. For further information please refer to the examples below or to the Hive documentation | Optional | None |
| format | Storage format description, including parameters for rowFormat, storedAs, and storedBy. For further information please refer to the examples below or to the Hive documentation | Optional | None |
| location | The HDFS path | Optional | None |
| tableProperties | A list of table property names and values (key/value pairs) | Optional | None |
| Name | Description |
|---|---|
| table | The new table name |
| database | The database name |

Curl Command

Curl Command (using clusteredBy)

JSON Output

JSON Output (error)
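A create-table request can be sketched as a PUT with a JSON body describing the table. All names and values below (database, table, columns, user) are illustrative placeholders:

```shell
# Placeholder server, database, table, and user values.
SERVER="http://www.myserver.com/templeton/v1"
DB="default"
TABLE="test_table"
USERNAME="ctdean"

# The table description travels as a JSON request body; these column and
# partition names are made up for illustration.
BODY='{
  "comment": "Example table",
  "columns": [
    { "name": "id", "type": "bigint" },
    { "name": "price", "type": "float", "comment": "The unit price" }
  ],
  "partitionedBy": [
    { "name": "country", "type": "string" }
  ],
  "format": { "storedAs": "rcfile" }
}'

# The actual request needs a running Templeton server:
#   curl -s -X PUT -H 'Content-type: application/json' -d "$BODY" \
#       "$SERVER/ddl/database/$DB/table/$TABLE?user.name=$USERNAME"
echo "PUT $SERVER/ddl/database/$DB/table/$TABLE?user.name=$USERNAME"
```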
List the columns in an HCatalog table.

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

| Name | Description |
|---|---|
| columns | A list of column names and types |
| database | The database name |
| table | The table name |

Curl Command

JSON Output
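A column-listing request is a plain GET; a sketch with placeholder values:

```shell
# Placeholder server, database, table, and user values.
SERVER="http://www.myserver.com/templeton/v1"
DB="default"
TABLE="test_table"
USERNAME="ctdean"

# Needs a running Templeton server:
#   curl -s "$SERVER/ddl/database/$DB/table/$TABLE/column?user.name=$USERNAME"
echo "GET $SERVER/ddl/database/$DB/table/$TABLE/column?user.name=$USERNAME"
```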
This document describes the HCatalog REST API. As shown in the figure below, developers make HTTP requests to access Hadoop MapReduce, Pig, Hive, and HCatalog DDL from within applications. Data and code used by this API are maintained in HDFS. HCatalog DDL commands are executed directly when requested. MapReduce, Pig, and Hive jobs are placed in a queue and can be monitored for progress or stopped as required. Developers specify a location in HDFS into which Pig, Hive, and MapReduce results should be placed.
HCatalog's REST resources are accessed using the following URL format:

http://yourserver/templeton/v1/resource

where "yourserver" is replaced with your server name, and "resource" is replaced by the HCatalog resource name.

For example, to check if the server is running you could access the following URL:

http://www.myserver.com/templeton/v1/status
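Such a status check can be sketched with curl (the server name is a placeholder):

```shell
# Placeholder server name.
SERVER="http://www.myserver.com/templeton/v1"

# Needs a running Templeton server:
#   curl -s "$SERVER/status"
echo "GET $SERVER/status"
```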
The current version supports two types of security:

Every REST resource can accept the following parameters to aid in authentication:

If the user.name parameter is not supplied when required, the following error will be returned:
Data and code that are used by HCatalog's REST resources must first be placed in Hadoop. When placing files into HDFS is required, you can use whatever method is most convenient for you. We suggest WebHDFS, since it provides a REST interface for moving files into and out of HDFS.
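For example, uploading a script with WebHDFS might look like the sketch below. The name node host/port and target path are placeholder assumptions; WebHDFS answers the first PUT with a redirect to a data node, and the file body is sent there:

```shell
# Placeholder name node host/port, target path, and user.
NAMENODE="http://namenode.example.com:50070"
DEST="/user/templeton/test.pig"
USERNAME="ctdean"

# WebHDFS file creation is a two-step dance:
#   curl -i -X PUT "$NAMENODE/webhdfs/v1$DEST?op=CREATE&user.name=$USERNAME"
#   curl -i -X PUT -T test.pig "<redirect location from the first response>"
echo "PUT $NAMENODE/webhdfs/v1$DEST?op=CREATE&user.name=$USERNAME"
```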
The server returns the following HTTP status codes.

Other data returned directly by the server is in JSON format. JSON responses are limited to 1MB in size. Responses over this limit must be stored into HDFS using the provided options instead of being returned directly. If an HCatalog DDL command might return results greater than 1MB, it is suggested that a corresponding Hive request be executed instead.
The server creates three log files when in operation:

In the templeton-log4j.properties file you can set the location of these logs using the variable templeton.log.dir. This log4j.properties file is set in the server startup script.
The original work to add REST APIs to HCatalog was called Templeton. For backward compatibility the name still appears in URLs, log file names, etc. The Templeton name is taken from a character in the award-winning children's novel Charlotte's Web, by E. B. White. The novel's protagonist is a pig named Wilbur. Templeton is a rat who helps Wilbur by running errands and making deliveries as requested by Charlotte while she spins her web.
Create and queue a Pig job.

http://www.myserver.com/templeton/v1/pig

| Name | Description | Required? | Default |
|---|---|---|---|
| execute | String containing an entire, short Pig program to run. | One of either "execute" or "file" is required | None |
| file | HDFS file name of a Pig program to run. | One of either "execute" or "file" is required | None |
| arg | Set a program argument. | Optional | None |
| files | Comma separated files to be copied to the map reduce cluster | Optional | None |
| statusdir | A directory where Templeton will write the status of the Pig job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |
| Name | Description |
|---|---|
| id | A string containing the job ID, similar to "job_201110132141_0001". |
| info | A JSON object containing the information returned when the job was queued. See the Hadoop documentation (Class TaskController) for more information. |
Code and Data Setup

Curl Command

JSON Output
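A Pig submission can be sketched as a POST of form fields matching the parameters above. The script name, status directory, and user are placeholders, and the script is assumed to be already uploaded to HDFS:

```shell
# Placeholder server, HDFS script name, status directory, and user.
SERVER="http://www.myserver.com/templeton/v1"
SCRIPT="todos.pig"
STATUSDIR="pig.output"
USERNAME="ctdean"

# curl's -d flags send the parameters as POST form fields; needs a
# running Templeton server:
#   curl -s -d user.name=$USERNAME -d file=$SCRIPT -d arg=-v \
#       -d statusdir=$STATUSDIR "$SERVER/pig"
echo "POST $SERVER/pig file=$SCRIPT statusdir=$STATUSDIR"
```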