The HCatalog IMPORT and EXPORT commands enable you to move a table's data and metadata between HCatalog instances, or between an HCatalog instance and a plain Hadoop cluster.
The output location of the exported dataset is a directory that has the following structure:

Note that this directory structure can be created using the EXPORT command as well as HCatEximOutputFormat for MapReduce or HCatEximStorer for Pig. The data can then be consumed using the IMPORT command as well as HCatEximInputFormat for MapReduce or HCatEximLoader for Pig.
Exports a table to a specified location.

Syntax:

    EXPORT TABLE tablename [PARTITION (partcol1=val1, partcol2=val2, ...)] TO 'filepath'
Parameters:

TABLE tablename
    The table to be exported. The table can be a simple table or a partitioned table. If the table is partitioned, you can export a specific partition by specifying values for all of the partitioning columns, or export a subset of the partitions by specifying a subset of the partition column/value specifications; in that case the conditions are implicitly ANDed to filter the partitions to be exported.

PARTITION (partcol=val ...)
    The partition column/value specifications.

TO 'filepath'
    The filepath (in single quotes) designating the location for the exported table. The file path can be a relative path, an absolute path, or a full URI with scheme and (optionally) an authority.
The EXPORT command exports a table's data and metadata to the specified location. Because the command actually copies the files defined for the table or its partitions, you should be aware of the following:

Also, note the following:
The examples assume the following tables:
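The original table definitions were lost in extraction; a plausible sketch, assuming a simple table and a partitioned table whose partition columns match the country/state values used in the examples below (all names and columns are assumptions):

```sql
-- Hypothetical: a simple, unpartitioned table.
CREATE TABLE department (dept_id INT, dept_name STRING);

-- Hypothetical: a table partitioned by the country and state
-- columns referenced in the export examples.
CREATE TABLE employee (emp_id INT, emp_name STRING)
    PARTITIONED BY (country STRING, state STRING);
```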
Example 1

This example exports the entire table to the target location. The table and the exported copy are now independent; any further changes to the table (data or metadata) do not affect the exported copy, and the exported copy can be manipulated or deleted without any effect on the table.
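The statement itself was lost in extraction; a minimal sketch, assuming a simple table named department and a hypothetical target path:

```sql
-- Export the whole table (data and metadata) to the given directory.
-- The table name and path are assumptions, not from the original.
EXPORT TABLE department TO 'exports/department';
```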
Example 2

This example exports the entire table, including all partitions' data and metadata, to the target location.
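A minimal sketch of the lost statement, assuming a partitioned table named employee and a hypothetical target path:

```sql
-- Exporting a partitioned table with no PARTITION clause copies
-- every partition's data and metadata. Names are assumptions.
EXPORT TABLE employee TO 'exports/employee';
```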
Example 3

This example exports a subset of the partitions (those with country = in) to the target location.
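A sketch of the lost statement, with assumed table name and path:

```sql
-- Specifying only some of the partition columns selects every
-- partition matching them; here, all partitions with country = 'in'.
EXPORT TABLE employee PARTITION (country='in') TO 'exports/employee_in';
```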
Example 4

This example exports a single partition (country = in, state = tn) to the target location.
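A sketch of the lost statement, with assumed table name and path:

```sql
-- Specifying values for all partition columns selects exactly
-- one partition. Names and path are assumptions.
EXPORT TABLE employee PARTITION (country='in', state='tn')
    TO 'exports/employee_in_tn';
```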
Imports a table from a specified location.

Syntax:

    IMPORT [[EXTERNAL] TABLE tablename [PARTITION (partcol1=val1, partcol2=val2, ...)]] FROM 'filepath' [LOCATION 'tablepath']
Parameters:

EXTERNAL
    Indicates that the imported table is an external table.

TABLE tablename
    The target to be imported, either a table or a partition. If the table is partitioned, you can import a specific partition by specifying values for all of the partitioning columns, or import all the (exported) partitions by not specifying any partition parameters in the command.

PARTITION (partcol=val ...)
    The partition column/value specifications.

FROM 'filepath'
    The filepath (in single quotes) designating the source location the table will be copied from. The file path can be a relative path, an absolute path, or a full URI with scheme and (optionally) an authority.

LOCATION 'tablepath'
    (Optional) The tablepath (in single quotes) designating the target location the table will be copied to. If not specified, the data of a managed table is copied to the warehouse location of the current database, while an external table is imported in place (its data stays at the source location).
The IMPORT command imports a table's data and metadata from the specified location. The table can be a managed table (data and metadata are both removed on drop table/partition) or an external table (only metadata is removed on drop table/partition). For more information, see Hive's Create/Drop Table.
Because the command actually copies the files defined for the table or its partitions, you should be aware of the following:

Also, note the following:

The examples assume the following tables:
Example 1

This example imports the table as a managed table in the default location. The metadata is stored in the metastore and the table's data files in the warehouse location of the current database.
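The statement itself was lost in extraction; a minimal sketch, assuming the dataset was exported earlier to a hypothetical path:

```sql
-- Import into a managed table; with no LOCATION clause the data
-- files land in the warehouse directory of the current database.
IMPORT TABLE department FROM 'exports/department';
```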
Example 2

This example imports the table as a managed table in the default location, giving the imported table a new name.
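A sketch of the lost statement; the new table name and path are assumptions:

```sql
-- The table name given in the command need not match the name
-- the table was exported under.
IMPORT TABLE imported_department FROM 'exports/department';
```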
Example 3

This example imports the table as an external table, imported in place. The metadata is copied to the metastore.
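A sketch of the lost statement, with assumed names:

```sql
-- EXTERNAL with no LOCATION clause: only metadata is registered
-- in the metastore; the data stays where the export left it.
IMPORT EXTERNAL TABLE department FROM 'exports/department';
```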
Example 4

This example imports the table as an external table at another location. The metadata is copied to the metastore.
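A sketch of the lost statement; both paths are assumptions:

```sql
-- EXTERNAL with LOCATION: the data is copied to the given path
-- and the external table's metadata points there.
IMPORT EXTERNAL TABLE department FROM 'exports/department'
    LOCATION 'external/department';
```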
Example 5

This example imports the table as a managed table in a non-default location. The metadata is copied to the metastore.
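A sketch of the lost statement; both paths are assumptions:

```sql
-- A managed table imported to an explicit, non-default location
-- instead of the database's warehouse directory.
IMPORT TABLE department FROM 'exports/department'
    LOCATION 'custom/department';
```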
Example 6

This example imports all the exported partitions, since the source was a partitioned table.
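A sketch of the lost statement, assuming a partitioned export at a hypothetical path:

```sql
-- Importing from a partitioned export with no PARTITION clause
-- brings in every exported partition.
IMPORT TABLE employee FROM 'exports/employee';
```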
Example 7

This example imports only the specified partition.
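A sketch of the lost statement, with assumed names and partition values:

```sql
-- Only the partition named in the PARTITION clause is imported
-- from the exported dataset.
IMPORT TABLE employee PARTITION (country='in', state='tn')
    FROM 'exports/employee';
```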
HCatEximOutputFormat and HCatEximInputFormat can be used in Hadoop environments where no HCatalog instance is available. HCatEximOutputFormat can be used to create an 'exported table' dataset, which can later be imported into an HCatalog instance. It can also later be read via HCatEximInputFormat or HCatEximLoader.

The user specifies the parameters of the table to be created by means of the setOutput method. The metadata and the data files are created in the specified location. The target location must be empty and the user must have write access.

The user specifies the data collection location, and optionally a filter for the partitions to be loaded, via the setInput method. Optionally, the user can also specify the projection columns via the setOutputSchema method. The source location must have the correct layout for an exported table, and the user must have read access.
HCatEximStorer and HCatEximLoader can be used in Hadoop/Pig environments where no HCatalog instance is available. HCatEximStorer can be used to create an 'exported table' dataset, which can later be imported into an HCatalog instance. It can also later be read via HCatEximInputFormat or HCatEximLoader.

The HCatEximStorer is initialized with the output location for the exported table. Optionally, the user can specify the partition specification for the data and rename the schema elements as part of the storer. The rest of the storer semantics follow the same design as HCatStorer.
The HCatEximLoader is passed the location of the exported table, as usual, by the LOAD statement. The loader loads the metadata and data as required from that location. Note that partition filtering is not done efficiently when HCatEximLoader is used: the filtering is done at the record level rather than at the file level. The rest of the loader semantics follow the same design as HCatLoader.
Use Case 1

Transfer data between different HCatalog/Hadoop instances, with no renaming of tables.

Use Case 2

Transfer data to a Hadoop instance which does not have HCatalog and process it there.

Use Case 3

Create an exported dataset in a Hadoop instance which does not have HCatalog and then import it into HCatalog in a different instance.