Key: HADOOP-5887
Type: New Feature
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Aaron Kimball
Reporter: Aaron Kimball
Votes: 1
Watchers: 1

Hadoop Common

Sqoop should create tables in Hive metastore after importing to HDFS

Created: 21/May/09 09:22 PM   Updated: 25/Sep/09 09:36 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.21.0

Time Tracking:
Not Specified

File Attachments:
HADOOP-5887.2.patch (text file, licensed for inclusion in ASF works), 2009-06-03 05:21 PM, Aaron Kimball, 52 kB
HADOOP-5887.patch (text file, licensed for inclusion in ASF works), 2009-05-21 09:23 PM, Aaron Kimball, 52 kB

Hadoop Flags: Reviewed
Release Note: New Sqoop argument --hive-import facilitates loading data into Hive.
Resolution Date: 23/Jun/09 04:34 PM


Description
Sqoop (HADOOP-5815) imports tables into HDFS. A straightforward enhancement is to then generate a Hive DDL statement that recreates the table definition in the Hive metastore, and to move the imported table from its upload target into the Hive warehouse directory.

This enhancement makes that process automatic. An import is performed with Sqoop in the usual way; supplying the argument "--hive-import" causes Sqoop to then issue a CREATE TABLE statement followed by a LOAD DATA ... INTO TABLE statement to a Hive shell. Sqoop generates a script file and attempts to run "$HIVE_HOME/bin/hive" on it or, failing that, any "hive" on the $PATH; $HIVE_HOME can be overridden with --hive-home. As a result, no direct linking against Hive is necessary.
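
The flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the HiveImport or TableDefWriter code from the patch: the class name HiveImportSketch, its method names, and the example table are hypothetical. It also assumes the script is handed to the Hive CLI with its "-f" option and omits details such as row-format clauses.

```java
import java.io.File;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these names do not come from the patch. */
final class HiveImportSketch {

    /** Builds a two-statement script: create the table, then load the imported files. */
    static String buildScript(String table, Map<String, String> columns, String hdfsDir) {
        StringBuilder sb = new StringBuilder("CREATE TABLE " + table + " (");
        boolean first = true;
        for (Map.Entry<String, String> col : columns.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            sb.append(col.getKey()).append(' ').append(col.getValue());
            first = false;
        }
        sb.append(");\n");
        sb.append("LOAD DATA INPATH '").append(hdfsDir)
          .append("' INTO TABLE ").append(table).append(";\n");
        return sb.toString();
    }

    /**
     * Picks the executable: the --hive-home override if given, else $HIVE_HOME.
     * Falls back to a bare "hive" so the OS resolves it through $PATH.
     */
    static String resolveHive(String hiveHomeOverride, String envHiveHome) {
        String home = (hiveHomeOverride != null) ? hiveHomeOverride : envHiveHome;
        if (home != null) {
            File candidate = new File(new File(home, "bin"), "hive");
            if (candidate.canExecute()) {
                return candidate.getPath();
            }
        }
        return "hive";
    }

    /** Command line that would run the script (assumes the Hive CLI "-f" option). */
    static List<String> buildCommand(String hiveBinary, String scriptPath) {
        return Arrays.asList(hiveBinary, "-f", scriptPath);
    }

    public static void main(String[] args) {
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("id", "INT");
        cols.put("name", "STRING");
        System.out.print(buildScript("employees", cols, "/user/example/employees"));
        String hive = resolveHive(null, System.getenv("HIVE_HOME"));
        // Prints the command instead of running it, so the sketch works without Hive installed.
        System.out.println(buildCommand(hive, "/tmp/hive-script.q"));
    }
}
```

Because the only contact with Hive is an external process and a text script, nothing here needs Hive classes on the classpath, which is the point the description makes about avoiding direct linking.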

The unit tests provided with this enhancement use a mock implementation of 'bin/hive' that compares the script it is fed against one from a directory of "expected" scripts. An environment variable selects which expected script is used. The mock does not load data into a real Hive metastore, but manual testing has shown that the process works in practice, so the mock is a reasonable unit-testing tool.
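
The comparison the mock performs might look something like the sketch below. It is written in Java for illustration and is not the 'bin/hive' file added by the patch; the class name MockHiveSketch and the environment variable name EXPECTED_SCRIPT are assumptions, since the description does not name the variable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical stand-in for a mock hive executable; names are illustrative. */
final class MockHiveSketch {

    /**
     * Returns 0 when the generated script matches the expected script line for line,
     * and 1 otherwise, mimicking a process exit code.
     */
    static int check(Path actualScript, Path expectedDir, String expectedName)
            throws IOException {
        if (expectedName == null || expectedName.isEmpty()) {
            return 1; // no expected script was selected
        }
        Path expected = expectedDir.resolve(expectedName);
        if (!Files.isRegularFile(expected)) {
            return 1; // the selected expected script does not exist
        }
        List<String> want = Files.readAllLines(expected);
        List<String> got = Files.readAllLines(actualScript);
        return want.equals(got) ? 0 : 1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MockHiveSketch <script> <expected-dir>");
            System.exit(2);
        }
        // EXPECTED_SCRIPT is an assumed variable name, e.g. "normalImport.q".
        String name = System.getenv("EXPECTED_SCRIPT");
        System.exit(check(Paths.get(args[0]), Paths.get(args[1]), name));
    }
}
```

A non-zero exit status lets the calling test treat a script mismatch the same way it would treat a failed Hive invocation.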



Subversion Commits

Repository: ASF
Revision: #787746
Date: Tue Jun 23 16:33:57 UTC 2009
User: tomwhite
Message: HADOOP-5887. Sqoop should create tables in Hive metastore after importing to HDFS. Contributed by Aaron Kimball.
Files Changed
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/scripts/numericImport.q
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/hive
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/util/Executor.java
MODIFY /hadoop/mapreduce/trunk/CHANGES.txt
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/manager/ConnManager.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/scripts/normalImport.q
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/bin/hive
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/ImportOptions.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/scripts/dateImport.q
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/test/org/apache/hadoop/sqoop/hive/TestHiveImport.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/util/LoggingStreamHandlerFactory.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/hive/HiveImport.java
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/test/org/apache/hadoop/sqoop/AllTests.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/util/NullStreamHandlerFactory.java
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/test/org/apache/hadoop/sqoop/testutil/ImportJobTestCase.java
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/Sqoop.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/test/org/apache/hadoop/sqoop/hive
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/hive/HiveTypes.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/util/StreamHandlerFactory.java
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/build.xml
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/scripts/failingImport.q
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/hive/TableDefWriter.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/bin
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata/hive/scripts
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/orm/ClassWriter.java
ADD /hadoop/mapreduce/trunk/src/contrib/sqoop/testdata
MODIFY /hadoop/mapreduce/trunk/src/contrib/sqoop/src/java/org/apache/hadoop/sqoop/manager/SqlManager.java