Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects versions: 1.3.0, 1.4.3
None
None
Environment: Oracle 11gR2, Linux, Oracle Thin Client
Description
I am trying to import data from Oracle 11gR2 into HBase with Sqoop. The import runs, but CLOB columns do not make it into HBase.
Simplest test case:
In Oracle:
CREATE TABLE TABLE1 ( NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB );
INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval','clobval');
The Sqoop command I run is the following (the connect parameter is shortened here, but the real one works):
sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL --column-family d -m 1
The job runs OK; the only surprising part is the second-to-last line of its output, which reports 0 bytes transferred:
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.
However, looking at the table in HBase:
- hbase shell
Version 0.90.6-cdh3u4, r, Mon May 7 13:14:00 PDT 2012
hbase(main):001:0> scan 'table1'
ROW COLUMN+CELL
1 column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds
The CLOBCOL column is not there.
The problem can be worked around by appending a column-mapping parameter to the command:
--map-column-java CLOBCOL=String
With this parameter, the CLOB data gets into HBase:
hbase(main):001:0> scan 'table1'
ROW COLUMN+CELL
1 column=d:CLOBCOL, timestamp=1371135224197, value=clobval
1 column=d:STRCOL, timestamp=1371135224197, value=strval
1 row(s) in 0.5260 seconds
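For reference, the complete command with the workaround appended might look like the sketch below. It only reuses the flags already shown in this report; the JDBC URL stays shortened exactly as above, and the `sqoop_args` array name and the `echo` dry run are illustrative additions, not part of the original command.

```shell
#!/usr/bin/env bash
# Sketch: the import command from this report with the CLOB workaround appended.
# The connect string is elided in the report and is left elided here.
sqoop_args=(
  import
  --connect="jdbc:oracle:thin:..."
  --table TABLE1
  --hbase-table table1
  --hbase-create-table
  --hbase-row-key NUMCOL
  --column-family d
  -m 1
  --map-column-java CLOBCOL=String   # workaround: map the CLOB column to a Java String
)

# Dry run: print the assembled command for review instead of executing it.
echo sqoop "${sqoop_args[@]}"
```

To run the import for real, drop the `echo` and supply the full connect string. Keeping the arguments in an array makes it easy to add further `--map-column-java` mappings if a table has more than one CLOB column.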