[SQOOP-685] Support HBase bulk loading as another way to load data into HBase - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.0.0
Fix Version/s: 2.0.0
Component/s: hbase-integration
Labels: None

Description

HBase has a bulk loading feature that Sqoop can use to stage files and then bulk load them into HBase. This is preferable for large amounts of data, because the normal CRUD-based API is otherwise quickly overloaded. See the HBase-supplied ImportTsv.java and its use of the "importtsv.bulk.output" command line option, which shows how to switch easily between direct API import and bulk file staging.
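As a rough illustration of the switch described above, the sketch below picks a load mode depending on whether a staging directory option is set, which is the pattern ImportTsv follows with "importtsv.bulk.output". It uses only the Java standard library; the class name, the enum, and the staging path are hypothetical, and a real Sqoop implementation would configure the matching Hadoop output format instead of returning an enum.

```java
import java.util.Properties;

// Hypothetical sketch: choose between direct API writes and bulk file staging.
class LoadModeSketch {
    enum LoadMode { DIRECT_PUT, BULK_STAGING }

    // Option name used by ImportTsv; reusing it in Sqoop is an assumption.
    static final String BULK_OUTPUT_KEY = "importtsv.bulk.output";

    // If a staging directory is configured, stage files for a later bulk load;
    // otherwise write through the normal client API.
    static LoadMode chooseMode(Properties conf) {
        String stagingDir = conf.getProperty(BULK_OUTPUT_KEY);
        boolean hasStagingDir = stagingDir != null && !stagingDir.trim().isEmpty();
        return hasStagingDir ? LoadMode.BULK_STAGING : LoadMode.DIRECT_PUT;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(chooseMode(conf)); // no staging directory set

        conf.setProperty(BULK_OUTPUT_KEY, "/tmp/sqoop-staging"); // hypothetical path
        System.out.println(chooseMode(conf)); // staging directory set
    }
}
```

In bulk staging mode the job only writes files; a separate step (in HBase, the completebulkload tool) then moves the staged files into the table's regions.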

It might be necessary to add a step to Sqoop that samples the data and presplits the table into the right number of regions before the initial load. This could be done here or as a separate issue.
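One possible shape for the sampling step is sketched below: sort a sample of row keys and take evenly spaced quantiles as region split points. This is only an illustration under stated assumptions. The class and method names are hypothetical, row keys are modeled as ASCII strings so that String ordering matches HBase's byte-wise key ordering, and how the sample is drawn from the source is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: derive region split points from sampled row keys.
class PresplitSketch {
    // Returns at most numRegions - 1 split keys taken at evenly spaced
    // positions in the sorted sample. Repeated keys are emitted only once.
    static List<String> splitPoints(List<String> sampledKeys, int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be >= 1");
        }
        List<String> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);

        List<String> splits = new ArrayList<>();
        if (sorted.isEmpty()) {
            return splits;
        }
        for (int i = 1; i < numRegions; i++) {
            // Index of the i-th quantile; always < sorted.size() because i < numRegions.
            int idx = (int) ((long) i * sorted.size() / numRegions);
            String candidate = sorted.get(idx);
            if (splits.isEmpty() || !splits.get(splits.size() - 1).equals(candidate)) {
                splits.add(candidate);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> sample = List.of("h", "a", "c", "b", "e", "d", "g", "f");
        System.out.println(splitPoints(sample, 4)); // prints [c, e, g]
    }
}
```

The resulting keys, converted to byte arrays, could then be passed as the split keys when the table is created, so that the staged files map onto regions of roughly equal size.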

People

Assignee: Unassigned

Reporter: Lars George

Votes: 0

Watchers: 2

Dates

Created: 07/Nov/12 09:34

Updated: 07/Nov/12 09:34