Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Labels: wiki
Description
Some suggestions for the page https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad, which seems to be out of date:
1. It appears that the number of reduce tasks in the "Sort Data" phase must be exactly one more than the number of keys selected in the "Range Partitioning" step; otherwise you get an error like this:
Caused by: java.lang.IllegalArgumentException: Can't read partitions file
at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:91)
... 15 more
Caused by: java.io.IOException: Wrong number of partitions in keyset
at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:72)
... 15 more
If so, it would be helpful if the page pointed this out explicitly; a sketch of the relationship follows.
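To illustrate, here is a minimal HiveQL sketch of the reducer/key arithmetic, loosely following the wiki's own walkthrough (the 11-key/12-reducer split and the /tmp/hb_range_key_list path are the wiki's example values, assumed here rather than taken from this report):
-- Range Partitioning: the sampling query emits N boundary keys
-- (in the wiki's example, 11 keys written to /tmp/hb_range_key_list).
-- Sort Data: N split points define N + 1 ranges, so TotalOrderPartitioner
-- needs exactly N + 1 reducers, i.e. 12 here:
set mapred.reduce.tasks=12;
set hive.mapred.partitioner=org.apache.hadoop.hive.ql.io.HiveTotalOrderPartitioner;
set total.order.partitioner.path=/tmp/hb_range_key_list;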
2. It recommends using the loadtable.rb Ruby script to load the data into HBase, but running that script on newer versions of HBase (e.g. 0.90.3) fails with:
DISABLED!!!! Use completebulkload instead. See tail of http://hbase.apache.org/bulk-loads.html
The instructions should probably be changed to use completebulkload instead of this script.
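For instance, assuming the generated HFiles sit under /tmp/hbsort and the target table is called transactions (both names are placeholders, not from this report), the replacement invocation would look roughly like:
hadoop jar hbase-0.90.3.jar completebulkload /tmp/hbsort transactions
completebulkload is the driver alias for the LoadIncrementalHFiles tool, so running org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles directly with the same arguments should be equivalent.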