  1. HCatalog
  2. HCATALOG-623

Understanding how to use the HBase bulk import feature

Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5
    • Fix Version/s: None
    • Component/s: hbase
    • Labels: None

    Description

      I'm working through the use of HBaseBulkOutputFormat and I'm stuck. I have a simple example, written in Pig, that replicates the ImportTsv example from the HBase documentation. The ImportSequenceFile job fails because jars are missing from its classpath, so presumably I haven't configured something correctly.

      Below are the commands I run, followed by the error message. The data file, DDL, and Pig script are attached.

      $ hadoop fs -put simple.tsv /tmp/
      $ HCAT_CLASSPATH=$(hbase classpath) hcat -f simple.ddl
      $ PIG_CLASSPATH=$(hbase classpath) pig -v -useHCatalog simple.bulkload.pig
      

      Error message:

      2013-02-19 19:55:30,354 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.zookeeper.ZooKeeper in order to ship it to the cluster.
      2013-02-19 19:55:30,355 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hadoop.hbase.client.HTable in order to ship it to the cluster.
      2013-02-19 19:55:30,357 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hadoop.hive.ql.metadata.HiveException in order to ship it to the cluster.
      2013-02-19 19:55:30,358 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hcatalog.mapreduce.HCatOutputFormat in order to ship it to the cluster.
      2013-02-19 19:55:30,359 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hcatalog.hbase.HBaseHCatStorageHandler in order to ship it to the cluster.
      2013-02-19 19:55:30,360 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hadoop.hive.hbase.HBaseSerDe in order to ship it to the cluster.
      2013-02-19 19:55:30,361 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hadoop.hive.metastore.api.Table in order to ship it to the cluster.
      2013-02-19 19:55:30,363 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class interface org.apache.thrift.TBase in order to ship it to the cluster.
      2013-02-19 19:55:30,364 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class org.apache.hadoop.hbase.util.Bytes in order to ship it to the cluster.
      2013-02-19 19:55:30,365 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class com.facebook.fb303.FacebookBase in order to ship it to the cluster.
      2013-02-19 19:55:30,366 WARN org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for class class com.google.common.util.concurrent.ThreadFactoryBuilder in order to ship it to the cluster.
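
      One way to narrow this down is to check whether the classpath handed to Pig contains jars for the classes named in the warnings. The following is a minimal diagnostic sketch, not a fix: the helper name missing_jars is hypothetical, and the jar-name fragments in the usage comment are guesses based on the class names above, so they may not match the jar names in a given installation. Classpath entries that are directories or wildcards will not match a fragment even if they contain the jar.

```shell
# Print each jar-name fragment that does not appear in a classpath string.
# Usage: missing_jars "<classpath>" fragment...
# (missing_jars is a hypothetical helper, not part of HBase, Pig, or HCatalog.)
missing_jars() {
  cp="$1"
  shift
  for frag in "$@"; do
    # Split the classpath on ':' and look for an entry containing the fragment.
    if ! printf '%s\n' "$cp" | tr ':' '\n' | grep -qF -- "$frag"; then
      echo "$frag"
    fi
  done
}

# Example call; the fragments are assumed jar names, adjust to your install:
# missing_jars "$(hbase classpath)" zookeeper hbase hive-exec hive-metastore \
#   hive-hbase-handler hcatalog libthrift libfb303 guava
```

      Any fragment printed by the helper is a candidate for a jar that is absent from PIG_CLASSPATH; an empty result only shows that the names were found, not that the job can ship them.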
      

      Attachments

        1. simple.tsv
          0.1 kB
          Nick Dimiduk
        2. simple.ddl
          0.2 kB
          Nick Dimiduk
        3. simple.bulkload.pig
          0.2 kB
          Nick Dimiduk

            People

              Assignee: Nick Dimiduk (ndimiduk)
              Reporter: Nick Dimiduk (ndimiduk)