Pig / PIG-2921

Provide a bulkloadable option in HBaseStorage

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.2
    • Fix Version/s: None
    • Component/s: data
    • Labels: None

    Description

      Right now, Pig's HBaseStorage writes Puts directly into HBase. This is slow for bulk operations, which are exactly the kind of work Pig does. Puts and Deletes are meant more for real-time operations, so it would be nice if Pig had an automatic mechanism to prepare bulk-loadable HFiles for the target table and bulk load them at the end of the job.
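
      To illustrate what "preparing bulk-loadable HFiles" involves, here is a minimal Python sketch of the core idea only: group the output rows by the target region and sort each group by row key, so that each group could be written as one file per region. It does not use the HBase or Pig APIs. The function name `partition_for_bulk_load`, the sample rows, and the split keys are all hypothetical; an actual implementation would presumably build on HBase's own HFile output and bulk load tooling.

```python
from bisect import bisect_right
from collections import defaultdict


def partition_for_bulk_load(rows, split_keys):
    """Group (row_key, value) pairs by target region and sort each group.

    split_keys is the sorted list of region boundaries (the start keys of
    every region except the first). Region i covers the key range
    [split_keys[i-1], split_keys[i]). All names here are hypothetical.
    """
    regions = defaultdict(list)
    for row_key, value in rows:
        # bisect_right returns the index of the region whose range
        # contains row_key (a key equal to a boundary starts that region).
        region_index = bisect_right(split_keys, row_key)
        regions[region_index].append((row_key, value))
    # Files meant for bulk loading hold rows in sorted key order,
    # so sort each region's rows before they would be written out.
    return {idx: sorted(group) for idx, group in sorted(regions.items())}


# Hypothetical job output and table layout (three regions).
rows = [(b"pear", b"3"), (b"apple", b"1"), (b"zebra", b"9"), (b"mango", b"2")]
split_keys = [b"m", b"t"]
files = partition_for_bulk_load(rows, split_keys)
```

      Each entry in `files` stands in for one file destined for one region. Handing such pre-sorted files to the table at the end of the job is what would replace the per-row Puts.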

      For compatibility reasons, this can be optional and turned off by default until it is agreed that it should be the default, at which point an option to turn it off can still be provided.
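
      The opt-in behaviour could look roughly like the following sketch, which parses a space-separated option string and leaves bulk loading off unless it is explicitly requested. The `-bulkload` flag and the `parse_storage_options` function are assumptions made for illustration; no such option is defined in this issue.

```python
import shlex


def parse_storage_options(option_string):
    """Parse a space-separated option string.

    The hypothetical "-bulkload" flag defaults to off, so existing
    scripts keep their current behaviour unless they opt in.
    """
    options = {"bulkload": False}  # hypothetical option, off by default
    for token in shlex.split(option_string):
        if token == "-bulkload":
            options["bulkload"] = True
    return options


default_options = parse_storage_options("")
opted_in = parse_storage_options("-caster HBaseBinaryConverter -bulkload")
```

      If the default were later flipped, the same parser could accept an explicit off switch instead, which matches the compatibility path described above.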

            People

              Assignee: Unassigned
              Reporter: Harsh J (qwertymaniac)
              Votes: 2
              Watchers: 9
