  1. Pig
  2. PIG-1828

HBaseStorage has problems with processing multiregion tables

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • None
    • None
    • None
    • Environment: Hadoop 0.20.2, HBase 0.20.6, distributed mode

    • pig, hbase, hbasestorage

    Description

      As brought up on the Pig user mailing list (http://www.mail-archive.com/user%40pig.apache.org/msg00606.html), Pig sometimes does not scan the full HBase table.
      It seems that HBaseStorage has problems scanning large tables: it issues just one mapper job instead of one mapper job per table region.
      Ian Stevens, who raised this issue on the mailing list, attached a script to reproduce the problem (https://gist.github.com/766929).
      In my case, however, the problem only occurred after the table had been split into more than one region.
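
The expected behaviour described above (one mapper per table region) can be illustrated with a small model. This is only a sketch in plain Python, not Pig or HBase code: the `Region` type, the function names, and the sample keys are all hypothetical, and the assumption that only the first region gets scanned is made purely for illustration, since the report does not say which rows are missed.

```python
from typing import List, Tuple

# Hypothetical model: a region is a (start_key, end_key) pair.
# An empty end_key means "unbounded", an empty start_key sorts first.
Region = Tuple[str, str]


def expected_splits(regions: List[Region]) -> List[Region]:
    """Return one input split per region, ordered by start key."""
    return sorted(regions, key=lambda region: region[0])


def rows_covered(splits: List[Region], row_keys: List[str]) -> int:
    """Count the row keys that fall inside at least one split's key range."""
    def inside(key: str, split: Region) -> bool:
        start, end = split
        return key >= start and (end == "" or key < end)

    return sum(any(inside(key, split) for split in splits) for key in row_keys)


# Illustrative table that has been split into two regions at key "m".
regions = [("m", ""), ("", "m")]
rows = ["a", "f", "m", "z"]

# One split per region covers every row.
full_scan = rows_covered(expected_splits(regions), rows)

# Assumed failure mode: a single split that spans only the first region.
partial_scan = rows_covered([("", "m")], rows)

print(full_scan, partial_scan)
```

In this model a single-region table is unaffected, because the one split already spans the whole key range, which is consistent with the problem only appearing after the table was split.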

          People

            dvryaboy Dmitriy V. Ryaboy
            mr_luk Lukas
