HIVE-3603: Enable client-side caching for scans on HBase


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: HBase Handler
    • Labels: None

    Description

      HBaseHandler sets up a TableInputFormat MapReduce job against HBase to read data in. The underlying implementation (in HBaseHandler.java) makes one RPC call per row key, which is very inefficient. We need to specify a client-side cache size on the scan so that each RPC returns a batch of rows.

      Note that HBase currently supports only row-count-based caching (there is no way to specify a memory limit). HBASE-6770 was created to address this.
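
      To illustrate why a row-count cache size matters, here is a minimal sketch of the round-trip arithmetic. It is a simplified model, not code from the patch: the class and method names are hypothetical, and it ignores the extra calls a real scanner makes to open, close, and detect the end of a scan. In the HBase client, the row-count setting corresponds to Scan.setCaching(int).

```java
public class ScanCachingSketch {

    /**
     * Simplified model (an assumption, not HBase internals): each scanner RPC
     * returns up to `caching` rows, so fetching `totalRows` rows takes
     * ceil(totalRows / caching) round trips.
     */
    public static long roundTrips(long totalRows, int caching) {
        if (totalRows < 0 || caching < 1) {
            throw new IllegalArgumentException("totalRows must be >= 0 and caching >= 1");
        }
        // Integer ceiling division without floating point.
        return (totalRows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.println("caching=" + caching + " round trips=" + roundTrips(rows, caching));
        }
    }
}
```

      With a caching value of 1, the model reproduces the one-RPC-per-row behavior described above. Because the limit is a row count, the memory held on the client still depends on row width, which is the gap HBASE-6770 is meant to cover.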

      Attachments

        1. HIVE-3603.D7761.1.patch
          9 kB
          Phabricator

          People

            Assignee: Navis Ryu (navis)
            Reporter: Karthik Ranganathan (karthik.ranga)
            Votes: 0
            Watchers: 14
