Apache Gora / GORA-117

gora-hbase has no mechanism to set the caching on a scanner, which makes for poor performance in map/reduce jobs

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.4
    • Component/s: gora-hbase
    • Labels: None

    Description

      goraci runs a map/reduce job over all the data that it generates. The HBase store uses a scanner that doesn't cache rows, so every row fetched requires its own RPC call. I experimented with

      scan.setCaching(1000);

      and goraci Verify ran about 30x faster.
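
      To illustrate why the caching value matters, here is a rough back-of-the-envelope model in plain Java. It assumes each scanner RPC returns at most `caching` rows and ignores scanner open/close calls, result-size limits, and region boundaries. The class name `ScanCachingModel`, the method `roundTrips`, and the row count are hypothetical and are not part of Gora or HBase; the only real API referenced is the `scan.setCaching(1000)` call quoted above.

```java
// Simplified model: number of scanner round trips needed to stream a table.
// Assumption: one RPC returns up to `caching` rows; all other costs are ignored.
final class ScanCachingModel {

    /** Returns ceil(rows / caching), the round trips needed to fetch all rows. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size, not a figure from this issue

        // One row per RPC, as described in this issue.
        System.out.println(roundTrips(rows, 1));    // prints 1000000

        // Same scan after something like scan.setCaching(1000).
        System.out.println(roundTrips(rows, 1000)); // prints 1000
    }
}
```

      The model only counts round trips, so it will not predict the measured speedup; larger caching values also cost more client memory per scanner, which is one reason the value should be configurable rather than hard-coded.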

      Attachments

        1. GORA-117.patch (4 kB, uploaded by alfonso.nishikawa)

          People

            stack Michael Stack
            ecn Eric C. Newton
            Votes: 1
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved: