  1. Apache Gora
  2. GORA-130

gora-accumulo caches tablet locations between map reduce jobs


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.2.1
    • Component/s: gora-accumulo
    • Labels: None

    Description

      Enis added a new Loop program to goraci that continually runs Generation and Verification map reduce jobs, so a single process launches multiple map reduce jobs. While running it, I noticed an issue. After the first round of generation, the table had 16 tablets, so verification ran with 16 mappers, one per tablet. Then more data was inserted and the table split into 32 tablets. When verification ran again, it started 16 mappers instead of 32. It turns out the gora-accumulo store was using stale cached information about the table to create the input splits for the map reduce job.

      This issue will not affect the simple usage pattern of a single Java process launching one map reduce job that reads from Accumulo.
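
      The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: FakeTable, TabletCache, and SplitPlanner are hypothetical names invented for the illustration and are not gora-accumulo or Accumulo classes. It shows how a per-process cache that is never refreshed keeps producing 16 splits after the table has grown to 32 tablets, and how invalidating the cache before planning restores the expected count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a table whose tablet count grows as data is inserted.
class FakeTable {
    private int tabletCount;

    FakeTable(int tabletCount) { this.tabletCount = tabletCount; }

    int currentTabletCount() { return tabletCount; }

    void splitTo(int newCount) { tabletCount = newCount; }
}

// Hypothetical per-process cache of tablet counts, keyed by table name.
class TabletCache {
    private final Map<String, Integer> cached = new HashMap<>();

    // Looks up the table only on the first call; later calls reuse the cached value.
    int tabletCount(String name, FakeTable table) {
        return cached.computeIfAbsent(name, k -> table.currentTabletCount());
    }

    void invalidate(String name) { cached.remove(name); }
}

// Plans one input split (and therefore one mapper) per tablet the cache reports.
class SplitPlanner {
    private final TabletCache cache = new TabletCache();

    List<String> planSplits(String name, FakeTable table, boolean refreshFirst) {
        if (refreshFirst) {
            cache.invalidate(name); // drop stale locations before planning
        }
        int n = cache.tabletCount(name, table);
        List<String> splits = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            splits.add(name + "-tablet-" + i);
        }
        return splits;
    }
}

class StaleSplitDemo {
    public static void main(String[] args) {
        FakeTable table = new FakeTable(16);
        SplitPlanner planner = new SplitPlanner();

        // First job in the process: the cache is empty, so the count is correct.
        System.out.println(planner.planSplits("ci", table, false).size());

        // The table splits between jobs, but the same process reuses its cache.
        table.splitTo(32);
        System.out.println(planner.planSplits("ci", table, false).size());

        // Invalidating before planning picks up the new tablets.
        System.out.println(planner.planSplits("ci", table, true).size());
    }
}
```

      In this sketch the first call is always correct, which matches the observation that a process launching a single job is unaffected. Only the second and later jobs in the same process see stale data unless the cache is refreshed first.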

          People

            Assignee: Keith Turner (kturner)
            Reporter: Keith Turner (kturner)
