  1. Hive
  2. HIVE-7926 long-lived daemons for query fragment execution, I/O and caching
  3. HIVE-10617

LLAP: allocator occasionally has a spurious failure to allocate due to "partitioned" locking and has to retry

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: llap
    • Labels: None

    Description

      See HIVE-10482 and the comment in the code. Right now this is worked around by retrying.
      Simple case: a thread can reserve memory from the memory manager and then bounce between checking arena 1 and arena 2 while other threads allocate and deallocate from those arenas in the reverse order. Each arena then looks full at the moment it is checked, even though the memory was reserved successfully. More importantly, the same thing can happen while buddy blocks are being split during a burst of allocations.
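
The retry workaround described above can be sketched roughly as follows. This is a minimal illustration and not the actual LLAP allocator: the class `PartitionedArenaSketch`, its methods, and the `maxAttempts` parameter are hypothetical, and each arena is reduced to a free-byte counter instead of a real buddy allocator. It only shows why a successful reservation plus per-arena locks can still produce a scan in which every arena looks full.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: one lock per arena plus a global reservation counter. */
class PartitionedArenaSketch {
  private final long[] freeBytes;       // free space per arena, guarded by locks[i]
  private final ReentrantLock[] locks;  // "partitioned" locking: one lock per arena
  private final AtomicLong unreserved;  // global budget, stands in for the memory manager

  PartitionedArenaSketch(int arenaCount, long arenaSize) {
    freeBytes = new long[arenaCount];
    locks = new ReentrantLock[arenaCount];
    for (int i = 0; i < arenaCount; i++) {
      freeBytes[i] = arenaSize;
      locks[i] = new ReentrantLock();
    }
    unreserved = new AtomicLong(arenaCount * arenaSize);
  }

  /** Returns the index of the arena that served the request, or -1 on failure. */
  int allocate(long size, int maxAttempts) {
    if (!reserve(size)) {
      return -1; // the global budget is exhausted: a genuine failure
    }
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      for (int i = 0; i < freeBytes.length; i++) {
        if (tryArena(i, size)) {
          return i;
        }
      }
      // Every arena looked full although the reservation succeeded. Other threads
      // may have freed space in an arena already scanned and taken space from one
      // not yet scanned, so the scan is simply repeated.
      Thread.yield();
    }
    unreserved.addAndGet(size); // give the reservation back before reporting failure
    return -1;
  }

  void deallocate(int arena, long size) {
    locks[arena].lock();
    try {
      freeBytes[arena] += size;
    } finally {
      locks[arena].unlock();
    }
    unreserved.addAndGet(size); // release the budget only after the arena has the space
  }

  private boolean reserve(long size) {
    while (true) {
      long current = unreserved.get();
      if (current < size) {
        return false;
      }
      if (unreserved.compareAndSet(current, current - size)) {
        return true;
      }
    }
  }

  private boolean tryArena(int i, long size) {
    locks[i].lock();
    try {
      if (freeBytes[i] < size) {
        return false;
      }
      freeBytes[i] -= size;
      return true;
    } finally {
      locks[i].unlock();
    }
  }
}
```

In this sketch no single lock covers the whole scan, so the view of "all arenas" is never consistent; the bounded retry hides the problem but does not remove it.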

      This can be solved either with some form of helping (especially for the split case) or by making the allocator an "actor" (or a set of actors, each owning between one and N arenas) that satisfies allocation requests more deterministically and also gets rid of most of the synchronization.
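
One possible shape of the "actor" option, as a rough sketch only: a single thread owns all arena state and serves requests in submission order, so a request never observes a half-updated set of arenas and no per-arena locks are needed. The class `ArenaActorSketch` and its methods are hypothetical, a single-threaded executor stands in for the actor's mailbox, and arenas are again reduced to free-byte counters instead of buddy allocators.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: one actor thread owns every arena it is given. */
class ArenaActorSketch implements AutoCloseable {
  private final long[] freeBytes; // read and written only on the actor thread
  private final ExecutorService actor = Executors.newSingleThreadExecutor();

  ArenaActorSketch(int arenaCount, long arenaSize) {
    freeBytes = new long[arenaCount];
    Arrays.fill(freeBytes, arenaSize);
  }

  /** Completes with the index of the arena that served the request, or -1. */
  CompletableFuture<Integer> allocate(long size) {
    return CompletableFuture.supplyAsync(() -> {
      // Runs on the actor thread: the scan sees a consistent view of all arenas.
      for (int i = 0; i < freeBytes.length; i++) {
        if (freeBytes[i] >= size) {
          freeBytes[i] -= size;
          return i;
        }
      }
      return -1; // nothing was free when this request was processed
    }, actor);
  }

  /** Completes once the space has been returned to the given arena. */
  CompletableFuture<Void> deallocate(int arena, long size) {
    return CompletableFuture.runAsync(() -> freeBytes[arena] += size, actor);
  }

  @Override
  public void close() {
    actor.shutdown(); // already queued requests still run; new ones are rejected
  }
}
```

A caller would block with `allocate(size).join()` or chain further work on the returned future. The trade-off in this sketch is that all requests for the owned arenas are serialized through one thread, which is why the description suggests a set of actors with a few arenas each.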

          People

            Assignee: sershe Sergey Shelukhin
            Reporter: sershe Sergey Shelukhin
            Votes: 0
            Watchers: 1