  1. Apache YuniKorn
  2. YUNIKORN-6

Deadlock in the scheduler when restoring a node with reservations


Details

    • 0.8

    Description

      During testing we saw a deadlock in the scheduler that stopped all scheduling.
      The scheduler was doing two things at the same time:

      • normal scheduling routine
      • restoring reservations from a node

      The scheduler is compiled with race detection by default, and during the same test runs we also saw a data race logged.

      The stack trace and the data race log are attached (cleaned-up versions).

      Attachments

        1. datarace.log
          16 kB
          Wilfred Spiegelenburg
        2. deadlock_stack.log
          7 kB
          Wilfred Spiegelenburg

            People

              wilfreds Wilfred Spiegelenburg
              wilfreds Wilfred Spiegelenburg
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m