Myriad / MYRIAD-137

Resources offered by Mesos stay blocked by the Myriad framework after a NullPointerException or after flexing down an FGS NM.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Myriad 0.1.0
    • Component/s: Scheduler
    • Labels: None

    Description

      I observed this issue on two occasions after flexing down an FGS NM, and on another occasion after a NullPointerException occurred (MYRIAD-135).

      In the Mesos UI, no resources were left to offer, even though nothing was being utilized in the cluster except 3 NMs (2 MP, 1 ZP).

      While debugging the RM logs, I found the NullPointerException, which caused the OfferEventHandler thread to exit. After that, Myriad received no more offers from Mesos.
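
The failure mode described above, where one uncaught exception ends the thread that consumes offers, can be illustrated with a small model. This is a sketch in Python, not Myriad code: `handle_offer`, `run_handler` and `simulate` are hypothetical names, and a `TypeError` on a `None` offer stands in for the Java NullPointerException. The guarded variant shows one common defensive pattern (catch, record, continue); it is not necessarily the fix that was applied in Myriad.

```python
import queue
import threading

STOP = "STOP"  # sentinel that tells the handler loop to finish


def handle_offer(offer, processed):
    # Hypothetical handler: a missing offer plays the role of the null
    # reference that triggered the NullPointerException.
    if offer is None:
        raise TypeError("offer is None")
    processed.append(offer)  # stand-in for accepting or declining the offer


def run_handler(offers, processed, guarded):
    while True:
        offer = offers.get()
        if offer == STOP:
            return
        if guarded:
            try:
                handle_offer(offer, processed)
            except Exception as exc:
                # Record the failure and keep consuming offers.
                processed.append(f"error: {exc}")
        else:
            # An exception here ends the thread; Python prints the
            # traceback to stderr and later offers are never handled.
            handle_offer(offer, processed)


def simulate(guarded):
    offers = queue.Queue()
    for item in ["O1", None, "O2", STOP]:
        offers.put(item)
    processed = []
    worker = threading.Thread(target=run_handler, args=(offers, processed, guarded))
    worker.start()
    worker.join()
    return processed
```

With `simulate(False)` the worker dies on the bad offer and `"O2"` is never handled, while `simulate(True)` records the error and goes on to handle `"O2"`.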

      I then restarted the RM, and the resources were returned to Mesos.

      Next, I ran a few MapReduce jobs and hit the issue when flexing down an FGS NM: all of the resources offered to Myriad became blocked, and Myriad did not release any of them afterwards.

      So it seems that the flex-down procedure for NMs only cleans up the active containers and the NM itself. It does not clean up outstanding offers when those offers were saved to OfferLifeCycle for future tasks by FGS NMs.
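
The suspected gap can be sketched as a toy model: offers held for a host stay held unless flex down also releases them. Everything here is an assumption made for illustration. `OfferStore`, `Scheduler`, `flex_down` and the `release_offers` flag are hypothetical names and do not reflect Myriad's actual classes or the change that resolved this issue.

```python
class OfferStore:
    """Hypothetical stand-in for the place where offers are kept for later use."""

    def __init__(self):
        self._held = {}  # hostname -> list of offer ids

    def hold(self, host, offer_id):
        self._held.setdefault(host, []).append(offer_id)

    def release(self, host):
        # Remove and return every offer held for this host.
        return self._held.pop(host, [])

    def held_count(self):
        return sum(len(ids) for ids in self._held.values())


class Scheduler:
    def __init__(self):
        self.store = OfferStore()
        self.node_managers = set()
        self.declined = []  # offers handed back to the resource manager

    def flex_down(self, host, release_offers):
        # Remove the NM itself (container cleanup is omitted in this model).
        self.node_managers.discard(host)
        if release_offers:
            # The step the report suspects is missing: give back held offers.
            self.declined.extend(self.store.release(host))


def demo(release_offers):
    scheduler = Scheduler()
    scheduler.node_managers.add("host-a")
    scheduler.store.hold("host-a", "offer-1")
    scheduler.store.hold("host-a", "offer-2")
    scheduler.flex_down("host-a", release_offers)
    return scheduler.store.held_count(), scheduler.declined
```

In this model `demo(False)` leaves both offers held after the NM is gone, which mirrors the blocked state in the report, while `demo(True)` returns them.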

      Resources (from the mesos-master UI)
      =========

                 CPUs                     Mem
      Total      84                       253.9 GB
      Used       3.300                    6.1 GB
      Offered    80.700                   247.8 GB
      Idle       1.4210854715202004e-14   0 B        <------ No resources available.
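
As a quick check of the figures above, Used plus Offered accounts for the whole cluster, and the odd-looking idle CPU value is floating-point rounding residue rather than real capacity. The snippet below is a sketch that assumes the displayed numbers are exact; the variable names are illustrative.

```python
import math

# Figures copied from the mesos-master UI listing above.
total_cpus, used_cpus, offered_cpus = 84.0, 3.3, 80.7
total_mem_gb, used_mem_gb, offered_mem_gb = 253.9, 6.1, 247.8
reported_idle_cpus = 1.4210854715202004e-14

idle_cpus = total_cpus - used_cpus - offered_cpus
idle_mem_gb = total_mem_gb - used_mem_gb - offered_mem_gb

# Both remainders are zero up to floating-point rounding.
cpus_exhausted = math.isclose(idle_cpus, 0.0, abs_tol=1e-9)
mem_exhausted = math.isclose(idle_mem_gb, 0.0, abs_tol=1e-9)

# The reported idle value is 2**-46, the spacing between adjacent
# double-precision numbers in the range 64 to 128 (which contains 84).
idle_is_rounding_noise = reported_idle_cpus == 2.0 ** -46
```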

      Here are the active (blocked) offers shown in the Mesos UI:

      Offers
      =====

      ID                 Framework     Host          CPUs    Mem
      …5050-3270-O4151   MyriadAlpha   node101-116   0.5     64 MB
      …5050-3270-O4149   MyriadAlpha   node101-116   0.200   282 MB
      …5050-3270-O4147   MyriadAlpha   node101-116   1       1.0 GB
      …5050-3270-O4145   MyriadAlpha   node101-116   1       1.0 GB
      …5050-3270-O4143   MyriadAlpha   node101-116   1       1.0 GB
      …5050-3270-O4141   MyriadAlpha   node101-116   1       1.0 GB
      …5050-3270-O4139   MyriadAlpha   node101-117   24.5    87.8 GB
      …5050-3270-O4137   MyriadAlpha   node101-116   22.9    87.4 GB
      …5050-3270-O4135   MyriadAlpha   node101-117   3       3.0 GB
      …5050-3270-O4134   MyriadAlpha   node101-137   25.6    65.2 GB
          People

            smarella Santosh Marella
            sarjeet Sarjeet Singh