  1. Mesos
  2. MESOS-7711

Master updates registry for reregistering agents even when they haven't been unreachable

    Details

    • Type: Bug
    • Status: Reviewable
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: master
    • Labels:
      None

      Description

      During a master failover we observed many registry updates, on average one for every two agents, as indicated by log lines such as:

      I0609 04:46:25.220196 48864 registrar.cpp:550] Successfully updated the registry in 42.904064ms
      


      In this case few agents had ever been unreachable, so most of these registry updates were redundant. Each registry update also incurs the time spent applying the operations:

      I0609 04:46:26.475761 48897 registrar.cpp:493] Applied 1 operations in 11.673082ms; attempting to update the registry
      


      Although these operations do not consume the Master actor's time, every agent reregistration is gated on and delayed by them. This could easily be avoided by checking the slaves.recovered field in Master.
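
      A minimal sketch of the proposed check, assuming that an agent still listed in slaves.recovered was admitted in the registry and never marked unreachable, so its reregistration needs no registry write. MasterSketch, reregister, and registryUpdates are hypothetical names for illustration and are not the actual Mesos code.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical, simplified stand-in for the master's agent bookkeeping.
// It is NOT the real mesos::internal::master::Master class.
struct MasterSketch {
  // Models slaves.recovered: agents read from the registry after a master
  // failover that have not yet reregistered.
  std::unordered_set<std::string> recovered;

  // Counts how many registry writes this sketch would have issued.
  int registryUpdates = 0;

  // Handles a reregistration. Returns true if a registry update was needed.
  bool reregister(const std::string& agentId) {
    // Assumption: an agent still in `recovered` is already admitted in the
    // registry, so the update can be skipped entirely.
    if (recovered.erase(agentId) > 0) {
      return false;
    }

    // Otherwise the agent may have been marked unreachable (or is unknown),
    // so the registry has to be updated before the agent is readmitted.
    ++registryUpdates;
    return true;
  }
};
```

      In this sketch only agents missing from the recovered set pay the cost of applying an operation and writing the registry; the others are readmitted immediately.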

        Activity

        xujyan Yan Xu added a comment - https://reviews.apache.org/r/60854/ https://reviews.apache.org/r/60400/ https://reviews.apache.org/r/60898/

          People

          • Assignee:
            xujyan Yan Xu
            Reporter:
            xujyan Yan Xu
            Shepherd:
            James Peach
          • Votes:
            0
            Watchers:
            2