Apache Helix / HELIX-683

Clean monitoring cache when the Helix controller enables monitoring

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      We found a bug in cluster status reporting: the partition masterless duration can be reported incorrectly.

      The root cause is that the duration is calculated from the controller cache, and this cache is currently not cleaned when controller leadership changes. As a result, if controller A starts a mastership handoff that is interrupted, the start time stays in the cache until the next mastership handoff on the same partition. The duration of that later handoff is then calculated from the stale start time, so the reported value can be far larger than the real one.

      To fix it, we might consider clearing the cache when leadership changes.
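
      The failure mode and the proposed fix can be sketched as follows. This is a minimal illustration, not Helix's actual code: the class HandoffDurationCache and its methods are hypothetical names, and it assumes that a cached start time is not overwritten by a later handoff on the same partition, as the description above implies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the controller-side monitoring cache (not a Helix class).
class HandoffDurationCache {
    // Partition name -> timestamp (ms) at which the partition became masterless.
    private final Map<String, Long> handoffStartMs = new HashMap<>();

    // Record the handoff start. Assumption: an existing entry is kept,
    // which is what lets a stale start time survive an interrupted handoff.
    void recordHandoffStart(String partition, long nowMs) {
        handoffStartMs.putIfAbsent(partition, nowMs);
    }

    // Return the masterless duration in ms and drop the entry,
    // or -1 if no start time was recorded for the partition.
    long recordHandoffEnd(String partition, long nowMs) {
        Long start = handoffStartMs.remove(partition);
        return start == null ? -1L : nowMs - start;
    }

    // Proposed fix: forget all pending start times when leadership changes.
    void onLeadershipChange() {
        handoffStartMs.clear();
    }

    public static void main(String[] args) {
        // Without the fix: the interrupted handoff at t=1000 leaves a stale entry.
        HandoffDurationCache stale = new HandoffDurationCache();
        stale.recordHandoffStart("p0", 1_000L);      // interrupted, never completed
        stale.recordHandoffStart("p0", 5_000_000L);  // later handoff on the same partition
        System.out.println(stale.recordHandoffEnd("p0", 5_000_200L)); // prints 4999200

        // With the fix: the cache is cleared when leadership changes.
        HandoffDurationCache fixed = new HandoffDurationCache();
        fixed.recordHandoffStart("p0", 1_000L);      // interrupted, never completed
        fixed.onLeadershipChange();
        fixed.recordHandoffStart("p0", 5_000_000L);
        System.out.println(fixed.recordHandoffEnd("p0", 5_000_200L)); // prints 200
    }
}
```

      In this sketch the real handoff takes 200 ms in both runs; only the run that clears the cache on leadership change reports that value.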

            People

              Assignee: Unassigned
              Reporter: Harry Zhang (hzzh0301)
              Votes: 0
              Watchers: 3