CASSANDRA-5746

HHOM.countPendingHints is a trap for the unwary

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 2.0 beta 2
    • Component/s: Core, Tools
    • Labels: None

      Description

      countPendingHints can OOM the server fairly easily since it does a per-target seq scan without paging.

      More generally, countPendingHints is far too slow to be useful for routine monitoring.

        Activity

        Jonathan Ellis added a comment -

        If we want it to be fast enough for monitoring use, we need to denormalize the count. So the question is, do we use a Counters table, a normal int table with manual lock-and-read-before-write, or just one-off it with AtomicInteger and sync to disk occasionally?

        Personally I'd be inclined towards the last; it's okay if we under- or over-count (because of periodic CL sync, for instance), as long as we reliably distinguish between zero and non-zero hints for a given target. We could sanity-check that with an approach like listEndpointsPendingHints on startup, which would be fairly low-overhead.
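A minimal sketch of the AtomicInteger-style option described above: an in-memory per-target count that may drift but reliably distinguishes zero from non-zero. Class and method names here are hypothetical, not Cassandra's:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: denormalized in-memory pending-hint counts.
// The count may drift (e.g. across restarts or CL sync), but zero vs.
// non-zero is what monitoring mainly needs; a startup scan along the
// lines of listEndpointsPendingHints could resync it.
class PendingHintCounts {
    private static final ConcurrentHashMap<String, AtomicLong> counts =
            new ConcurrentHashMap<>();

    static void hintStored(String endpoint) {
        counts.computeIfAbsent(endpoint, e -> new AtomicLong()).incrementAndGet();
    }

    static void hintDelivered(String endpoint) {
        AtomicLong c = counts.get(endpoint);
        if (c != null) c.decrementAndGet();
    }

    static long pendingFor(String endpoint) {
        AtomicLong c = counts.get(endpoint);
        return c == null ? 0 : Math.max(0, c.get());
    }
}
```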

        Tyler Hobbs added a comment -

        Doesn't the TTL on hints cells complicate all of those strategies? I can't think of a cheap way to schedule all of those future decrements to a counter.

        Instead, I suppose we could recount on demand if the time since the last count is greater than the smallest TTL we've seen, but without throttling of some sort, recounts would still happen frequently under some circumstances.

        Alternatively, it seems like a fair amount of work, but perhaps a get_range_counts() implementation with internal auto-paging (like get_count) is a decent option?
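The paging idea could look something like the following sketch: count a range by fetching a bounded page at a time, so the server never materializes the whole row at once. The page-fetch function is a stand-in for the real data-access layer (real paging would resume from the last seen key rather than an offset):

```java
import java.util.List;
import java.util.function.BiFunction;

// Sketch of counting with internal auto-paging, in the spirit of
// get_count: fetch at most pageSize items per round trip.
// fetchPage(offset, limit) is a hypothetical data-access hook.
class PagedCount {
    static long count(BiFunction<Integer, Integer, List<String>> fetchPage,
                      int pageSize) {
        long total = 0;
        int offset = 0;
        while (true) {
            List<String> page = fetchPage.apply(offset, pageSize);
            total += page.size();
            if (page.size() < pageSize) return total; // last (short) page
            offset += pageSize;
        }
    }
}
```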

        Jonathan Ellis added a comment -

        What if we redefine the problem to be "how many hints have I generated for node X?" and get rid of countPendingHints entirely?

        ISTM that's a more important indicator of cluster health than how many hints X expects to receive when it comes back up. I can live without that.

        Tyler Hobbs added a comment -

        > What if we redefine the problem to be "how many hints have I generated for node X?" and get rid of countPendingHints entirely?

        That seems like a reasonable replacement metric for most purposes, but I can see where countPendingHints() might still be useful. However, in its current state, it seems too dangerous to leave in. Maybe pull it out in this ticket and open a new ticket for a better implementation if there's interest?

        I'm assuming we don't want to persist this one to disk. (Most RRD-style metric systems cope with a normally monotonically increasing metric occasionally resetting to 0, so just counting hints created since the node came up should be fine.)

        Jonathan Ellis added a comment -

        Sounds good to me. Let's try to get this into b2 so we don't rip it out of a stable release.

        Tyler Hobbs added a comment -

        0001 removes countPendingHints().

        0002 adds a per-endpoint count through the new metrics system. I made HHOM.hintFor() non-static, as I couldn't see why the singleton couldn't be used. Let me know if there was some motivation for that.
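The replacement metric's shape, roughly: a monotonically increasing "hints created for endpoint X since startup" counter, bumped at hint-write time. This is an illustrative sketch only; the actual patch wires the counter through Cassandra's metrics system, and the names here are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of the per-endpoint "hints created" metric.
// Monotonically increasing while the node is up; resets to 0 on
// restart, which RRD-style monitoring systems tolerate.
class HintsCreatedMetric {
    private static final ConcurrentHashMap<String, LongAdder> created =
            new ConcurrentHashMap<>();

    static void onHintCreated(String endpoint) {
        created.computeIfAbsent(endpoint, e -> new LongAdder()).increment();
    }

    static long createdFor(String endpoint) {
        LongAdder c = created.get(endpoint);
        return c == null ? 0 : c.sum();
    }
}
```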

        Jonathan Ellis added a comment -

        LGTM, committed. (Tweaked to use LoadingCache.getUnchecked instead of manually throwing RTE ourselves, also in the existing incrPastWindow.)


          People

          • Assignee: Tyler Hobbs
          • Reporter: Jonathan Ellis
          • Reviewer: Jonathan Ellis
          • Votes: 0
          • Watchers: 1
