Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-4414

Ship the exact cause for timeout and unavailable exception back to the client

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.2.0 beta 1
    • Component/s: None
    • Labels:
      None

      Description

      Currently when a client gets a timeout exception, it doesn't know what happened. But sever side we usually know a little more than that. For example, for 99% of the timeouts at CL > ONE, we will have some replica that have acknowlege. I think it could be useful to send that information to the client with the exception. That could help with monitoring, "post-mortem" analysis and debugging.

      Unavailable exceptions could also ship which nodes were alive and which ones where not.

        Issue Links

          Activity

          Hide
          jeromatron Jeremy Hanna added a comment -

          Would be great to have more information in TimedOutException to the client. Since the hadoop support is a 'client', having more detail there is invaluable.

          Show
          jeromatron Jeremy Hanna added a comment - Would be great to have more information in TimedOutException to the client. Since the hadoop support is a 'client', having more detail there is invaluable.
          Hide
          jeromatron Jeremy Hanna added a comment -

          Any reason this has to be targeted at 1.2 rather than 1.1.x or even 1.0.x? Seems like it would be a pretty applicable patch.

          Show
          jeromatron Jeremy Hanna added a comment - Any reason this has to be targeted at 1.2 rather than 1.1.x or even 1.0.x? Seems like it would be a pretty applicable patch.
          Hide
          brandon.williams Brandon Williams added a comment -

          The thing is, if we keep the exception returning the same type you'd have to do string parsing on it (which I've done a lot of with IRE, and agree it sucks.) If we change the return type, checked exceptions screw java clients.

          Show
          brandon.williams Brandon Williams added a comment - The thing is, if we keep the exception returning the same type you'd have to do string parsing on it (which I've done a lot of with IRE, and agree it sucks.) If we change the return type, checked exceptions screw java clients.
          Hide
          slebresne Sylvain Lebresne added a comment -

          I haven't checked but maybe thrift will gentle enough to allow adding optional fields to the exceptions, which would be my preferred option. But yes, that is one of the reason why I've targeted to 1.2, but don't take it at face value, it's not definitive.

          Show
          slebresne Sylvain Lebresne added a comment - I haven't checked but maybe thrift will gentle enough to allow adding optional fields to the exceptions, which would be my preferred option. But yes, that is one of the reason why I've targeted to 1.2, but don't take it at face value, it's not definitive.
          Hide
          jbellis Jonathan Ellis added a comment -

          I do think we should break this apart into a separate exception; string parsing sucks, and checking exception subfields sucks only a little less. (I'm also thinking that one exception = one error code for CASSANDRA-3979.)

          For CASSANDRA-4285 I proposed InProgressException but I'm not a big fan of that. IncompleteException may be better... but one of the things I'd like to avoid is the perception that this is a "failure," and InProgress is better in that respect than Incomplete.

          Show
          jbellis Jonathan Ellis added a comment - I do think we should break this apart into a separate exception; string parsing sucks, and checking exception subfields sucks only a little less. (I'm also thinking that one exception = one error code for CASSANDRA-3979 .) For CASSANDRA-4285 I proposed InProgressException but I'm not a big fan of that. IncompleteException may be better... but one of the things I'd like to avoid is the perception that this is a "failure," and InProgress is better in that respect than Incomplete.
          Hide
          jbellis Jonathan Ellis added a comment -

          I do think we should break this apart into a separate exception

          I still would like to do this in an ideal world but I think our hands might be tied by the client community... unless we're willing to break everyone who doesn't add IPE, the optional field is a better approach.

          Show
          jbellis Jonathan Ellis added a comment - I do think we should break this apart into a separate exception I still would like to do this in an ideal world but I think our hands might be tied by the client community... unless we're willing to break everyone who doesn't add IPE, the optional field is a better approach.
          Hide
          jbellis Jonathan Ellis added a comment -
          Show
          jbellis Jonathan Ellis added a comment - patchset at https://github.com/jbellis/cassandra/tree/4414
          Hide
          slebresne Sylvain Lebresne added a comment -

          The patch doesn't compile due to missing TimeoutException -> TimedOutException renaming in HintedHandoffManager.

          Otherwise the patch lgtm.

          nit: Maybe in WriteResponseHandler, we could keep the blockFor value as a field instead of the table, which would avoid having to care about null checks

          Show
          slebresne Sylvain Lebresne added a comment - The patch doesn't compile due to missing TimeoutException -> TimedOutException renaming in HintedHandoffManager. Otherwise the patch lgtm. nit: Maybe in WriteResponseHandler, we could keep the blockFor value as a field instead of the table, which would avoid having to care about null checks
          Hide
          jbellis Jonathan Ellis added a comment -

          fixed compile + nit, committed

          Show
          jbellis Jonathan Ellis added a comment - fixed compile + nit, committed

            People

            • Assignee:
              jbellis Jonathan Ellis
              Reporter:
              slebresne Sylvain Lebresne
              Reviewer:
              Sylvain Lebresne
            • Votes:
              2 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development