43229 – mod_jk 1.2.25 different fallback behavior on reply_timeout happens between 1.2.23 and 1.2.25


Status:	RESOLVED FIXED

Alias:	None

Product:	Tomcat Connectors
Classification:	Unclassified
Component:	Common
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 major
Target Milestone:	---
Assignee:	Tomcat Developers Mailing List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2007-08-28 05:51 UTC by Motonobu Ichimura
Modified:	2008-10-05 03:10 UTC
CC List:	0 users

Attachments
proposal patch. (504 bytes, patch) 2007-08-28 05:52 UTC, Motonobu Ichimura

Description Motonobu Ichimura 2007-08-28 05:51:36 UTC

mod_jk 1.2.25 introduces JK_REPLY_TIMEOUT as a new error status.

In 1.2.25, when a reply_timeout occurs while waiting for a reply from Tomcat,
JK_REPLY_TIMEOUT is returned with HTTP error code JK_HTTP_GATEWAY_TIME_OUT (504)
at ajp_service@jk_ajp_common.c (lines 2038, 2107).

Currently, lb_worker performs a fallback operation (sending the request to
another worker) only when the HTTP error code is JK_HTTP_SERVER_BUSY
(see service@jk_lb_worker.c, line 1101).

So when JK_HTTP_GATEWAY_TIME_OUT is returned with JK_REPLY_TIMEOUT status, the
value of rc is always JK_FALSE, and lb_worker doesn't try the next worker.

1.2.23 can fall back to the next worker because JK_HTTP_SERVER_BUSY is always
returned when reply_timeout occurs.
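
The difference can be modelled with a small sketch. This is not the code from jk_lb_worker.c: the sketch_ names, the member callbacks and the simplified loop are all hypothetical, and only the rule "fall back only on 503" is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real constants live in the mod_jk headers. */
#define SKETCH_TRUE  1
#define SKETCH_FALSE 0
#define SKETCH_HTTP_SERVER_BUSY      503
#define SKETCH_HTTP_GATEWAY_TIME_OUT 504

/* Hypothetical member callback: returns SKETCH_TRUE on success, otherwise
 * SKETCH_FALSE with an HTTP status stored in *is_error. */
typedef int (*sketch_service_fn)(int *is_error);

/* Try the members in order. Fall back to the next member only when the
 * failed one reported 503. Returns the index of the member that served the
 * request, or -1 if the request failed. */
int sketch_lb_service(const sketch_service_fn *members, size_t count,
                      int *is_error)
{
    size_t i;

    assert(members != NULL && is_error != NULL);
    for (i = 0; i < count; i++) {
        *is_error = 0;
        if (members[i](is_error) == SKETCH_TRUE)
            return (int)i;
        if (*is_error != SKETCH_HTTP_SERVER_BUSY)
            return -1; /* e.g. 504: the next member is never tried */
    }
    return -1; /* every member reported 503 */
}

/* Sample members for trying the loop out. */
int sketch_member_ok(int *is_error)
{
    *is_error = 0;
    return SKETCH_TRUE;
}

int sketch_member_busy(int *is_error)
{
    *is_error = SKETCH_HTTP_SERVER_BUSY;
    return SKETCH_FALSE;
}

int sketch_member_timeout(int *is_error)
{
    *is_error = SKETCH_HTTP_GATEWAY_TIME_OUT;
    return SKETCH_FALSE;
}
```

With the members { busy, ok } the second member serves the request, while with { timeout, ok } the loop stops at the first member and the client gets the 504.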

My proposal is this:

when reply_timeout occurs and op->recoverable is set, return
JK_HTTP_SERVER_BUSY as the HTTP error code with status JK_REPLY_TIMEOUT,
instead of JK_HTTP_GATEWAY_TIME_OUT, so that lb_worker can handle the
fallback.
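
As a rough illustration of the idea (this is not the attached patch, and the sketch_ names, the struct and the placeholder status value are hypothetical), the mapping could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real constants live in the mod_jk headers. */
#define SKETCH_HTTP_SERVER_BUSY      503
#define SKETCH_HTTP_GATEWAY_TIME_OUT 504
#define SKETCH_REPLY_TIMEOUT         (-1) /* placeholder, not the mod_jk value */

/* Hypothetical reduction of the per-request operation state to the one
 * field this report refers to as op->recoverable. */
typedef struct {
    int recoverable; /* non-zero if the request may go to another worker */
} sketch_op_t;

/* Called when a reply timeout has been detected. Stores the HTTP status
 * for the caller in *is_error and returns the reply timeout status. */
int sketch_on_reply_timeout(const sketch_op_t *op, int *is_error)
{
    assert(op != NULL && is_error != NULL);
    *is_error = op->recoverable ? SKETCH_HTTP_SERVER_BUSY
                                : SKETCH_HTTP_GATEWAY_TIME_OUT;
    return SKETCH_REPLY_TIMEOUT;
}
```

A recoverable request then reports 503, which lb_worker already treats as a reason to try another worker, and a non-recoverable one keeps reporting 504.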

Thanks in advance.

Comment 1 Motonobu Ichimura 2007-08-28 05:52:32 UTC

Created attachment 20721
proposal patch.

Comment 2 Rainer Jung 2007-09-02 16:08:22 UTC

Thank you for analyzing this problem. Yes, reply timeouts should allow
retries/failover, at least unless recovery_options disables them.

The interface between the service() method of an lb member and the lb itself
consists of the service() return code and the additional is_error, which is
meant to indicate the HTTP return code.

The lb needs to decide whether it should do a failover and whether the member
needs to be put into error state. The interface is not really rich enough to
help with these decisions.

Either we end up using more fine-grained return codes from service(), or we
add recoverability (= failover) and member error info as side effects, in
addition to is_error.
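
To make the second option a bit more concrete, here is a minimal sketch. Nothing in it is existing mod_jk code: the struct, the enum and the function are hypothetical names that only show how the two extra pieces of information could be carried next to the HTTP status.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side-effect info filled in by a member's service(). */
typedef struct {
    int http_status;  /* what is_error carries today */
    int recoverable;  /* non-zero: the lb may fail over to another member */
    int member_error; /* non-zero: the member should go into error state */
} sketch_service_info_t;

/* Hypothetical decision made by the lb after one service() call. */
typedef enum {
    SKETCH_LB_DONE,     /* request served, nothing more to do */
    SKETCH_LB_FAILOVER, /* try the next member */
    SKETCH_LB_GIVE_UP   /* report info->http_status to the client */
} sketch_lb_action_t;

/* Decide failover and member state independently of the HTTP status.
 * service_ok is non-zero if the member's service() call succeeded. */
sketch_lb_action_t sketch_lb_decide(int service_ok,
                                    const sketch_service_info_t *info,
                                    int *mark_member_error)
{
    assert(info != NULL && mark_member_error != NULL);
    *mark_member_error = !service_ok && info->member_error;
    if (service_ok)
        return SKETCH_LB_DONE;
    return info->recoverable ? SKETCH_LB_FAILOVER : SKETCH_LB_GIVE_UP;
}
```

With such a split, a failed call could keep its 504 status and still be marked recoverable, so the failover decision would no longer depend on which HTTP code was chosen.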

I'm actively investigating this. As a first step, I added some code comments
documenting which return codes to expect from the service() methods.

Please stay tuned.