After the fix for TS-1962, there are scenarios in which Traffic Server reuses a kept-alive origin server connection even though the destination port is different. I encountered this with 4.0.2. The bug is still present in trunk, but there it is masked by another fix.
When Traffic Server 4.0.2 is used as a transparent proxy, two requests made to the same origin server but on different ports may both be sent to the same port.
All GETs work as expected:
- GET to origin:80 arrives at origin:80
- GET to origin:80, followed by GET to origin:9999 arrive at origin:80 and
- GET to origin:9999, followed by GET to origin:80 arrive at origin:9999 and
But stuff gets weird after a POST:
- POST to origin:9999, followed by GET to origin:80, arrive at origin:9999 and
This results in requests being sent to incorrect origin server ports, which may result in the HTML app getting unexpected errors or unexpected data. A workaround is to disable keep-alive, but that is of course bad for performance.
It looks like the problem is in proxy/http/HttpSM.cc, in function HttpSM::do_http_server_open():
When the condition in line 4540 is hit, and get_server_session() returns an existing session, the addresses of the existing session and the current one are compared (line 4545), but not the port numbers, as the comment in line 4544 indicates. This is incorrect, as an existing session to a different port on the same origin server should never be reused.
The most obvious fix is to add a comparison of the ports, e.g.:
I have added this locally as a patch, and with that, the scenario described at the top of this bug does not occur anymore.
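To illustrate the difference between the two checks outside of Traffic Server, here is a minimal standalone sketch that uses only POSIX socket types. The names `make_endpoint`, `same_address_only`, and `same_endpoint` are hypothetical and exist only for this illustration. This is not the code from HttpSM.cc and not the actual patch, and it assumes IPv4 only.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdint>

// Hypothetical helper: build an IPv4 endpoint from a dotted-quad string and a port.
static sockaddr_in make_endpoint(const char *ip, uint16_t port)
{
  sockaddr_in sa{};                      // zero-initialize all fields
  sa.sin_family = AF_INET;
  sa.sin_port   = htons(port);           // ports are stored in network byte order
  inet_pton(AF_INET, ip, &sa.sin_addr);  // parse the textual address
  return sa;
}

// Address-only comparison: the kind of check described above as incorrect,
// because it treats origin:80 and origin:9999 as the same destination.
static bool same_address_only(const sockaddr_in &a, const sockaddr_in &b)
{
  return a.sin_addr.s_addr == b.sin_addr.s_addr;
}

// Address-and-port comparison: a kept-alive session would be considered
// reusable only when both the IP address and the port match.
static bool same_endpoint(const sockaddr_in &a, const sockaddr_in &b)
{
  return same_address_only(a, b) && a.sin_port == b.sin_port;
}
```

With an existing session to 192.0.2.10:9999 and a new request for 192.0.2.10:80, `same_address_only` reports a match (which would lead to the reuse seen in the POST scenario), while `same_endpoint` does not.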
Note that the issue in this particular scenario seems to be masked by the fix for TS-312. Initially, I tried reproducing with trunk, but I found it only occurs before that fix. After TS-312, the condition in line 4540 above is not hit anymore, but the first one in line 4510 is hit instead. However, there may still be ways to make the execution flow branch to line 4545, and that could still trigger the bug.