Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
The error message printed when a netty-client cannot connect to another worker is worded in a way that our users interpret as a failure in Storm itself.
At times, such as at topology launch, these messages are normal, because not all of the workers have been launched on all of the supervisors yet.
At other times the message does indicate a failure (uncaught exception, OOM) on another worker, but because of its wording the end user believes that the client that logged it is the one failing.
For example:
2015-12-03 12:28:53.338 b.s.m.n.Client [ERROR] connection attempt 10 to Netty-Client-host1.grid.myco.com/10.1.2.3:6710 failed: java.net.ConnectException: Connection refused: host1.grid.myco.com/10.1.2.3:6710
We should change the message so that it tells end users more clearly what happened, and log it at WARN rather than ERROR, since there are occasions when one would expect to see it.
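
As a rough illustration of the proposal, the sketch below builds a reworded message and prints it with a WARN prefix. It is not the actual b.s.m.n.Client code: the class name `ConnectionWarningSketch`, the helper `describeFailure`, and the message wording are all hypothetical, and a real change would go through the client's existing logger rather than `System.out`.

```java
import java.net.ConnectException;

public class ConnectionWarningSketch {

    // Hypothetical helper (not part of Storm): builds a message that makes
    // clear the problem is reaching the remote worker, not this worker.
    public static String describeFailure(int attempt, String remote, Throwable cause) {
        return String.format(
            "connection attempt %d to remote worker %s failed (%s). "
            + "The remote worker may not have started yet or may have died; "
            + "this does not by itself mean that this worker has failed.",
            attempt, remote, cause);
    }

    public static void main(String[] args) {
        // Values taken from the example log line in this issue.
        Throwable cause = new ConnectException(
            "Connection refused: host1.grid.myco.com/10.1.2.3:6710");
        // Stand-in for a WARN-level log call; assumed, not the real logger.
        System.out.println("[WARN] "
            + describeFailure(10, "host1.grid.myco.com/10.1.2.3:6710", cause));
    }
}
```

The point of the wording is to name the remote worker as the likely source of the problem, and the lower level reflects that the condition is expected during topology launch.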