I am seeing the following exception in the Knox gateway logs:
I added more debug logging to Knox to analyze the issue and observed the following sequence:
- Knox generates the proper URL for the backend websocket connection.
- The websocket upgrade succeeds at the HTTP protocol layer. Knox starts setting up its internal session management data structures for this connection.
- A data frame arrives on the connection from the CDAP backend and Knox starts processing it in a separate thread, but the internal data structures have not yet been fully established:
- As a result of this exception, Knox starts closing this connection as well as the connection with the frontend.
- The internal data structures are now set up in the other thread, but by then the connection is already being closed.
- The frontend UI sees the connection closed and sends a new connection request.
It looks like there is a race condition (improper synchronization) between two Knox threads: one thread opens the websocket connection and sets up the connection session, while the other thread processes a message/data packet sent by the UI service backend before that setup has finished.
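
To illustrate the kind of synchronization that seems to be missing, here is a minimal sketch in plain Java. It is not Knox code: the class `SessionReadyGate` and the methods `completeSetup` and `onBackendFrame` are hypothetical names made up for this example, and the session is reduced to a string. The idea is simply that the frame-processing thread waits on a latch until the connection-opening thread signals that session setup is done, instead of touching half-initialized state.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a proxied websocket connection; not an actual Knox class. */
public class SessionReadyGate {
    // Opened exactly once, when session setup has finished.
    private final CountDownLatch sessionReady = new CountDownLatch(1);
    // Stays null until setup finishes; stands in for the internal session data structures.
    private volatile String session;

    /** Called by the thread that opened the connection, once setup is complete. */
    public void completeSetup(String sessionId) {
        session = sessionId;
        sessionReady.countDown();
    }

    /** Called by the thread that receives a data frame from the backend. */
    public String onBackendFrame(String frame, long timeoutMillis) throws InterruptedException {
        // Block until setup is done instead of failing on missing state.
        if (!sessionReady.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("session not established within timeout");
        }
        return session + ":" + frame;
    }

    /** Reproduces the ordering from the report: the frame arrives before setup finishes. */
    public static String demo() throws Exception {
        SessionReadyGate conn = new SessionReadyGate();
        Thread setup = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate slow session setup
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            conn.completeSetup("session-1");
        });
        setup.start();
        // The frame is handled immediately, while setup is still in progress.
        String result = conn.onBackendFrame("hello", 5000);
        setup.join();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In this sketch the early frame is delayed rather than rejected, so the connection stays open. Whether Knox should wait like this, buffer early frames, or finish session setup before it starts reading from the backend is a design decision for the Knox code itself; the sketch only shows the ordering guarantee that appears to be absent.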