Knox should support load balancing multiple clients across multiple backend service instances.
The HA mechanism in Knox today only allows failover. What this means is that for every client that connects to Knox, the same backend is chosen even if there are multiple. This single backend will be used until it fails and then Knox will chose another backend sending all clients to this new backend.
The above design is simple, but has a few limitations:
- Even if the number of backends scale - there is no way for Knox to utilize this
- If any client causes a backend error - it causes Knox to failover all clients to a new backend
There are 2 types of backend services - stateful and stateless. Stateful backend services (like HiveServer2) require that once a client initiates a session that client is always sent to that same backend. Stateless backend services (like HBase REST) don't require clients to always be sent to the same backend. When stateful backend services are being used, load balancing can only be done across multiple clients.
There are two options to do stateful load balancing:
- Maintain the state at the server level and match clients to state
- Maintain the state at the client level and use that on the server side
Maintaining the state on the client side is more beneficial since then it can be used across multiple Knox instances even in the case of Knox failover.
One option to maintaining client state, that is described in the original design doc here: KNOX-843.pdf, is to store state in a cookie that gets sent back on the response to the client. The general flow is as follows:
- clientA makes a request no KNOX_BACKEND cookie
- Knox checks if HA is enabled and dispatches request to a random backend
- On response to clientA - sets KNOX_BACKEND cookie to hash of server who serviced the response
- On second request clientA sends back the KNOX_BACKEND cookie
- Knox looks at request and dispatches request to backend that matches hash of KNOX_BACKEND cookie
- if backend is down, Knox updates KNOX_BACKEND cookie with server that handled request on response
This approach requires that clients support cookies and many do since this is how many load balancers typically work today .
Some solution benefits include:
- this approach is used by load balancers like haproxy  and Traefik 
- Multiple clients can effectively use all the available backend services.
- A single backend going down won't affect all clients using Knox. Only affects clients using that single backend.
- Client side cookie could be used against different Knox servers. This allows for Knox restarts where clients bounce to another Knox server and maintain the backend session state.
Some solution limitations include:
- clients must support cookies - this should be the case today since many LB require cookies for sticky sessions. So if a user is concerned with HA, there is most likely a sticky session LB in front of Knox today.
The cookie being sent back to the client needs to hide backend service information so internal host details are not leaked. One option is to hash the backend before placing in the cookie.
This shouldn't affect failover since failover should be internal to Knox. The KNOX_BACKEND cookie should be updated on the response to the client with the server that handled the request in the case of a failover. The state changes should be on the client side response from Knox. There are some edge cases here were failover will need to be decided potentially per client instead of for the entire backend, but that shouldn't cause a big issue here.