Description
Repro step:
Use gremlinpython 3.4.0 to issue 'g.V()' (or any gremlin query) on a sufficiently large graph, so that the response is large enough that the server needs to send multiple partial responses (status code: 206).
The 'g.V()' query needs to be the first query issued immediately after establishing the connection.
Behavior:
The query returns only the first partial response from the gremlin server to the user, even though the gremlinpython client ends up reading all the data from the gremlin server.
Why Critical:
- This is a correctness issue from the end-user point of view.
- The client is only getting partial data while paying the cost of running the entire query on the gremlin server.
Diagnosis:
There was a recent change in gremlinpython to stop using a recursive call to read from a streaming response. The change made the caller (the _receive(self) method in connection.py) use a while loop to read the streaming response.
While this change is fine, on the first request to a WebSocket connection, after the authentication is done, we are still making a recursive call to read data from the response stream. Ideally, after the authentication is done, we should return control to the def _receive(self) method so that it can do the read.
Mixing the recursive call with the while loop causes the following behavior:
1. If the first response immediately after the authentication is a streaming response, only the first chunk of the result is returned to the caller.
Fix:
Add a 'return' before the recursive call to read data after auth (see attachment below).
--> This means that the caller can now use the while loop to read the streaming response.
--> Otherwise the caller gets back a status code of 'None' and concludes that the response is not partial.
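The effect of the missing 'return' can be illustrated with a small standalone simulation. This is only a sketch and not the gremlinpython source: the names handle_message and receive are hypothetical, the server is replaced by an in-memory queue of (status code, data) pairs, and the codes 407 (authentication challenge) and 200 (final response) are assumptions added for illustration. Only 206 (partial response) comes from the description above.

```python
from collections import deque

AUTH_CHALLENGE = 407  # assumed code for an authentication challenge
PARTIAL = 206         # partial (streaming) response, as described above
OK = 200              # assumed code for the final response


def handle_message(stream, results, fixed):
    """Process one queued message and return its status code (hypothetical helper)."""
    status, data = stream.popleft()
    if status == AUTH_CHALLENGE:
        # A real client would send its credentials here, then read the next message.
        if fixed:
            # With the fix: propagate the status code of the next message.
            return handle_message(stream, results, fixed)
        # Without the fix: the recursive result is dropped, so None is returned.
        handle_message(stream, results, fixed)
        return None
    results.append(data)
    return status


def receive(messages, fixed):
    """Stand-in for the caller's while loop: keep reading while responses are partial."""
    stream = deque(messages)
    results = []
    while True:
        status = handle_message(stream, results, fixed)
        if status != PARTIAL:
            break
    return results


# First request on a new connection: auth challenge, then a three-part response.
messages = [
    (AUTH_CHALLENGE, None),
    (PARTIAL, ["v1", "v2"]),
    (PARTIAL, ["v3"]),
    (OK, ["v4"]),
]

buggy_results = receive(messages, fixed=False)
fixed_results = receive(messages, fixed=True)
print(buggy_results)
print(fixed_results)
```

Without the 'return', the loop sees 'None' after the first chunk and stops, so only the first chunk is collected. With the 'return', the 206 status reaches the loop and the remaining chunks are read.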