Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Impala 2.0
- None
Description
Even though the client API is intended to be non-blocking for statement execution requests, Impala still blocks for some part (or parts) of query execution. The client cannot know how long a request will block, which makes setting a good timeout value very difficult.
Clients now have these two options:
1) Set a long enough timeout to allow for execution. The value could be five minutes or more. But if a timeout occurs, the user needs to investigate whether the query is still executing. Also, a real network timeout means a very long wait before the request can be retried.
2) Don't set a timeout. Most of the time this works well, but when a real networking problem happens, the client will hang forever.
https://github.com/cloudera/impyla/issues/7 shows an example of such problems. Impyla ended up going with option #2.
I've found #2 to be a big burden. To avoid hanging, any request to the server needs to be done in a separate thread which is then monitored and timed out. Option #1 is an easier way of avoiding hangs.
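The thread-and-monitor workaround described for option #2 can be sketched roughly as follows. This is a minimal illustration, not code from impyla or Impala: `call_with_deadline` and `fake_execute` are hypothetical names, and `fake_execute` merely stands in for a blocking statement-execution RPC.

```python
import queue
import threading
import time


def call_with_deadline(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread and wait at most timeout_s seconds."""
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put((True, fn(*args, **kwargs)))
        except Exception as exc:
            # Hand the failure back to the calling thread.
            result.put((False, exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        ok, value = result.get(timeout=timeout_s)
    except queue.Empty:
        # The worker is still blocked; it is abandoned here, not cancelled.
        raise TimeoutError(f"no response within {timeout_s} s") from None
    if not ok:
        raise value
    return value


def fake_execute(delay_s):
    """Hypothetical stand-in for a blocking statement-execution request."""
    time.sleep(delay_s)
    return "handle"
```

A call such as `call_with_deadline(fake_execute, 0.05, 0.5)` raises `TimeoutError`, while the worker thread keeps running in the background. The same caveat as in option #1 applies: after a timeout, the caller still does not know whether the query is executing on the server.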
Issue Links
- blocks
  - IMPALA-718 ExecuteStatement (HS2) and query (beeswax) are supposed to be non-blocking RPC but for DDL, they block (Open)
  - IMPALA-915 Ability to cancel queries while in the FE. (In Progress)
- is related to
  - IMPALA-7555 impala-shell can hang in connect in certain cases (Resolved)
  - IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever. (Resolved)
- relates to
  - IMPALA-7312 Non-blocking mode for Fetch() RPC (Resolved)
  - IMPALA-9239 Add timeout for query analysis/planning (Open)
1. Return a profile for queries during planning | Open | Unassigned
2. ExecuteStatement (HS2) and query (beeswax) are supposed to be non-blocking RPC but for DDL, they block | Open | Unassigned
3. Ability to cancel queries while in the FE. | In Progress | Michael Smith
4. Analysis and planning should happen asynchronously | Open | Michael Smith