IMPALA-4488

Impala session times out prematurely - even if there is a running query

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend, Clients
    • Labels:
      None

      Description

      Setting --idle_session_timeout to a low value (for example, 60 seconds) may cause running queries to be cancelled with "Session expired due to inactivity".

      Currently we are experiencing this only with 3rd party clients (through the Cloudera Impala ODBC connector); impala-shell and the Hue Impala query editor work well.

      On the ODBC client side the error message is

      Oct 20 13:43:41 ERROR 3933976352 Statement::SQLExecDirectW: [Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : Unknown error and could not get runtime log: Client session expired due to more than 60s of inactivity (last activity was at: 2016-10-20 13:42:33). 
      

      Based on the Impala documentation, running queries should not be killed because of the "idle session timeout":

      The --idle_session_timeout option specifies the time in seconds after which an idle session is expired. A session is idle when no activity is occurring for any of the queries in that session, and the session has not started any new queries.

      To reproduce:

      1. Set --idle_session_timeout=60
      2. Set up Cloudera Impala ODBC driver
      3. Through a 3rd party ODBC client, run a large query that lasts longer than the session timeout

      Attachments

      1. profile.txt
        51 kB
        Jim Apple
      2. profile.txt
        51 kB
        Miklos Szurap

        Activity

        henryr Henry Robinson added a comment -

        Can you post the profile from a query that suffers this issue?

        mszurap_impala_de3d Miklos Szurap added a comment -

        Yes, I have attached it.

        henryr Henry Robinson added a comment -

        Thanks! Is it always INSERT statements that fail? That could be the bug.

        mszurap_impala_de3d Miklos Szurap added a comment -

        Tried with select as well and it's the same. Should I upload the query profile for that too?

        henryr Henry Robinson added a comment -

        Yes please.

        henryr Henry Robinson added a comment -

        My guess is a problem with the HS2 implementation that maybe misses taking a session reference. Thanks for the report, we should definitely fix this soon.

        henryr Henry Robinson added a comment -

        The bug is a missing 'keepalive' for the session in the GetOperationStatus() request. The reason this affects *DBC drivers is that they use GetOperationStatus() to decide whether to fetch, rather than sitting on a blocking FetchResults() call like (I think) Hue does.
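
The polling behaviour described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Impala's actual code: the `Session` class, `poll_until_done`, and the `keepalive_on_status` flag are hypothetical names, and time is modelled as plain numbers in seconds. It shows why a client that only polls a status RPC loses its session when that RPC does not reset the idle timer.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Toy model of a server-side session with an idle timeout (seconds)."""
    idle_timeout: float
    last_activity: float = 0.0
    expired: bool = False

    def touch(self, now: float) -> None:
        # The "keepalive": record that the client was active at `now`.
        self.last_activity = now

    def check_expiry(self, now: float) -> None:
        # Background check: expire if idle for longer than the timeout.
        if now - self.last_activity > self.idle_timeout:
            self.expired = True


def poll_until_done(session, query_seconds, poll_interval, keepalive_on_status):
    """Simulate a driver polling a status RPC until the query finishes.

    Returns True if the session survived, False if it expired mid-query.
    """
    now = 0.0
    session.touch(now)  # submitting the query counts as activity
    while now < query_seconds:
        now += poll_interval
        session.check_expiry(now)
        if session.expired:
            return False
        if keepalive_on_status:
            session.touch(now)  # the status RPC resets the idle timer
    return True


# A 120 s query, polled every 5 s, against a 60 s idle timeout.
buggy = poll_until_done(Session(60), 120, 5, keepalive_on_status=False)
fixed = poll_until_done(Session(60), 120, 5, keepalive_on_status=True)
```

In this model `buggy` ends up `False` (the session expires while the query is still running) and `fixed` ends up `True`, which mirrors the difference between the reported behaviour and the intended one.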

        henryr Henry Robinson added a comment -

        Fixed in https://github.com/apache/incubator-impala/commit/e4fc5bd5c515a73358cdc83e7eed5cd128262380

        mala_ck Mala Chikka Kempanna added a comment -

        I have two questions related to the above change:

        1. When does the idle session timeout start ticking after this change?

        2. Suppose a user launches a query from an ODBC BI tool, the query result is, say, 10,000 rows, and the client fetches only the first 1000 rows and leaves the rest unfetched.
        In this case, will the server continue to see the query as running forever with the keepalive in place, or will idle_query_timeout kick in and expire the query and its result set?

        Please document how this new keepalive change affects the idle_session_timeout and idle_query_timeout settings.

        henryr Henry Robinson added a comment -

        Mala Chikka Kempanna:

        1. The idle session timeout starts the same as it did before: at the moment that a client operation finishes. The change here is that the GetOperationStatus() RPC should have reset the timer, but did not.

        2. Typically the client will not send any RPCs to the server between fetches. So the session timeout could be hit after the first 1000 rows have been fetched. So could the query timeout.

        What would you like to be documented? This issue fixed a bug, but didn't change the basic behaviour.
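
The two answers above can be sketched as a small model in which the idle timer only runs while no client operation is in flight and restarts when the last one finishes. This is an illustration under assumptions, not the server's implementation: `RefCountedSession` and its methods are hypothetical names, and timestamps are plain numbers in seconds.

```python
class RefCountedSession:
    """Toy session: idle only while no RPC is in flight."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.in_flight = 0      # number of RPCs currently being served
        self.idle_since = 0.0   # time at which the last RPC finished

    def begin_rpc(self):
        # An RPC takes a reference on the session while it runs.
        self.in_flight += 1

    def end_rpc(self, now):
        # Releasing the last reference starts the idle timer.
        self.in_flight -= 1
        if self.in_flight == 0:
            self.idle_since = now

    def is_expired(self, now):
        # A session with an RPC in flight is never considered idle.
        return self.in_flight == 0 and now - self.idle_since > self.idle_timeout


session = RefCountedSession(idle_timeout=60)

# The client fetches the first batch of rows between t=10 and t=12,
# then sends nothing further.
session.begin_rpc()
during_fetch = session.is_expired(now=11)
session.end_rpc(now=12)

still_alive = session.is_expired(now=70)   # 58 s after the fetch finished
timed_out = session.is_expired(now=73)     # 61 s after the fetch finished
```

Here `during_fetch` and `still_alive` are `False` and `timed_out` is `True`: a client that fetches part of a result set and then goes quiet is subject to the timeout, measured from the end of its last operation.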

        mala_ck Mala Chikka Kempanna added a comment -

        Thanks Henry for clarifying.
        I just wanted to get confirmation that the behavior of the idle-session and idle-query timeouts has not changed.


          People

          • Assignee:
            henryr Henry Robinson
            Reporter:
            mszurap_impala_de3d Miklos Szurap
          • Votes:
            0
            Watchers:
            8