Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-10811

RPC to submit query getting stuck for AWS NLB forever.

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • Impala 4.1.0
    • None
    • None

    Description

      Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing  Catalog Operations like Rename, Alter Table Recover partition  that can take time on tables with many partitions(https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92). Attached is the profile of one such DDL query (with few fields hidden).

      These RPCs are: 

      1. Beeswax:

      https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57

      2. HS2:

      https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462

       

      One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout. Hence the impala-shell waiting for RPC to return gets stuck indefinitely.

      Hence, we may need to evaluate techniques for RPCs to return query handle after

      1. Creating Driver: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150
      2. Register Query: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168

       and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results.

       

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            sql_forever Qifan Chen
            amargoor Amogh Margoor
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment