IMPALA-3575

Impala should retry backend connection request and apply a send timeout

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2
    • Fix Version/s: Impala 2.7.0
    • Component/s: Distributed Exec
    • Labels: None

      Description

      Impala doesn't retry backend Thrift connection requests. If for any reason (the other end is too busy, a high number of concurrent requests, a flaky network) it cannot open a connection, it returns a failure and the query fails.

      It would be nice to add a configurable retry count so that Impala retries the connection, increasing the chance of success in the situations mentioned above.

      There is also no timeout for sending and receiving data, so Impala will wait forever if send or recv doesn't return. That is a reasonable choice for recv, because we rely on back pressure to slow down the sender when the receiver cannot keep up, but it doesn't make sense for send: send can hang indefinitely. A relatively large send timeout would help Impala detect connection issues and fail the query as soon as possible.

      ClientCache already supports both retry and timeout parameters when creating a connection, so it is easy to add configuration options for retry and timeout.
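
      As a rough illustration of the mechanism only (this sketch calls the Apache Thrift C++ TSocket API directly; the helper name, retry count, and backoff are made up and are not Impala's actual ClientCache code), a retrying connect with send/recv timeouts could look like:

      #include <chrono>
      #include <memory>
      #include <string>
      #include <thread>
      #include <thrift/transport/TSocket.h>
      #include <thrift/transport/TTransportException.h>

      using apache::thrift::transport::TSocket;
      using apache::thrift::transport::TTransportException;

      // Hypothetical helper: open a Thrift socket with a connect timeout and a
      // bounded number of retries, then apply send/recv timeouts for later RPCs.
      std::shared_ptr<TSocket> OpenWithRetry(const std::string& host, int port,
                                             int num_retries, int conn_timeout_ms,
                                             int send_timeout_ms, int recv_timeout_ms) {
        for (int attempt = 0; attempt <= num_retries; ++attempt) {
          auto sock = std::make_shared<TSocket>(host, port);
          sock->setConnTimeout(conn_timeout_ms);  // don't block forever in open()
          sock->setSendTimeout(send_timeout_ms);  // fail fast if the peer stops reading
          sock->setRecvTimeout(recv_timeout_ms);  // 0 = wait forever (back pressure)
          try {
            sock->open();
            return sock;
          } catch (const TTransportException&) {
            if (attempt == num_retries) throw;    // out of retries, surface the error
            std::this_thread::sleep_for(std::chrono::milliseconds(100));  // brief backoff
          }
        }
        return nullptr;  // not reached
      }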


          Activity

          jyu@cloudera.com Juan Yu added a comment -

          IMPALA-3575: Add retry to backend connection request and rpc timeout
          This patch adds a configurable timeout for all backend client
          RPCs to avoid query hang issues.

          Prior to this change, Impala didn't set a socket send/recv timeout for
          the backend client, so an RPC would wait forever for data. In extreme
          cases, such as a bad network or a kernel panic on the destination host,
          the sender never gets a response and the RPC hangs. A query hang is hard
          to detect, and if the hang happens in ExecRemoteFragment() or
          CancelPlanFragments(), the query cannot be cancelled unless you restart
          the coordinator.

          Added a send/recv timeout to all RPCs to avoid query hangs. For the
          catalog client, the default timeout stays at 0 (no timeout) because
          ExecDdl() can take a very long time on a table with many partitions,
          mostly waiting on HMS API calls.

          Added a wrapper, RetryRpcRecv(), to wait longer for the receiver's
          response. Certain RPCs need this; for example, for TransmitData() from
          DataStreamSender, the receiver may hold its response to apply back
          pressure.
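
          The receive-retry idea, in very rough form (the function below and its
          parameters are assumptions for illustration, not the patch's actual
          RetryRpcRecv() signature): re-issue the receive after a socket timeout
          so a receiver that intentionally delays its response doesn't fail the
          RPC, while any other transport error still does.

          #include <functional>
          #include <thrift/transport/TTransportException.h>

          using apache::thrift::transport::TTransportException;

          // Hypothetical sketch: 'recv' wraps something like client->recv_TransmitData(...).
          bool RetryRecvSketch(const std::function<void()>& recv, int max_attempts) {
            for (int attempt = 1; attempt <= max_attempts; ++attempt) {
              try {
                recv();
                return true;   // response arrived
              } catch (const TTransportException& e) {
                // Only a recv timeout is worth retrying; any other transport error
                // means the connection is broken and the caller should close it.
                if (e.getType() != TTransportException::TIMED_OUT) return false;
              }
            }
            return false;      // still no response after max_attempts timeouts
          }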

          If an RPC fails, the connection is left in an unrecoverable state, so
          we close the underlying connection instead of putting it back in the
          cache. This makes sure a broken connection won't cause further RPC
          failures.
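
          A minimal sketch of that connection-handling rule (hypothetical pool,
          not Impala's ClientCache): a socket that just failed an RPC is closed
          rather than returned to the pool, so it can never be handed to the
          next RPC.

          #include <memory>
          #include <utility>
          #include <vector>
          #include <thrift/transport/TSocket.h>

          using apache::thrift::transport::TSocket;

          struct SimpleConnectionPool {
            std::vector<std::shared_ptr<TSocket>> idle;

            void Release(std::shared_ptr<TSocket> sock, bool rpc_succeeded) {
              if (rpc_succeeded) {
                idle.push_back(std::move(sock));  // healthy: safe to reuse later
              } else {
                sock->close();                    // protocol state unknown: discard
              }
            }
          };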

          Added a retry for the CancelPlanFragment RPC. This reduces the chance
          that a cancel request is lost on an unstable network, but it can make
          cancellation take longer and makes test_lifecycle.py more flaky: the
          metric num-fragments-in-flight might not be 0 yet because of previous
          tests. Modified the test to check the metric delta instead of comparing
          to 0 to reduce flakiness, although this might not capture some failures.

          Besides the new EE test, I used the following iptables rules to inject
          network failures and verify that RPCs never hang.
          1. Block network traffic on a port completely:
          iptables -A INPUT -p tcp -m tcp --dport 22002 -j DROP
          2. Randomly drop 5% of TCP packets to slow down the network:
          iptables -A INPUT -p tcp -m tcp --dport 22000 -m statistic --mode random --probability 0.05 -j DROP

          Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
          Reviewed-on: http://gerrit.cloudera.org:8080/3343
          Reviewed-by: Juan Yu <jyu@cloudera.com>
          Tested-by: Internal Jenkins


            People

            • Assignee: Juan Yu (jyu@cloudera.com)
            • Reporter: Juan Yu (jyu@cloudera.com)
            • Votes: 0
            • Watchers: 7
