IMPALA-2908

Find a permanent fix to avoid a cluster hang / incorrect results on a failed EOS RPC

Details

    Description

      In DataStreamSender::Channel::CloseInternal(), if the EOS (end-of-stream) RPC fails, the receiver side never learns that the stream has ended and remains open indefinitely, causing the cluster to hang. If sending the last row batch fails during CloseInternal(), the query can succeed but return incorrect results. In that case the only indication of the failure is an error logged via LogError().

      This is a follow-up to IMPALA-2592, which was a temporary fix that reduced the window of vulnerability. A more carefully thought-out, permanent fix is still needed.
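
      The two failure modes described above can be illustrated with a small, self-contained C++ sketch. This is not Impala's code: Status, Receiver, Channel, CloseLoggingOnly() and CloseAndPropagate() are hypothetical names invented for illustration, RPC failures are simulated with boolean flags, and the sketch assumes that the EOS is still attempted after a failed batch send. CloseAndPropagate() shows one possible direction (returning the error to the caller), not the fix adopted by the project.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; not Impala's Status class.
struct Status {
  bool ok;
  std::string msg;
  static Status OK() { return Status{true, ""}; }
  static Status Error(std::string m) { return Status{false, std::move(m)}; }
};

// Hypothetical receiver: counts rows and records whether EOS arrived.
struct Receiver {
  int rows_received = 0;
  bool eos_received = false;
};

// Hypothetical sender channel with injectable RPC failures.
class Channel {
 public:
  Channel(Receiver* receiver, bool fail_batch, bool fail_eos)
      : receiver_(receiver), fail_batch_(fail_batch), fail_eos_(fail_eos) {}

  void AddRows(int n) {
    assert(n >= 0);
    pending_rows_ += n;
  }

  // Pattern described in the issue: failures during close are only logged.
  // Assumption: the EOS is still attempted after a failed batch send.
  void CloseLoggingOnly(std::vector<std::string>* log) {
    Status s = SendBatch();
    if (!s.ok) log->push_back(s.msg);  // rows are lost, query may still succeed
    s = SendEos();
    if (!s.ok) log->push_back(s.msg);  // receiver stays open indefinitely
  }

  // One possible direction: return the first error so the caller can
  // fail or cancel the query instead of continuing silently.
  Status CloseAndPropagate() {
    Status s = SendBatch();
    if (!s.ok) return s;
    return SendEos();
  }

 private:
  // Simulates the RPC that transmits the last buffered row batch.
  Status SendBatch() {
    if (fail_batch_) return Status::Error("last row batch RPC failed");
    receiver_->rows_received += pending_rows_;
    pending_rows_ = 0;
    return Status::OK();
  }

  // Simulates the EOS RPC that tells the receiver the stream is finished.
  Status SendEos() {
    if (fail_eos_) return Status::Error("EOS RPC failed");
    receiver_->eos_received = true;
    return Status::OK();
  }

  Receiver* receiver_;
  bool fail_batch_;
  bool fail_eos_;
  int pending_rows_ = 0;
};
```

      With a failed batch send, CloseLoggingOnly() leaves the receiver with EOS set but no rows, which corresponds to the "succeeds with incorrect results" case. With a failed EOS, the receiver never sees end-of-stream, which corresponds to the hang. In both cases CloseAndPropagate() hands a non-OK status back to the caller.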

          People

            henryr Henry Robinson
            sailesh Sailesh Mukil
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: