  1. IMPALA
  2. IMPALA-6048

Queries make very slow progress and report WaitForRPC() stuck for too long


    Details

      Description

      When running 32 concurrent queries from TPC-DS, a couple of instances of TPC-DS Q78 took 9 hours to finish and appeared to be hung.

      On an idle cluster the query finished in under 5 minutes; profiles are attached.
      When the query ran for a long time, fragments reported more than 16 hours of network send/receive time.

      The logs contain many messages like the one below. In some instances of this message, a node waited too long for an RPC from itself.

      W1012 00:47:57.633549 117475 krpc-data-stream-sender.cc:360] XXX: WaitForRPC() stuck for too long address=10.17.234.37:29000 fragment_instace_id_=1e48ef897e797131:2f05789b000005eb dest_node_id_=24 sender_id_=81
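
      To gauge how widespread these warnings are, the log lines can be tallied per destination address. The following is a minimal sketch, assuming the warning always has the field layout shown above; the names STUCK_RPC and count_stuck_rpcs are hypothetical helpers and are not part of Impala.

```python
import re
from collections import Counter

# Matches the warning shown above. Field names are copied from the log line,
# including the "fragment_instace_id_" spelling that the log itself uses.
# Assumption: every such warning uses this exact field order.
STUCK_RPC = re.compile(
    r"WaitForRPC\(\) stuck for too long "
    r"address=(?P<address>\S+) "
    r"fragment_instace_id_=(?P<fragment>\S+) "
    r"dest_node_id_=(?P<dest_node>\d+) "
    r"sender_id_=(?P<sender>\d+)"
)


def count_stuck_rpcs(lines):
    """Count stuck-RPC warnings per destination address."""
    counts = Counter()
    for line in lines:
        match = STUCK_RPC.search(line)
        if match:
            counts[match.group("address")] += 1
    return counts


# The single warning quoted in this report, used as sample input.
sample = [
    "W1012 00:47:57.633549 117475 krpc-data-stream-sender.cc:360] XXX: "
    "WaitForRPC() stuck for too long address=10.17.234.37:29000 "
    "fragment_instace_id_=1e48ef897e797131:2f05789b000005eb "
    "dest_node_id_=24 sender_id_=81",
]
print(count_stuck_rpcs(sample))
```

      The warning itself does not record which host emitted it, so identifying cases where a node waited on itself would also require knowing the address of the host that wrote each log file.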
      

        Attachments

        1. Archive 2.zip
          3.46 MB
          Mostafa Mokhtar

              People

              • Assignee:
                Michael Ho (kwho)
                Reporter:
                Mostafa Mokhtar (mmokhtar)
              • Votes:
                0
                Watchers:
                6
