IMPALA / IMPALA-6048

Queries make very slow progress and report WaitForRPC() stuck for too long

Details

    Description

      When running 32 concurrent TPC-DS queries, a couple of instances of TPC-DS Q78 took 9 hours to finish and appeared to be hung.

      On an idle cluster the query finished in under 5 minutes (profiles attached).
      When the query ran long, fragments reported more than 16 hours of network send/receive time.

      The logs contain many messages like the one below. In some instances of this message, a node waited too long on an RPC to itself.

      W1012 00:47:57.633549 117475 krpc-data-stream-sender.cc:360] XXX: WaitForRPC() stuck for too long address=10.17.234.37:29000 fragment_instace_id_=1e48ef897e797131:2f05789b000005eb dest_node_id_=24 sender_id_=81
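
To see which destinations account for most of these warnings, the log can be summarized per address. The following is a minimal sketch, assuming the message layout matches the sample line above exactly (including the `fragment_instace_id_` spelling that appears in the log); the names `STUCK_RPC` and `count_stuck_rpcs` are hypothetical helpers, not part of Impala.

```python
import re
from collections import Counter

# Pattern derived from the single sample line in this report; the field
# order and names are an assumption based on that one line.
STUCK_RPC = re.compile(
    r"WaitForRPC\(\) stuck for too long "
    r"address=(?P<address>\S+) "
    r"fragment_instace_id_=(?P<instance>\S+) "
    r"dest_node_id_=(?P<dest_node>\d+) "
    r"sender_id_=(?P<sender>\d+)"
)


def count_stuck_rpcs(lines):
    """Count 'stuck' warnings per destination address."""
    counts = Counter()
    for line in lines:
        match = STUCK_RPC.search(line)
        if match:
            counts[match.group("address")] += 1
    return counts


sample = (
    "W1012 00:47:57.633549 117475 krpc-data-stream-sender.cc:360] XXX: "
    "WaitForRPC() stuck for too long address=10.17.234.37:29000 "
    "fragment_instace_id_=1e48ef897e797131:2f05789b000005eb "
    "dest_node_id_=24 sender_id_=81"
)
print(count_stuck_rpcs([sample]))
```

Comparing the reported address with the address of the host that wrote the log would show the cases where a node was waiting on an RPC to itself.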
      

      Attachments

        1. Archive 2.zip
          3.46 MB
          Mostafa Mokhtar

            People

              kwho Michael Ho
              mmokhtar Mostafa Mokhtar