Apache YuniKorn / YUNIKORN-2067

Test_With_Spark_Jobs e2e test waits for app state Running after the Spark job has completed


Details

    Description

      The e2e test 'Test_With_Spark_Jobs' waits for each of the 3 Spark applications, one after another, to reach the 'Running' state, which is incorrect: there is no guarantee that a job is still running by the time the check is performed.

      We should check the Spark driver pod state through the KubeCtl client instead of YuniKorn’s RestClient, because the application is removed from the core after it has completed.
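      A minimal sketch of the pod-based check proposed above, not the actual fix. It assumes a hypothetical `phaseGetter` function standing in for whatever Kubernetes client call returns the driver pod's phase; the names `phaseGetter` and `waitForDriverStarted` are illustrative and are not taken from the YuniKorn test helpers. The point it shows is that a pod that has already reached `Succeeded` is still visible to the check and counts as having run, whereas a core-side application lookup would find nothing.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// PodPhase mirrors the Kubernetes pod phase strings relevant to this check.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// phaseGetter is a hypothetical stand-in for a Kubernetes client lookup
// that returns the current phase of the named pod.
type phaseGetter func(podName string) (PodPhase, error)

// waitForDriverStarted polls the pod phase until the driver pod is either
// Running or already Succeeded. A Failed pod or a timeout is an error.
func waitForDriverStarted(get phaseGetter, podName string, interval, timeout time.Duration) (PodPhase, error) {
	deadline := time.Now().Add(timeout)
	for {
		phase, err := get(podName)
		if err != nil {
			return "", err
		}
		switch phase {
		case PodRunning, PodSucceeded:
			return phase, nil
		case PodFailed:
			return phase, errors.New("driver pod " + podName + " failed")
		}
		if time.Now().After(deadline) {
			return phase, fmt.Errorf("timed out waiting for pod %s, last phase %s", podName, phase)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Fake getter simulating a driver pod that finished before the check ran.
	completed := func(string) (PodPhase, error) { return PodSucceeded, nil }

	phase, err := waitForDriverStarted(completed, "spark-driver", 10*time.Millisecond, time.Second)
	fmt.Println(phase, err)
}
```

      In a real test the fake getter would be replaced by a call into the Kubernetes API. Whether to accept `Running`, `Succeeded`, or both depends on what the test is meant to assert.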

      Code link: test/e2e/spark_jobs_scheduling/spark_jobs_scheduling_test.go#L147-L149
      Failed e2e test link: https://github.com/apache/yunikorn-k8shim/actions/runs/6596046649/job/17926552721#step:5:2098

      Failed e2e test log analysis:

      • 17:18:09Z Pod for app spark-e27dd9a2140844828fdfb3d80e9fa1b4 created
      • 17:18:11.725869Z (PodEvent in Log) PodEvent ‘Scheduling’ received
      • 17:18:11.727811Z (PodEvent in Log) PodEvent ‘Scheduled’ received
      • 17:18:11.735646Z (PodEvent in Log) PodEvent ‘PodBindSuccessful’ received
      • 17:20:10.965501Z (PodEvent in Log) PodEvent ‘TaskCompleted’ received
        (The application completed before the check started.)
      • 17:20:20.159 (Ginkgo) Waiting for application spark-e27dd9a2140844828fdfb3d80e9fa1b4 to Running
      • 17:26:25.9749 (Ginkgo) timeout
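
      The gaps in the timeline above can be worked out with a short sketch. It assumes the Ginkgo timestamps are in the same time zone as the `Z`-suffixed log timestamps and that all events fall on the same day; the helper name `elapsed` is illustrative.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// elapsed returns the duration between two same-day wall-clock timestamps.
// time.Parse accepts a fractional-seconds field after the seconds even
// though the layout does not spell it out.
func elapsed(from, to string) (time.Duration, error) {
	const layout = "15:04:05"
	start, err := time.Parse(layout, from)
	if err != nil {
		return 0, err
	}
	end, err := time.Parse(layout, to)
	if err != nil {
		return 0, err
	}
	return end.Sub(start), nil
}

func main() {
	// TaskCompleted event versus the start of the Ginkgo wait.
	gap, err := elapsed("17:20:10.965501", "17:20:20.159")
	if err != nil {
		log.Fatal(err)
	}
	// Start of the Ginkgo wait versus the reported timeout.
	wait, err := elapsed("17:20:20.159", "17:26:25.9749")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("check started after completion by:", gap)
	fmt.Println("time spent waiting before timeout:", wait)
}
```

      Under those assumptions, the check began roughly 9 seconds after the task had completed and then waited about 6 minutes for a state the application could no longer reach.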

       

            People

              Yu-Lin Chen
