Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-4255

Unstable test WebRuntimeMonitorITCase.testNoEscape

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0, 1.3.0
    • Component/s: None
    • Labels:

      Description

      An instance of the problem can be found here:

      https://s3.amazonaws.com/archive.travis-ci.org/jobs/146615994/log.txt

        Issue Links

          Activity

          Hide
          Zentol Chesnay Schepler added a comment -

          The same exception occurred here: https://issues.apache.org/jira/browse/FLINK-3746

          Show
          Zentol Chesnay Schepler added a comment - The same exception occurred here: https://issues.apache.org/jira/browse/FLINK-3746
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user BorisOsipov opened a pull request:

          https://github.com/apache/flink/pull/3019

          FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape

          Fix two issues:
          FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape
          FLINK-3746 WebRuntimeMonitorITCase.testNoCopyFromJar failing intermittently

          • wait until JobManager will be available
          • code simplification

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/BorisOsipov/flink WebRuntimeMonitorITCase

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3019.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3019


          commit cf94bab9a7b1426c5048d6b126742375567de791
          Author: Boris Osipov <boris_osipov@epam.com>
          Date: 2016-12-16T07:30:33Z

          FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape
          FLINK-3746 WebRuntimeMonitorITCase.testNoCopyFromJar failing intermittently

          • wait until JobManager will be available
          • code simplification

          Show
          githubbot ASF GitHub Bot added a comment - GitHub user BorisOsipov opened a pull request: https://github.com/apache/flink/pull/3019 FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape Fix two issues: FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape FLINK-3746 WebRuntimeMonitorITCase.testNoCopyFromJar failing intermittently wait until JobManager will be available code simplification You can merge this pull request into a Git repository by running: $ git pull https://github.com/BorisOsipov/flink WebRuntimeMonitorITCase Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3019.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3019 commit cf94bab9a7b1426c5048d6b126742375567de791 Author: Boris Osipov <boris_osipov@epam.com> Date: 2016-12-16T07:30:33Z FLINK-4255 Unstable test WebRuntimeMonitorITCase.testNoEscape FLINK-3746 WebRuntimeMonitorITCase.testNoCopyFromJar failing intermittently wait until JobManager will be available code simplification
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3019#discussion_r92803992

          — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java —
          @@ -75,32 +78,14 @@
          public void testStandaloneWebRuntimeMonitor() throws Exception {
          final Deadline deadline = TestTimeout.fromNow();

          • TestingCluster flink = null;
            + TestingCluster testingCluster = null;
              • End diff –

          please undo the renaming

          Show
          githubbot ASF GitHub Bot added a comment - Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/3019#discussion_r92803992 — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java — @@ -75,32 +78,14 @@ public void testStandaloneWebRuntimeMonitor() throws Exception { final Deadline deadline = TestTimeout.fromNow(); TestingCluster flink = null; + TestingCluster testingCluster = null; End diff – please undo the renaming
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3019#discussion_r92804170

          — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java —
          @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception {
          response = followingClient.getNextResponse(deadline.timeLeft());
          assertEquals(HttpResponseStatus.OK, response.getStatus());
          assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json"));

          • assertTrue(response.getContent().contains("\"taskmanagers\":1") ||
          • response.getContent().contains("\"taskmanagers\":0"));
            + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1"))
              • End diff –

          Why did you change this line? (Note that this applies to all lines that replace assertTrue with assertThat, contains with containsString and | with or.)

          Show
          githubbot ASF GitHub Bot added a comment - Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/3019#discussion_r92804170 — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java — @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception { response = followingClient.getNextResponse(deadline.timeLeft()); assertEquals(HttpResponseStatus.OK, response.getStatus()); assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json")); assertTrue(response.getContent().contains("\"taskmanagers\":1") || response.getContent().contains("\"taskmanagers\":0")); + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1")) End diff – Why did you change this line? (Note that this applies to all lines that replace assertTrue with assertThat, contains with containsString and | with or.)
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3019#discussion_r92803960

          — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java —
          @@ -491,6 +445,37 @@ public void testNoCopyFromJar() throws Exception {
          }
          }

          + private WebRuntimeMonitor startWebRuntimeMonitor(
          + TestingCluster flink) throws Exception {
          +
          + ActorSystem jmActorSystem = flink.jobManagerActorSystems().get().head();
          + ActorRef jmActor = flink.jobManagerActors().get().head();
          +
          + // Needs to match the leader address from the leader retrieval service
          + String jobManagerAddress = AkkaUtils.getAkkaURL(jmActorSystem, jmActor);
          +
          + File logDir = temporaryFolder.newFolder("log");
          + Path logFile = Files.createFile(new File(logDir, "jobmanager.log").toPath());
          + Files.createFile(new File(logDir, "jobmanager.out").toPath());
          +
          + // Web frontend on random port
          + Configuration config = new Configuration();
          + config.setInteger(ConfigConstants.JOB_MANAGER_WEB_PORT_KEY, 0);
          + config.setString(ConfigConstants.JOB_MANAGER_WEB_LOG_PATH_KEY, logFile.toString());
          +
          + WebRuntimeMonitor webMonitor = new WebRuntimeMonitor(
          + config,
          + flink.createLeaderRetrievalService(),
          + jmActorSystem);
          +
          + webMonitor.start(jobManagerAddress);
          + JobManagerRetriever retriever = Whitebox
          + .getInternalState(webMonitor, "retriever");
          +
          + retriever.awaitJobManagerGatewayAndWebPort();
          — End diff –

          instead you can call `flink.waitForActorsToBeAlive()`

          Show
          githubbot ASF GitHub Bot added a comment - Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/3019#discussion_r92803960 — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java — @@ -491,6 +445,37 @@ public void testNoCopyFromJar() throws Exception { } } + private WebRuntimeMonitor startWebRuntimeMonitor( + TestingCluster flink) throws Exception { + + ActorSystem jmActorSystem = flink.jobManagerActorSystems().get().head(); + ActorRef jmActor = flink.jobManagerActors().get().head(); + + // Needs to match the leader address from the leader retrieval service + String jobManagerAddress = AkkaUtils.getAkkaURL(jmActorSystem, jmActor); + + File logDir = temporaryFolder.newFolder("log"); + Path logFile = Files.createFile(new File(logDir, "jobmanager.log").toPath()); + Files.createFile(new File(logDir, "jobmanager.out").toPath()); + + // Web frontend on random port + Configuration config = new Configuration(); + config.setInteger(ConfigConstants.JOB_MANAGER_WEB_PORT_KEY, 0); + config.setString(ConfigConstants.JOB_MANAGER_WEB_LOG_PATH_KEY, logFile.toString()); + + WebRuntimeMonitor webMonitor = new WebRuntimeMonitor( + config, + flink.createLeaderRetrievalService(), + jmActorSystem); + + webMonitor.start(jobManagerAddress); + JobManagerRetriever retriever = Whitebox + .getInternalState(webMonitor, "retriever"); + + retriever.awaitJobManagerGatewayAndWebPort(); — End diff – instead you can call `flink.waitForActorsToBeAlive()`
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user BorisOsipov commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3019#discussion_r92807445

          — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java —
          @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception {
          response = followingClient.getNextResponse(deadline.timeLeft());
          assertEquals(HttpResponseStatus.OK, response.getStatus());
          assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json"));

          • assertTrue(response.getContent().contains("\"taskmanagers\":1") ||
          • response.getContent().contains("\"taskmanagers\":0"));
            + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1"))
              • End diff –

          I've replaced assertTrue/False because assertThat and hamcrest matchers provide much better error messages. Instead of weird 'java.lang.AssertionError' we can get a more readable message in the log without effort.
          Of cource that's a matter of opinion and I can revert it. Should I do?

          Show
          githubbot ASF GitHub Bot added a comment - Github user BorisOsipov commented on a diff in the pull request: https://github.com/apache/flink/pull/3019#discussion_r92807445 — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java — @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception { response = followingClient.getNextResponse(deadline.timeLeft()); assertEquals(HttpResponseStatus.OK, response.getStatus()); assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json")); assertTrue(response.getContent().contains("\"taskmanagers\":1") || response.getContent().contains("\"taskmanagers\":0")); + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1")) End diff – I've replaced assertTrue/False because assertThat and hamcrest matchers provide much better error messages. Instead of weird 'java.lang.AssertionError' we can get a more readable message in the log without effort. Of cource that's a matter of opinion and I can revert it. Should I do?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user BorisOsipov commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3019#discussion_r92832659

          — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java —
          @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception {
          response = followingClient.getNextResponse(deadline.timeLeft());
          assertEquals(HttpResponseStatus.OK, response.getStatus());
          assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json"));

          • assertTrue(response.getContent().contains("\"taskmanagers\":1") ||
          • response.getContent().contains("\"taskmanagers\":0"));
            + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1"))
              • End diff –

          Ok. Let's save consistency.

          Show
          githubbot ASF GitHub Bot added a comment - Github user BorisOsipov commented on a diff in the pull request: https://github.com/apache/flink/pull/3019#discussion_r92832659 — Diff: flink-runtime-web/src/test/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitorITCase.java — @@ -262,8 +247,8 @@ public void testRedirectToLeader() throws Exception { response = followingClient.getNextResponse(deadline.timeLeft()); assertEquals(HttpResponseStatus.OK, response.getStatus()); assertEquals(response.getType(), MimeTypes.getMimeTypeForExtension("json")); assertTrue(response.getContent().contains("\"taskmanagers\":1") || response.getContent().contains("\"taskmanagers\":0")); + assertThat(response.getContent(), either(containsString("\"taskmanagers\":1")) End diff – Ok. Let's save consistency.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user BorisOsipov commented on the issue:

          https://github.com/apache/flink/pull/3019

          @zentol Thank you for the review. I've resolved your comments.

          Show
          githubbot ASF GitHub Bot added a comment - Github user BorisOsipov commented on the issue: https://github.com/apache/flink/pull/3019 @zentol Thank you for the review. I've resolved your comments.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on the issue:

          https://github.com/apache/flink/pull/3019

          Merging.

          Show
          githubbot ASF GitHub Bot added a comment - Github user zentol commented on the issue: https://github.com/apache/flink/pull/3019 Merging.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3019

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/3019
          Hide
          Zentol Chesnay Schepler added a comment -

          1.2: b50bbcc8853c1c2ebcdba9c74a70bfdfbe6557ab
          1.3: 411fff58405804a7f7f79536f8c6885a491dbef6

          Show
          Zentol Chesnay Schepler added a comment - 1.2: b50bbcc8853c1c2ebcdba9c74a70bfdfbe6557ab 1.3: 411fff58405804a7f7f79536f8c6885a491dbef6

            People

            • Assignee:
              Boris_Osipov Boris Osipov
              Reporter:
              kkl0u Kostas Kloudas
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development