Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.8.0
-
None
-
None
-
Reviewed
Description
MockAM#waitForState sleep duration (500 ms) is too long. Also, there is significant duplication with MockRM#waitForState.
Attachments
Attachments
- YARN-4807.001.patch
- 62 kB
- Yufei Gu
- YARN-4807.002.patch
- 65 kB
- Yufei Gu
- YARN-4807.003.patch
- 64 kB
- Yufei Gu
- YARN-4807.004.patch
- 68 kB
- Yufei Gu
- YARN-4807.005.patch
- 79 kB
- Yufei Gu
- YARN-4807.006.patch
- 80 kB
- Yufei Gu
- YARN-4807.007.patch
- 81 kB
- Yufei Gu
- YARN-4807.008.patch
- 81 kB
- Yufei Gu
- YARN-4807.009.patch
- 83 kB
- Yufei Gu
- YARN-4807.010.patch
- 83 kB
- Yufei Gu
- YARN-4807.011.patch
- 84 kB
- Yufei Gu
- YARN-4807.012.patch
- 84 kB
- Yufei Gu
- YARN-4807.013.patch
- 85 kB
- Yufei Gu
- YARN-4807.014.patch
- 86 kB
- Yufei Gu
- YARN-4807.015.patch
- 86 kB
- Yufei Gu
Issue Links
Activity
Actually, the problem seems worse. We have multiple waitForState methods in MockRM - for app, appattempt, container. There is some duplication there, but what is worse each method has its own sleepTime and cumulative_wait_time. It would be nice to use a constant for sleep time instead of hard coded values.
If it is not a problem, I would like for us to start with a really small value like 10 ms. It seems to be working for one of the waitForState methods in MockRM.
Regarding removing MockAM#waitForState methods altogether, I am not so sure if that will be straight-forward or desired. I have seen test cases where there is no reference to an RM or MockRM.
I'd prefer to start with Karthik's approach. Let's make the sleep time as low as possible.
Yes kasha. Many of the waiteState has cumulative timeout value hardcoded and also per-round waitSecs also is hardcoded differently from method to method. So total no: of rounds to wait for resource allocations were found by (timeout/per-round-wait) value. As I see its, no: of times we wait OR total number of rounds also can contribute little bit to the duration of the tests.
So if we are starting with 10ms, we can define how many rounds are also needed for a minimum try. With few dry-runs, a good figure could be calculated.
I have seen test cases where there is no reference to an RM or MockRM.
Yes, there are places even from MokRM we use this. SO as you suggested, its better to focus on improving test time first and later some cleanup can be tried.
1.
I have seen test cases where there is no reference to an RM or MockRM.
I've checked all usages of waitForState in MockAM. All no reference to an RM or MockRm places are inside the class MockAM. So that, we can make waitForState in MockAM private, and change all outside reference to waitForState in MockRM.
2. I agree to have some constants for waitMsPerLoop, minWaitMsecs and timeoutMsecs, shared by MockAM and MockRM.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 17s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 19 new or modified test files. |
+1 | mvninstall | 7m 24s | trunk passed |
+1 | compile | 0m 37s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 33s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 19s | trunk passed |
+1 | mvnsite | 0m 37s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 9s | trunk passed |
+1 | javadoc | 0m 30s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 29s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 31s | the patch passed |
+1 | compile | 0m 33s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 33s | the patch passed |
+1 | compile | 0m 30s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 30s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 37s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 20s | the patch passed |
+1 | javadoc | 0m 26s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 26s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 71m 5s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
-1 | unit | 71m 13s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
160m 40s |
Reason | Tests |
---|---|
JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
This message was automatically generated.
yufeigu, thanks for the patch. I have a few comments:
MockRM
108: static final int timeoutMsecsForApp = 80000; 109: static final int timeoutMsecs = 40000; 110: static final int minWaitMsecs = 1000; 111: static final int waitMsPerLoop = 10;
Those should all be snake case, e.g. TIMEOUT_MSECS and probably protected so that subclasses can use them. More importantly, I don't think the first three are helping anything. timeoutMsecsForApp and minWaitMsecs are only used once, so moving them out to a constant just makes the code harder to follow. The additional clarity gained from giving the number a name is pretty minimal since that information is already provided by context. Making them constants means I have to go look up the value of the constant, though, so the net result is that the code is a tiny bit harder to read.
Using timeoutMsecs instead of the timeout parameter of the waitForState() method is a bad idea. You've replaced a configurable timeout that was 10s, 30s, or 40s in practice with a fixed 40s timeout. That seems against the goals of the JIRA.
I'm a little suspicious of dropping the waitForContainer...() methods in favor of waitForState(), but if the unit tests still pass, who am I to complain?
MockAM
80: final int timeoutMsecs = MockRM.timeoutMsecs; 81: final int minWaitMsecs = MockRM.minWaitMsecs; 82: final int waitMsPerLoop = MockRM.waitMsPerLoop;
You should just use the constants directly instead of assigning their values to intermediate variables.
Hi templedf, Thanks a lot for reviewing my code. I uploaded patch 002.
MockRM
Changed all to snake case. timeoutMsecsForApp is used once, so I put it back to the function. minWaitMsecs is used several places. I keep it a constant.
There are four method signatures of waitForState. Two of waitForState have a fixed timeout. One is 80s another is 40s. I didn't change these timeout. The other two is consolidated from 3 waitForState and 2 waitForContainerState and 1 waitForContainerAllocated. (Seems like a lot redundancy here.) Most of them use 10s timeout originally and I didn't change that. The only timeout I changed is two of original 30s is changed to 10s and test cases passed. I've tried varargs in waitForState to have a default timeout and enable to input a configurable one as you like(30s, 40s ...). I suspect this is a overdesign here. So I just remove it and keep it simple. But I include it in patch 002. You have a look and comment it.
MockAM
Modified as suggested.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 11s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 19 new or modified test files. |
+1 | mvninstall | 6m 39s | trunk passed |
+1 | compile | 0m 25s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 28s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 33s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 6s | trunk passed |
+1 | javadoc | 0m 22s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 27s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 31s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 25s | the patch passed |
+1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 26s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 32s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 17s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 23s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 61m 25s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
-1 | unit | 62m 27s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
140m 13s |
Reason | Tests |
---|---|
JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
This message was automatically generated.
Thanks, yufeigu.
Here's an example of what I mean about extending the timeouts. This is what was there:
rm.waitForState(appAttemptId, RMAppAttemptState.LAUNCHED, 500);
and this is what it became:
rm.waitForState(appAttemptId, RMAppAttemptState.LAUNCHED);
What was waiting for up to 500ms is now waiting for up to {{TIMEOUT_MSECS=40000}}ms. There are lots and lots of other examples.
Looks like MockRM.waitForState(ApplicationAttemptId, RMAppAttemptState) and MockAM.waitForState(ApplicationAttemptId, RMAppAttemptState) are more or less identical. Can we get rid of one of them? After you do that, MIN_WAIT_MSECS can go back to being a locally scoped variable.
Hi templedf,
Thank you very much for the second review. I uploaded patch 003. Modified as suggested.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 10s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 19 new or modified test files. |
+1 | mvninstall | 6m 36s | trunk passed |
+1 | compile | 0m 26s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 34s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 7s | trunk passed |
+1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 29s | the patch passed |
+1 | compile | 0m 23s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 23s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 25s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 31s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 17s | the patch passed |
+1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 61m 1s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
-1 | unit | 62m 13s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
-1 | asflicense | 0m 17s | Patch generated 1 ASF License warnings. |
139m 30s |
Reason | Tests |
---|---|
JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
This message was automatically generated.
Thanks, yufeigu! Looks really good. To bring this full circle, I noticed that now you have 40*1000 and 10*1000 each used twice in MockRM. Might be worth making those constants (again). Also, it would be really super swell of you could add JavaDocs for the MockRM and MockAM methods that you touched, especially since there are now implicit timeouts that can be optionally overridden.
Thanks a lot for reviewing, templedf. Nice suggestions! I uploaded the fourth patch and modified as suggested. Would you please take a look? Thanks.
Thanks yufeigu and templedf for this work. Its much needed, and lots of good optimizations are coming in.
Some minor points from me.
static final private int TIMEOUT_MS_FOR_ATTEMPT = 40 * 1000;
Could you pls define 1000 as SEC for better readability. We use GB instead of 1024 in many test cases and its helpful.
Could we try reducing some of this timeout now, or are we planning that as separate. I could see we still wit for 500ms or 1sec for many test cases?
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 10s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 19 new or modified test files. |
+1 | mvninstall | 7m 12s | trunk passed |
+1 | compile | 0m 30s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 32s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 19s | trunk passed |
+1 | mvnsite | 0m 37s | trunk passed |
+1 | mvneclipse | 0m 15s | trunk passed |
+1 | findbugs | 1m 12s | trunk passed |
+1 | javadoc | 0m 24s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 27s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 31s | the patch passed |
+1 | compile | 0m 29s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 29s | the patch passed |
+1 | compile | 0m 28s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 28s | the patch passed |
+1 | checkstyle | 0m 17s | the patch passed |
+1 | mvnsite | 0m 33s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 22s | the patch passed |
+1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 25s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 55m 24s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 56m 47s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
-1 | asflicense | 0m 21s | Patch generated 1 ASF License warnings. |
129m 51s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
hadoop.yarn.server.resourcemanager.TestRMAdminService | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
hadoop.yarn.server.resourcemanager.TestRMAdminService | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
This message was automatically generated.
Thanks sunilg for review. Nice suggestion about SEC. I'll put them in my new patch. I think the "wait time per loop" is really matter. The patch tremendously reduce it from 500ms/200ms to 10ms. Is it possible to raise the changes of flaky tests if we reduce the length of timeout as templedf suggested? And if we'd like to find an more accurate timeout, we could file another jira and do multiple round of Jenkins jobs for all related unit tests.
Quickly skimmed through the patch. Couple of comments:
- Some of the waitForState methods return boolean, while others return void but have asserts in them. I would prefer them to be consistent. May be return a boolean is better so the tests could log context-sensitive comments? If that ends up being a significant chunk of work, I am fine with doing it in a follow up JIRA.
- Nit: For the loops where we are incrementing something and also checking condition on every iteration, I prefer using for instead of while. If we choose to stick with while, can we please do loop++ where we check.
Thanks kasha@cloudera.com for review and very nice suggestions.
Besides return type, there are even more inconsistencies among these waitForState, some ones have a minimum waiting time(1000ms), some one don't. I appreciate if someone can explain why we need a minimum waiting time. Another inconsistency is some wait function don't have a timeout, they can wait for ever. And there are more, e.g. some functions use LOG.info and others use System.out.println to print messages.
It's better to discuss and make a decision to solve these inconsistencies. Or put them in a follow up JIRA.
Hi kasha@cloudera.com and templedf, I made some changes about loop according to kasha@cloudera.com's comment. And I will create a followup JIRA for all inconsistencies. Patch 006 is uploaded. Would you please take a look? Thanks.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 9s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 31s | trunk passed |
+1 | compile | 0m 26s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 17s | trunk passed |
+1 | mvnsite | 0m 34s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 5s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 30s | the patch passed |
+1 | compile | 0m 24s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 24s | the patch passed |
+1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 26s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 32s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 16s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 53m 1s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 54m 8s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
123m 15s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestRMAdminService | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestRMAdminService |
This message was automatically generated.
Few comments, mostly nits:
MockRM
- Can we use "private static final" instead of static final private" for the new constants?
- waitForState(ApplicationId appId, RMAppState finalState):
- move int loop = 0 to the for loop
- After the loop, just add an assert that verifies the "state" irrespective of why we exited the loop.
- waitForState(RMAppAttempt attempt, RMAppAttemptState finalState, int timeoutMsecs)
- move int loop = 0 to the for loop
- After the for loop, why are we sleeping for the remaining time? I would think this slows down tests significantly.
- just add an assert that verifies the "state" irrespective of why we exited the loop.
- waitForContainerToComplete - do we not have any timeout on this? I recommend adding one.
- Same goes for watiForNewAMToLaunchAndRegister
- waitForState(Collection<MockNM> nms, ContainerId containerId, RMContainerState containerState, int timeoutMsecs)
- move int loop = 0 to the first for loop
- assertNotNull need not be guarded by if (container == null)
- Move the if check for (container == null) to one of the conditions for the following for.
One other nit: can we rename variables loop to retries for readability?
Thanks kasha@cloudera.com for detailed review.
1. Yes. Besides, I also use "private final boolean useNullRMNodeLabelsManager" instead of "final private boolean useNullRMNodeLabelsManager".
2. Modified as suggested.
3. Modified as suggested. For the minimum waiting time, I mentions it as one inconsistency in previous comment. Thanks for the explain.
4 and 5. I mentioned them as one consistency as well. It's not too work, but we just need to decide to do it here or next JIRA.
6. Nice suggestion. But to move int loop = 0 to the first for loop will make second for loop have no init value for loop.
loop to retries, yeah, good idea.
Hi kasha@cloudera.com, I uploaded the seventh path. Please have a look. Thanks.
As we discussed, I remove the minimum waiting time in wait for attempt states. We will solve it in the followup JIRA if any test case fails because of it.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 15s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 39s | trunk passed |
+1 | compile | 0m 26s | trunk passed with JDK v1.8.0_74 |
+1 | compile | 0m 28s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 17s | trunk passed |
+1 | mvnsite | 0m 34s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 4s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 27s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 30s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.8.0_74 |
+1 | javac | 0m 25s | the patch passed |
+1 | compile | 0m 27s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 27s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 31s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 14s | the patch passed |
+1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_74 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 31m 11s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
-1 | unit | 33m 21s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
80m 55s |
Reason | Tests |
---|---|
JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.TestRMAdminService | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.TestRMAdminService | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
This message was automatically generated.
For the tests that are failing because the insufficient sleep time, can we (1) manually add sleeps so the tests pass, (2) file a follow-up JIRA to fix the test the right way and (3) add a TODO in the code annotated with the JIRA. (e.g. // TODO (YARN-wxyz))
Sure. Thanks for nice suggestions. I haven't found any test case failing until now from the Hadoop QA and my own running of test cases.
The latest Jenkins run has a bunch of test failures. Are they all unrelated to the patch here?
kasha@cloudera.com, you are right. I have a misunderstanding to the Hadoop QA's message. I found the following unit test cases failed because of we remove the minimum time for attempt. So I will file a JIRA YARN-4929 for it.
- TestAMRestart.testRMAppAttemptFailuresValidityInterval
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 18s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 1s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 7m 45s | trunk passed |
+1 | compile | 0m 39s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 34s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 39s | trunk passed |
+1 | mvneclipse | 0m 15s | trunk passed |
+1 | findbugs | 1m 17s | trunk passed |
+1 | javadoc | 0m 31s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 28s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 33s | the patch passed |
+1 | compile | 0m 38s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 38s | the patch passed |
+1 | compile | 0m 31s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 31s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 37s | the patch passed |
+1 | mvneclipse | 0m 14s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 26s | the patch passed |
+1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 27s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 43m 57s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 28m 5s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
91m 23s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels |
hadoop.yarn.webapp.TestRMWithCSRFFilter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels |
hadoop.yarn.webapp.TestRMWithCSRFFilter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
This message was automatically generated.
Looks like the same tests are still failing. Did you add in the sleeps per kasha's suggestion?
(1) manually add sleeps so the tests pass, (2) file a follow-up JIRA to fix the test the right way and (3) add a TODO in the code annotated with the JIRA. (e.g. // TODO (YARN-wxyz))
Thanks templedf for the review. I've done all the three things in the patch 009. Please take a look.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 11s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 43s | trunk passed |
+1 | compile | 0m 26s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 30s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 19s | trunk passed |
+1 | mvnsite | 0m 35s | trunk passed |
+1 | mvneclipse | 0m 16s | trunk passed |
+1 | findbugs | 1m 3s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 29s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 25s | the patch passed |
+1 | compile | 0m 27s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 27s | the patch passed |
+1 | checkstyle | 0m 17s | the patch passed |
+1 | mvnsite | 0m 33s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 14s | the patch passed |
+1 | javadoc | 0m 20s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 35m 42s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 23m 0s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
75m 17s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.webapp.TestRMWithCSRFFilter | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority | |
hadoop.yarn.server.resourcemanager.TestApplicationMasterService | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.webapp.TestRMWithCSRFFilter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 12s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 1s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 48s | trunk passed |
+1 | compile | 0m 26s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 35s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 6s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 30s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 25s | the patch passed |
+1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 26s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 31s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 14s | the patch passed |
+1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 42m 13s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 22m 16s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
80m 57s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels |
hadoop.yarn.webapp.TestRMWithCSRFFilter | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | |
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
This message was automatically generated.
Thanks for the update, yufeigu.
Back to those loops... I don't think converting the while loops to for loops is the right idea. You end up with ugly for loops. My rule of thumb is that a for loop is for iterating through a sequence with a known number of steps; while loops are for everything else. I think your previous structure:
while (timeWaiting < tooLong) {
doStuff();
timeWaiting += timeWaited;
}
was perfectly reasonable.
Some of the loops are for-while combinations. I think the best approach in those cases is to split the conditions like:
while (conditionsTrue) { if (timeWaiting >= tooLong) { break; } doStuff(); timeWaiting += timeWaited; }
It might also be clearer to count the time spent sleeping rather than counting the number of retries, since the latter is just a derivative of the former.
In the javadocs, I think you mean "has" everywhere you used "had". Also "if any" is a little too terse for the exception explanation. How about "thrown if an unexpected error occurs"?
Thanks templedf for all nice suggestions. The new patch I attached address all of them.
Also thanks kasha for offline discussion.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 17s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 8m 17s | trunk passed |
+1 | compile | 0m 50s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 37s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 20s | trunk passed |
+1 | mvnsite | 0m 45s | trunk passed |
+1 | mvneclipse | 0m 17s | trunk passed |
+1 | findbugs | 1m 23s | trunk passed |
+1 | javadoc | 0m 35s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 33s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 44s | the patch passed |
+1 | compile | 0m 52s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 52s | the patch passed |
+1 | compile | 0m 40s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 40s | the patch passed |
+1 | checkstyle | 0m 20s | the patch passed |
+1 | mvnsite | 0m 42s | the patch passed |
+1 | mvneclipse | 0m 15s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 39s | the patch passed |
+1 | javadoc | 0m 33s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 31s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 52m 2s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 51m 6s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
125m 0s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
Thanks, yufeigu! A quick skim of the patch looks good. The loops look much better now. Unfortunately, I did just notice something that I should have caught before. The throws clauses on the waitForState() methods are needlessly general. Some of them should just be InterruptedException. Can you please make them more specific and them update the javadocs accordingly?
templedf, Thanks very much for pointing out. You are right. Some of them just throw InterruptedException. I uploaded a new patch for it.
The signatures look better, but I'd still like to see the javadocs be a little more specific, like:
* @throws InterruptedException if interrupted while waiting for the state transition
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 13s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 7m 4s | trunk passed |
+1 | compile | 0m 30s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 30s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 38s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 6s | trunk passed |
+1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 30s | the patch passed |
+1 | compile | 0m 25s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 25s | the patch passed |
+1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 26s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 33s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 15s | the patch passed |
+1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 45m 51s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 45m 38s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
108m 28s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
templedf, Thanks very much for detailed review. I uploaded a new patch for it.
Thanks, yufeigu. I just did one last thorough review of the full patch. Here's my hopefully last set of comments:
- MockRM.java:216 - spacing is off
- MockRM.java:303 - comment should be more specific
- MackRM.java:322 - exception (and preceeding comment) should be more specific
- MackRM.java:337 - exception (and preceeding comment) should be more specific
- MackRM.java:353 - exception (and preceeding comment) should be more specific
- TestAMRestart.java:831 - the comment should also explain why you're sleeping here
- TestAMRestart.java:877 - the comment should also explain why you're sleeping here
- TestApplicationMasterService.java:313 - the comment should also explain why you're sleeping here
- TestRMApplicationHistoryWriter.java:449 - the comment should also explain why you're sleeping here
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 20s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 7m 9s | trunk passed |
+1 | compile | 0m 30s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 30s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 20s | trunk passed |
-1 | mvnsite | 0m 27s | hadoop-yarn-server-resourcemanager in trunk failed. |
+1 | mvneclipse | 0m 17s | trunk passed |
+1 | findbugs | 1m 32s | trunk passed |
+1 | javadoc | 0m 33s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 35s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 38s | the patch passed |
+1 | compile | 0m 36s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 36s | the patch passed |
+1 | compile | 0m 35s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 35s | the patch passed |
+1 | checkstyle | 0m 18s | the patch passed |
+1 | mvnsite | 0m 41s | the patch passed |
+1 | mvneclipse | 0m 16s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 29s | the patch passed |
+1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 31s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 56m 49s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 53m 59s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
130m 6s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | |
hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
Thanks yufeigu for working on this and templedf for the thorough reviews. Very excited to see these heavily-used helper methods read so well.
Are all the test failures unrelated to this change? For instance, TestRMRestart seems to be due to test timeout. Can we confirm?
Thanks kasha for the review. Some of them did invoke waitForState. So I tested these cases locally many times (10-100 times). They never failed. For example, org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartOnMissingAttempts never failed in my local machine. And it seems similar to YARN-3064.
Seems like it did make some test cases flaky. Need more investigate. Sorry for that.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 13s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 54s | trunk passed |
+1 | compile | 0m 29s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 31s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 19s | trunk passed |
+1 | mvnsite | 0m 37s | trunk passed |
+1 | mvneclipse | 0m 15s | trunk passed |
+1 | findbugs | 1m 6s | trunk passed |
+1 | javadoc | 0m 25s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 33s | the patch passed |
+1 | compile | 0m 28s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 28s | the patch passed |
+1 | compile | 0m 28s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 28s | the patch passed |
+1 | checkstyle | 0m 16s | the patch passed |
+1 | mvnsite | 0m 34s | the patch passed |
+1 | mvneclipse | 0m 13s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 14s | the patch passed |
+1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 43m 20s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 45m 59s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
106m 25s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 11s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 6m 43s | trunk passed |
+1 | compile | 0m 27s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 18s | trunk passed |
+1 | mvnsite | 0m 34s | trunk passed |
+1 | mvneclipse | 0m 14s | trunk passed |
+1 | findbugs | 1m 6s | trunk passed |
+1 | javadoc | 0m 23s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 32s | the patch passed |
+1 | compile | 0m 29s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 29s | the patch passed |
+1 | compile | 0m 28s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 28s | the patch passed |
+1 | checkstyle | 0m 15s | the patch passed |
+1 | mvnsite | 0m 34s | the patch passed |
+1 | mvneclipse | 0m 12s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 20s | the patch passed |
+1 | javadoc | 0m 22s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 27s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 43m 11s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 45m 34s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
105m 40s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | |
hadoop.yarn.server.resourcemanager.TestRMRestart | |
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 22s | Docker mode activated. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 22 new or modified test files. |
+1 | mvninstall | 9m 25s | trunk passed |
+1 | compile | 0m 47s | trunk passed with JDK v1.8.0_77 |
+1 | compile | 0m 42s | trunk passed with JDK v1.7.0_95 |
+1 | checkstyle | 0m 25s | trunk passed |
+1 | mvnsite | 0m 50s | trunk passed |
+1 | mvneclipse | 0m 22s | trunk passed |
+1 | findbugs | 1m 34s | trunk passed |
+1 | javadoc | 0m 36s | trunk passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 38s | trunk passed with JDK v1.7.0_95 |
+1 | mvninstall | 0m 43s | the patch passed |
+1 | compile | 0m 44s | the patch passed with JDK v1.8.0_77 |
+1 | javac | 0m 44s | the patch passed |
+1 | compile | 0m 40s | the patch passed with JDK v1.7.0_95 |
+1 | javac | 0m 40s | the patch passed |
+1 | checkstyle | 0m 18s | the patch passed |
+1 | mvnsite | 0m 45s | the patch passed |
+1 | mvneclipse | 0m 17s | the patch passed |
+1 | whitespace | 0m 0s | Patch has no whitespace issues. |
+1 | findbugs | 1m 43s | the patch passed |
+1 | javadoc | 0m 35s | the patch passed with JDK v1.8.0_77 |
+1 | javadoc | 0m 35s | the patch passed with JDK v1.7.0_95 |
-1 | unit | 57m 43s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. |
-1 | unit | 57m 21s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
+1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
138m 58s |
Reason | Tests |
---|---|
JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | |
JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | |
hadoop.yarn.server.resourcemanager.TestClientRMTokens | |
hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | |
JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
This message was automatically generated.
All test failures are not related to the patch after my investigation. kasha, could you please take a look? Thanks.
Just committed this to trunk and branch-2. I can't wait to see faster Jenkins runs on precommit builds.
Thanks yufeigu for the contribution and templedf for the thorough reviews.
FAILURE: Integrated in Hadoop-trunk-Commit #9682 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9682/)
YARN-4807. MockAM#waitForState sleep duration is too long. (Yufei Gu via (kasha: rev 185c3d4de1ac4cf10cc1aa00aaaaf367b3880b80)
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestWorkPreservingRMRestartForNodeLabel.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRMRPCNodeUpdates.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestContainerResourceUsage.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestNodeLabelContainerAllocation.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerResizing.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestNodesListManager.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRM.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java
- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
Hi kasha and yufeigu
I think this patch will be very helpful in 2.8 branch too. Recently we had some other test failures (YARN-4947) and while trying to prepare a patch for 2.8, we felt that this will be helpful to use some new apis. (waitForState for node) cc/rohithsharma
sunilg I tried back porting this JIRA, there are other multiple JIRA need to be back ported first!! Some of them are not really required to back port in branch-2.8. I think let YARN-4947 rebased patch go in.
As I see, MockRM#waitForState is also considering 500ms to 1sec in various methods. I think that can be made smaller to 200ms. Recently myslef and rohithsharma had worked on few random test cases failures in CS and all were fixed when we properly made use of waitForState with correct timeout. I think 200ms is more than enough, and we can remove MockAM method and consolidate all into MockRM. thoughts?