Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.94.0
-
None
-
None
-
Reviewed
Description
Likely because it belongs to the too-long list of flaky tests. A test should not fail randomly. Here is the list of builds, from 2574 to 2598, in which TestMetaReaderEditor timed out:
https://builds.apache.org/job/HBase-TRUNK/2597/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2588/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2585/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2584/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2582/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2577/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/HBase-TRUNK/2574/testReport/org.apache.hadoop.hbase.catalog/
It happens on hadoop-qa as well:
https://builds.apache.org/job/PreCommit-HBASE-Build/640/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/638/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/630/testReport/org.apache.hadoop.hbase.catalog/
https://builds.apache.org/job/PreCommit-HBASE-Build/622/testReport/org.apache.hadoop.hbase.catalog/
Looking at the source code, the main issues are:
- In testRetrying, some threads are started but are not stopped if an assertion fails. As a consequence, the process does not exit.
- Still in testRetrying, it is not possible to use @Test(timeout=180000), because if this timeout is reached, the threads are not stopped either.
This does not explain why the test sometimes fails, but it explains why we get a timeout instead of a clean failure.
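The first point can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the Worker class and the runWithWorker helper are hypothetical names standing in for the threads started by testRetrying. The idea is only that a finally block stops and joins the background thread whether or not the check throws, so a failed assertion surfaces as a clean failure rather than a hung process.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class StopThreadsOnFailure {

    /** Hypothetical stand-in for a background thread started by a test. */
    static class Worker extends Thread {
        private final AtomicBoolean stopped = new AtomicBoolean(false);

        @Override
        public void run() {
            while (!stopped.get()) {
                try {
                    Thread.sleep(10); // simulate periodic work
                } catch (InterruptedException e) {
                    return; // interrupted: exit promptly
                }
            }
        }

        /** Asks the thread to stop and wakes it up if it is sleeping. */
        void shutdown() {
            stopped.set(true);
            interrupt();
        }
    }

    /**
     * Starts the worker, runs the check, and always stops the worker,
     * even when the check throws (for example an AssertionError).
     */
    static void runWithWorker(Worker worker, Runnable check) {
        worker.start();
        try {
            check.run();
        } finally {
            worker.shutdown();
            try {
                worker.join(5000); // bounded wait for the thread to exit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker();
        try {
            // Simulate a failing assertion inside the test body.
            runWithWorker(worker, () -> {
                throw new AssertionError("simulated assertion failure");
            });
        } catch (AssertionError expected) {
            System.out.println("check failed, worker alive: " + worker.isAlive());
        }
    }
}
```

Without the finally block, the non-daemon Worker thread would keep running after the assertion error, and the JVM would not exit until an external timeout killed it.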
I also fixed a few things that seem unrelated, and I can no longer reproduce the error locally after 10 tries, so I will try the patch on hadoop-qa.
The patch also removes TestAdmin#testHundredsOfTable: this test takes 2 minutes to execute but proves nothing on success, as there is no check on the resources used.
Attachments
Issue Links
- relates to
-
HBASE-5064 utilize surefire tests parallelization
- Closed
-
HBASE-4602 Make the suite run in at least half the time
- Closed