Details
- Type: Bug
- Status: Open
- Priority: Blocker
- Resolution: Unresolved
Description
Several unit tests have been failing consistently on Yetus for a long period of time.
The list keeps growing and it is driving the repository into an unstable state. Qbt reports more than 40 failing unit tests on average.
Personally, over the last week, with every submitted patch, I have had to spend considerable time looking at the same stack traces to double-check whether or not the patch contributes to those failures.
I found that the majority of those tests have been failing for quite some time, but no Jiras were filed.
The main problem with those consistent failures is that they have side effects on the runtime of the other JUnit tests by consuming resources such as memory and ports.
StripedFile and EC tests in particular show up in the list of bad tests 100% of the time.
I looked at those tests and they certainly need some improvements (e.g., HDFS-15459). Is anyone interested in those test cases? Can we just turn them off?
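As a rough illustration of the port side effect mentioned above, here is a minimal sketch of how a test can avoid holding on to a port after it finishes. It is not taken from the Hadoop test code; the class name `EphemeralPortExample` and the helper `bindAndRelease` are hypothetical. It only assumes the standard `java.net.ServerSocket` behavior: binding to port 0 lets the OS pick a free port, and try-with-resources closes the socket even if the test body throws.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical example, not part of the Hadoop code base.
public class EphemeralPortExample {

    // Binds to port 0 so the OS assigns a free port, then releases it.
    // try-with-resources guarantees the socket is closed on every exit path,
    // so a failing test body would not leave the port occupied.
    static int bindAndRelease() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = bindAndRelease();
        System.out.println("Bound and released port " + port);
    }
}
```

Whether the failing tests listed in this issue actually leak ports this way would need to be confirmed test by test.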
I would like to give a heads-up that we need more collaboration to enforce the stability of the code base.
- All developers, please file a Jira when you see a failing test, whether or not it is related to your patch. This gives other developers a heads-up about potential failures. Please do not stop at commenting on your patch "this is unrelated to my work".
- Volunteer to dedicate more time to fixing flaky tests.
- Periodically make sure that the list of failing tests does not exceed a certain number. We have Qbt reports to monitor that, but there is no follow-up on their status.
- We should consider aggressive strategies such as blocking any merges until the code is brought back to stability.
- We need a clear and well-defined process to address Yetus issues: configuration, investigating out-of-memory errors, slowness, etc.
- Turn off the JUnit tests within the modules that are not being actively used in the community (e.g., EC, striped files, etc.).
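The periodic check on the number of failing tests could be sketched roughly as follows. This is only an illustration under assumptions: the class `FailingTestBudget`, its methods, and the threshold used in `main` are hypothetical, and the input is assumed to be a plain list of fully qualified test names (`package.Class.method`) copied from a Qbt report. Nothing here reflects an existing Yetus or Qbt feature.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not an existing Yetus or Qbt tool.
public class FailingTestBudget {

    // Groups fully qualified test names ("package.Class.method") by class
    // and counts how many failing methods each class contributes.
    static Map<String, Integer> countByClass(List<String> failingTests) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String test : failingTests) {
            int lastDot = test.lastIndexOf('.');
            // Names without a dot are kept as-is rather than dropped.
            String className = lastDot < 0 ? test : test.substring(0, lastDot);
            counts.merge(className, 1, Integer::sum);
        }
        return counts;
    }

    // True when the number of failing tests is above the agreed ceiling.
    static boolean exceedsBudget(List<String> failingTests, int maxAllowed) {
        return failingTests.size() > maxAllowed;
    }

    public static void main(String[] args) {
        // Sample names taken from the list in this issue.
        List<String> failing = Arrays.asList(
            "org.apache.hadoop.fs.azure.TestWasbFsck.testDelete",
            "org.apache.hadoop.fs.azure.TestBlobMetadata.testFolderMetadata",
            "org.apache.hadoop.fs.azure.TestBlobMetadata.testPermissionMetadata");

        int maxAllowed = 2; // illustrative threshold only
        countByClass(failing).forEach(
            (cls, n) -> System.out.println(n + "  " + cls));
        System.out.println("Over budget: " + exceedsBudget(failing, maxAllowed));
    }
}
```

Grouping by class makes it easier to see which modules dominate the list, which in turn helps decide where a Jira should be filed first.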
CC: aajisaka, elgoiri, kihwal, daryn, weichiu
Do you guys have any thoughts on the current status of HDFS?
The following is a quick list of failing JUnit tests from Qbt reports:
Test Name | Duration | Age
org.apache.hadoop.crypto.key.kms.server.TestKMS.testKMSProviderCaching | 1.5 sec | 1
org.apache.hadoop.fs.azure.TestBlobMetadata.testFolderMetadata | 42 ms | 3
org.apache.hadoop.fs.azure.TestBlobMetadata.testFirstContainerVersionMetadata | 46 ms | 3
org.apache.hadoop.fs.azure.TestBlobMetadata.testPermissionMetadata | 27 ms | 3
org.apache.hadoop.fs.azure.TestBlobMetadata.testOldPermissionMetadata | 19 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testNoTempBlobsVisible | 0.95 sec | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency.testLinkBlobs | 33 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testListStatusRootDir | 31 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testRenameDirectoryMoveToExistingDirectory | 0.25 sec | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testListStatus | 29 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testRenameDirectoryAsExistingDirectory | 36 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testLSRootDir | 19 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked.testDeleteRecursively | 31 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck.testWasbFsck | 1 sec | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testChineseCharactersFolderRename | 1 sec | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderInFolderListingWithZeroByteRenameMetadata | 41 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderInFolderListing | 37 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testUriEncoding | 38 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testDeepFileCreation | 37 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testListDirectory | 29 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolderRenameInProgress | 37 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRenameFolder | 34 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRenameImplicitFolder | 27 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRedoRenameFolder | 66 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testStoreDeleteFolder | 27 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked.testRename | 40 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testListStatus | 36 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testRenameDirectoryAsEmptyDirectory | 0.26 sec | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testRenameDirectoryAsNonExistentDirectory | 28 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testGlobStatusSomeMatchesInDirectories | 26 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testGlobStatusWithMultipleWildCardMatches | 27 ms | 3
org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked.testDeleteRecursively | 22 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testImplicitFolderDeleted | 0.99 sec | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testFileAndImplicitFolderSameName | 31 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testSetOwnerOnImplicitFolder | 26 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testFileInImplicitFolderDeleted | 30 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testImplicitFolderListed | 22 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testCreatingDeepFileCreatesExplicitFolder | 53 ms | 3
org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations.testSetPermissionOnImplicitFolder | 22 ms | 3
org.apache.hadoop.fs.azure.TestWasbFsck.testDelete | 1 sec | 3
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers | 1 min 30 sec | 17
Issue Links
- incorporates HADOOP-17325 WASB: Test failures (Resolved)