Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.3.0
Description
ITestAbfsInputStreamStatistics#testReadAheadCounters intermittently times out because of race conditions in the readAhead threads.
Test error:
[ERROR] testReadAheadCounters(org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics) Time elapsed: 30.723 s <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.fs.azurebfs.ITestAbfsInputStreamStatistics.testReadAheadCounters(ITestAbfsInputStreamStatistics.java:346)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Possible causes:
- The readAhead queue does not drain within the 30-second test timeout on some systems, so the counters never reach the expected values.
- The test requires readAheadBytesRead >= 4KB and remoteBytesRead >= 32KB. On some machines this never holds: when the threads that fill the readAhead buffer are still waiting in the readAhead queue, reads are served by remote reads instead of from the buffer. One of the two counters then falls short of its threshold, the test loops indefinitely waiting for the condition, and it eventually times out.
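The second cause can be illustrated with a toy model. This is not ABFS code: the class name, the 4KB chunk size, and the total of 9 chunks are assumptions chosen only for illustration; only the 4KB and 32KB thresholds come from the description above. Each chunk is counted either as a readAhead buffer hit or as a remote read, depending on whether the prefetch had finished in time.

```java
// Toy model of the counter race (hypothetical, not the real AbfsInputStream).
public class ReadAheadRaceModel {
    static final long KB = 1024;
    static final long READ_AHEAD_THRESHOLD = 4 * KB; // threshold from the issue text
    static final long REMOTE_THRESHOLD = 32 * KB;    // threshold from the issue text

    /**
     * Counts bytes per source. The first chunksServedFromBuffer chunks are
     * treated as readAhead buffer hits; every other chunk is a remote read.
     * Returns {readAheadBytesRead, remoteBytesRead}.
     */
    static long[] simulate(int totalChunks, int chunksServedFromBuffer, long chunkSize) {
        long readAhead = 0;
        long remote = 0;
        for (int i = 0; i < totalChunks; i++) {
            if (i < chunksServedFromBuffer) {
                readAhead += chunkSize;
            } else {
                remote += chunkSize;
            }
        }
        return new long[] {readAhead, remote};
    }

    /** The condition the test waits for: both counters at or above their thresholds. */
    static boolean conditionMet(long[] counters) {
        return counters[0] >= READ_AHEAD_THRESHOLD && counters[1] >= REMOTE_THRESHOLD;
    }

    public static void main(String[] args) {
        // Assumed workload: 9 chunks of 4KB (36KB in total).
        for (int hits = 0; hits <= 2; hits++) {
            long[] c = simulate(9, hits, 4 * KB);
            System.out.printf("bufferHits=%d readAhead=%d remote=%d conditionMet=%b%n",
                hits, c[0], c[1], conditionMet(c));
        }
    }
}
```

In this model only one split of the bytes satisfies both thresholds. With zero buffer hits readAheadBytesRead stays at 0, and with two buffer hits remoteBytesRead stops at 28KB, so a loop waiting for the condition would never exit in either case.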
Possible fixes:
- Write a better test (one that passes under all conditions).
- Maybe make it a unit test (UT) instead of an integration test (IT)?
Improving the existing test is preferable; converting it to a UT is the last resort.
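One way to make the wait less fragile, sketched here only as an assumption and not as the fix that was actually committed, is to poll the counters against an explicit deadline so the test fails with a clear result instead of spinning until the JUnit timeout. The helper name `await`, the class name, and the timeout values are hypothetical; the AtomicLong counters stand in for the real stream statistics.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical helper: bounded polling instead of an unbounded sleep loop.
public class BoundedCounterWait {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition held, false on timeout or interruption.
     */
    static boolean await(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-ins for the stream statistics counters (assumed values).
        AtomicLong readAheadBytesRead = new AtomicLong();
        AtomicLong remoteBytesRead = new AtomicLong(32 * 1024);

        // Simulated readAhead worker that fills the buffer a little later.
        Thread worker = new Thread(() -> readAheadBytesRead.addAndGet(4 * 1024));
        worker.start();

        boolean ok = await(
            () -> readAheadBytesRead.get() >= 4 * 1024 && remoteBytesRead.get() >= 32 * 1024,
            2000, 10);
        worker.join();

        System.out.println("condition met: " + ok
            + ", readAheadBytesRead=" + readAheadBytesRead.get()
            + ", remoteBytesRead=" + remoteBytesRead.get());
    }
}
```

A bounded wait does not remove the race itself. It only turns a hang into a fast failure that can report the counter values at the moment of the timeout, which helps tell the two causes above apart.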
Issue Links
- is duplicated by HADOOP-17160 ITestAbfsInputStreamStatistics#testReadAheadCounters timing out always (Reopened)