Details
- Type: Sub-task
- Status: Patch Available
- Priority: Trivial
- Resolution: Unresolved
Description
TestWriteReadStripedFile basically tests the full cross product of file sizes with and without datanode failure. It would be better to use a JUnit Parameterized test suite.
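For context, a minimal sketch of what the proposed JUnit 4 Parameterized cross product could look like; all names and sizes below are illustrative, not taken from the patch:

```java
import java.util.ArrayList;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

// Hypothetical illustration: enumerate the full cross product of
// file sizes and datanode-failure modes as parameterized cases.
@RunWith(Parameterized.class)
public class WriteReadStripedFileSketch {

  @Parameterized.Parameters(name = "{index}: length={0}, withFailure={1}")
  public static List<Object[]> data() {
    int[] lengths = {0, 1, 64 * 1024, 64 * 1024 + 123}; // sample sizes only
    List<Object[]> cases = new ArrayList<>();
    for (int length : lengths) {
      for (boolean withFailure : new boolean[]{false, true}) {
        cases.add(new Object[]{length, withFailure});
      }
    }
    return cases;
  }

  private final int length;
  private final boolean withFailure;

  public WriteReadStripedFileSketch(int length, boolean withFailure) {
    this.length = length;
    this.withFailure = withFailure;
  }

  @Test
  public void testWriteRead() throws Exception {
    // The real test would write a striped file of the given length,
    // optionally kill a datanode, and verify the read-back contents.
  }
}
```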
Attachments
- HDFS-12398.002.patch (21 kB, Huafeng Wang)
- HDFS-12398.001.patch (21 kB, Huafeng Wang)
Issue Links
- relates to: HDFS-12444 Reduce runtime of TestWriteReadStripedFile (Resolved)
Activity
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 15s | Docker mode activated. |
Prechecks | |||
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
trunk Compile Tests | |||
+1 | mvninstall | 15m 48s | trunk passed |
+1 | compile | 0m 59s | trunk passed |
+1 | checkstyle | 0m 37s | trunk passed |
+1 | mvnsite | 0m 56s | trunk passed |
-1 | findbugs | 1m 46s | hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings. |
+1 | javadoc | 0m 46s | trunk passed |
Patch Compile Tests | |||
+1 | mvninstall | 0m 51s | the patch passed |
+1 | compile | 0m 52s | the patch passed |
+1 | javac | 0m 52s | hadoop-hdfs-project_hadoop-hdfs generated 0 new + 405 unchanged - 3 fixed = 405 total (was 408) |
+1 | checkstyle | 0m 34s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 0 unchanged - 7 fixed = 0 total (was 7) |
+1 | mvnsite | 0m 52s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 1m 59s | the patch passed |
+1 | javadoc | 0m 40s | the patch passed |
Other Tests | |||
-1 | unit | 91m 19s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
Total: 119m 55s
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy | |
hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 | |
hadoop.hdfs.TestClientProtocolForPipelineRecovery | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 | |
hadoop.hdfs.server.namenode.TestReencryptionWithKMS | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 | |
hadoop.hdfs.server.blockmanagement.TestBlockManager | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 | |
hadoop.hdfs.TestLeaseRecoveryStriped | |
hadoop.hdfs.server.datanode.TestDirectoryScanner | |
hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure040 | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 | |
hadoop.hdfs.TestLargeBlock | |
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 | |
Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:71bbb86 |
JIRA Issue | HDFS-12398 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886026/HDFS-12398.002.patch |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux f7c57bf07682 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | trunk / b3a4d7d |
Default Java | 1.8.0_144 |
findbugs | v3.1.0-RC1 |
findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/21047/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21047/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/21047/testReport/ |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/21047/console |
Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
Hi Huafeng Wang,
Thanks for working on the tests. The idea seems good, and after the refactoring there is much less code. However, the benefit of this refactoring isn't very obvious to me, and I prefer the current way for a few reasons:
1. The current way of having many test methods is much more readable;
2. It's also easier to debug when some of them fail;
3. More importantly, every test case (contained in a test method) needs to start with a brand-new cluster;
4. The timeout can be fine-tuned for each test method in the current way.
I'm not entirely sure all of these are correct, since I don't fully understand the new approach.
Note that in the Jenkins build, the changed TestWriteReadStripedFile timed out.
===The current way, having many test methods===
```java
@Test
public void testFileSmallerThanOneStripe2() throws Exception {
  testOneFileUsingDFSStripedInputStream("/ec/SmallerThanOneStripe",
      cellSize + 123);
  testOneFileUsingDFSStripedInputStream("/ec/SmallerThanOneStripe2",
      cellSize + 123, true);
}

@Test
public void testFileEqualsWithOneStripe() throws Exception {
  testOneFileUsingDFSStripedInputStream("/ec/EqualsWithOneStripe",
      cellSize * dataBlocks);
  testOneFileUsingDFSStripedInputStream("/ec/EqualsWithOneStripe2",
      cellSize * dataBlocks, true);
}

@Test
public void testFileMoreThanOneStripe1() throws Exception {
  testOneFileUsingDFSStripedInputStream("/ec/MoreThanOneStripe1",
      cellSize * dataBlocks + 123);
  testOneFileUsingDFSStripedInputStream("/ec/MoreThanOneStripe12",
      cellSize * dataBlocks + 123, true);
}
...
```
===Now, after the refactoring===
```
+  /**
+   * Tests for different size of striped files.
+   */
+  @RunWith(Parameterized.class)
+  public static class TestReadStripedFiles extends StripedFileIOTestBase {
+    @Parameterized.Parameters(name = "{index}: {0}")
+    public static Iterable<Object[]> data() {
+      return ImmutableList.<Object[]>builder()
+          .add(new Object[]{"/ec/EmptyFile", 0})
+          .add(new Object[]{"/ec/SmallerThanOneCell", 1})
+          .add(new Object[]{"/ec/SmallerThanOneCell2", CELL_SIZE - 1})
+          .add(new Object[]{"/ec/EqualsWithOneCell", CELL_SIZE})
+          .add(new Object[]{"/ec/SmallerThanOneStripe", CELL_SIZE + 123})
+          .add(new Object[]{"/ec/SmallerThanOneStripe2", STRIPE_SIZE - 1})
+          .add(new Object[]{"/ec/EqualsWithOneStripe", STRIPE_SIZE})
+          .add(new Object[]{"/ec/MoreThanOneStripe", STRIPE_SIZE + 123})
+          .add(new Object[]{"/ec/MoreThanOneStripe2", STRIPE_SIZE * 2 + 123})
+          .add(new Object[]{"/ec/LessThanFullBlockGroup",
+              STRIPE_SIZE * (STRIPES_PER_BLOCK - 1) + CELL_SIZE})
+          .add(new Object[]{"/ec/FullBlockGroup", BLOCK_SIZE * DATA_BLOCKS})
+          .add(new Object[]{"/ec/MoreThanABlockGroup",
+              BLOCK_SIZE * DATA_BLOCKS + 123})
+          .add(new Object[]{"/ec/MoreThanABlockGroup2",
+              BLOCK_SIZE * DATA_BLOCKS + CELL_SIZE + 123})
+          .add(new Object[]{"/ec/MoreThanABlockGroup3",
+              BLOCK_SIZE * DATA_BLOCKS * 3 + STRIPE_SIZE + CELL_SIZE + 123})
+          .build();
+    }
```
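The excerpt above stops at data(). For readers following along, in the JUnit 4 Parameterized pattern the (path, length) pairs would presumably be consumed through a constructor and a test method roughly like this; this continuation is a guess, not a quote from the patch:

```java
// Hypothetical continuation of the class above: fields and a test
// method consuming the (src, fileLength) pairs produced by data().
private final String src;
private final int fileLength;

public TestReadStripedFiles(String src, int fileLength) {
  this.src = src;
  this.fileLength = fileLength;
}

@Test
public void testOneFile() throws Exception {
  // Once without and once with a simulated datanode failure.
  testOneFileUsingDFSStripedInputStream(src, fileLength);
  testOneFileUsingDFSStripedInputStream(src + "2", fileLength, true);
}
```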
Hi Kai Zheng, many thanks for your review.
> 1. The current way of having many test methods is much more readable;

That's true; I can add some comments on the parameters if you wish. But I think the EC file names can already tell what each test is doing.

> 2. It's also easier to debug when some of them fail;

That's also true; it's a limitation of JUnit Parameterized.

> 3. More importantly, every test case (contained in a test method) needs to start with a brand-new cluster;

That's intentional: each test randomly kills a datanode, so starting with a new cluster is needed.

> 4. The timeout can be fine-tuned for each test method in the current way.

That's not true. Before the refactoring, the timeout is controlled by

```java
@Rule
public Timeout globalTimeout = new Timeout(300000);
```

which applies the same timeout to all test methods in the class.
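For illustration, the two JUnit 4 timeout styles under discussion look like this; class and method names are invented here and the values are arbitrary:

```java
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class TimeoutStylesSketch {

  // Rule-based: one shared limit for every test method in the class.
  @Rule
  public Timeout globalTimeout = new Timeout(300000);

  // Annotation-based: each method can carry its own budget, which is the
  // per-method fine-tuning that a many-small-methods layout would allow.
  @Test(timeout = 60000)
  public void smallFileCase() throws Exception {
    // ... test body ...
  }

  @Test(timeout = 240000)
  public void fullBlockGroupCase() throws Exception {
    // ... test body ...
  }
}
```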
FYI, I posted a little patch on HDFS-12444 to reduce this runtime. I noticed the same duplication that Huafeng did.
An alternative to parameterizing is to split the tests into subclasses: one that injects no failures and one that does. This is easier to use than JUnit parameterized testing.
Hi Andrew Wang, sorry, I don't fully get your idea. Splitting the tests into subclasses can't reduce the duplication, most of which comes from the different file sizes. Did I miss anything?
And I also noticed the same duplication in TestDFSStripedOutputStream.
Hi Huafeng, agreed that this doesn't reduce duplication much; my idea was to split it into multiple classes to avoid the Surefire class-level timeout.
As an example, right now we have a bunch of test methods that look like this:
```java
@Test
public void testFileMoreThanABlockGroup1() throws Exception {
  testOneFileUsingDFSStripedInputStream("/ec/MoreThanABlockGroup1",
      blockSize * dataBlocks + 123);
  testOneFileUsingDFSStripedInputStream("/ec/MoreThanABlockGroup12",
      blockSize * dataBlocks + 123, true);
}

....

private void testOneFileUsingDFSStripedInputStream(String src, int fileLength)
    throws Exception {
  testOneFileUsingDFSStripedInputStream(src, fileLength, false);
}
```
We could change it as follows:
```java
@Test
public void testFileMoreThanABlockGroup1() throws Exception {
  testOneFileUsingDFSStripedInputStream("/ec/MoreThanABlockGroup1",
      blockSize * dataBlocks + 123);
}

....

protected void testOneFileUsingDFSStripedInputStream(String src, int fileLength)
    throws Exception {
  testOneFileUsingDFSStripedInputStream(src, fileLength, false);
}
```
And in the failure-injecting subclass:

```java
@Override
protected void testOneFileUsingDFSStripedInputStream(String src, int fileLength)
    throws Exception {
  testOneFileUsingDFSStripedInputStream(src, fileLength, true);
}
```
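Putting the pieces together, the two-class layout would be roughly as follows; the class names and the stub helper are invented for illustration:

```java
// Base class: every @Test method runs the no-failure variant.
public class WriteReadStripedFileSketchBase {

  protected void testOneFile(String src, int fileLength) throws Exception {
    testOneFile(src, fileLength, false);
  }

  // Stand-in for the real helper that writes and reads a striped file,
  // optionally killing a datanode first.
  protected void testOneFile(String src, int fileLength, boolean withFailure)
      throws Exception {
  }
}

// Subclass: inherits every @Test method unchanged; only the overridden
// two-argument helper flips on failure injection. Being a separate class,
// it also gets its own Surefire class-level timeout budget.
class WriteReadStripedFileWithFailureSketch
    extends WriteReadStripedFileSketchBase {

  @Override
  protected void testOneFile(String src, int fileLength) throws Exception {
    testOneFile(src, fileLength, true);
  }
}
```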
We could combine this with JUnit parameterization if you want, though it seems Kai had some reservations. Personally I'm okay with parameterizing; it's harder to run a single test case on the command line, but IntelliJ IDEA knows how to do this.
-1 overall |
Vote | Subsystem | Runtime | Logfile | Comment |
---|---|---|---|---|
0 | reexec | 0m 0s | | Docker mode activated. |
-1 | patch | 0m 11s | | HDFS-12398 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
Subsystem | Report/Notes |
---|---|
JIRA Issue | HDFS-12398 |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12886026/HDFS-12398.002.patch |
Console output | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/257/console |
versions | git=2.17.1 |
Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
This message was automatically generated.