Issue Details (XML | Word | Printable)

Key: HADOOP-3614
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Blocker Blocker
Assignee: Tsz Wo (Nicholas), SZE
Reporter: Konstantin Shvachko
Votes: 0
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

TestLeaseRecovery fails when run with assertions enabled.

Created: 20/Jun/08 10:56 PM   Updated: 08/Jul/09 04:43 PM
Return to search
Component/s: None
Affects Version/s: 0.18.0
Fix Version/s: 0.18.2

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works 3614_20080926.patch 2008-09-26 09:49 PM Tsz Wo (Nicholas), SZE 5 kB
Text File Licensed for inclusion in ASF works 3614_20080926b.patch 2008-09-26 10:48 PM Tsz Wo (Nicholas), SZE 5 kB
Text File Licensed for inclusion in ASF works 3614_20080926c.patch 2008-09-26 10:57 PM Tsz Wo (Nicholas), SZE 4 kB
Text File Licensed for inclusion in ASF works 3614_20080926c_0.18.patch 2008-09-27 04:13 PM Tsz Wo (Nicholas), SZE 3 kB
Issue Links:
Reference

Hadoop Flags: Reviewed
Resolution Date: 27/Sep/08 04:18 PM


 Description  « Hide
I used -ea jvm options to run the test, and it fails in FSDataset.findBlockFile(Block) on
assert b.generationStamp == GenerationStamp.WILDCARD_STAMP;

Without asserts on the test passes.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Konstantin Shvachko added a comment - 20/Jun/08 11:07 PM
I see the same assert with TestLeaseRecovery2.

Bill de hOra added a comment - 03/Aug/08 05:55 PM - edited
When I step through the tests I see that 1001L is the value of the generationStamp for newblocks[i] below (from TestLeaseRecovery.testBlockSynchronization)
//update blocks with random block sizes
      Block[] newblocks = new Block[REPLICATION_NUM];
      for(int i = 0; i < REPLICATION_NUM; i++) {
        newblocks[i] = new Block(lastblock.getBlockId(), newblocksizes[i],
            lastblock.getGenerationStamp());
        idps[i].updateBlock(lastblock, newblocks[i], false);
        checkMetaInfo(newblocks[i], idps[i]);
      }

when that gets to FSDataset.findBlockFile the assert line in evaluates to 1 != 1001L because GenerationStamp.WILDCARD_STAMP is 1. I'm assuming the 1001L value is being calculated as an offset from GenerationStamp.FIRST_VALID_STAMP (which is 1000L) and 1001L comes from using the GenerationStamp.nextStamp() method:

public File findBlockFile(Block b) {
    assert b.getGenerationStamp() == GenerationStamp.WILDCARD_STAMP;

    File blockfile = null;
    ActiveFile activefile = ongoingCreates.get(b);
    if (activefile != null) {
      blockfile = activefile.file;
    }
    if (blockfile == null) {
      blockfile = getFile(b);
    }
    if (blockfile == null) {
      if (DataNode.LOG.isDebugEnabled()) {
        DataNode.LOG.debug("ongoingCreates=" + ongoingCreates);
        DataNode.LOG.debug("volumeMap=" + volumeMap);
      }
    }
    return blockfile;
  }

That part of the code was last touched in r673857 as part of HADOOP-2885, where findBlockFile was made public. Trawling the commits it looks like the assert was added as part of HADOOP-3310 (r662513; I wish svn had bisect).

What I don't understand is why the assert line is there; I don't see how it could ever work where stamps are actually being calculated. I suspect the assert needs to come out and you just might be the first person to run with assert on.


Tsz Wo (Nicholas), SZE added a comment - 25/Sep/08 08:37 PM
Linking this to HADOOP-2658.

Konstantin Shvachko added a comment - 26/Sep/08 09:25 PM
I think assert is right here. It means that findBlockFile() method should find a file containing the block by its block id no matter what its generation stamp is, and therefore findBlockFile should always be called with the generation stamp == WILDCARD_STAMP.
What is incorrect here is the api. findBlockFile(long blockID) should take block ID as a parameter and then the assert can be removed.
A comment stating exactly what the method does would be extremely helpful.

Tsz Wo (Nicholas), SZE added a comment - 26/Sep/08 09:49 PM
3614_20080926.patch: fixed the bug.

Tsz Wo (Nicholas), SZE added a comment - 26/Sep/08 10:48 PM
3614_20080926b.patch: fixed some comments and codes.

Konstantin Shvachko added a comment - 26/Sep/08 10:49 PM
  1. In FSDataset.updateBlock() you do not need to set oldblock.setGenerationStamp(GenerationStamp.WILDCARD_STAMP)
  2. As far as I can see TestFileCtreation should be eventually moved to org.apache.hadoop.hdfs.server.datanode pavkage in the test directory. Because it uses mostly public (legally) classes and methods from other packages, and some methods like getBlockFile() that were made public just to keep the test in its current package. So I would rather leave findBlockFile() public for now and wouldn't introduce new DataNodeTestUtil class.
  3. Please add a JavaDoc comment for findBlockFile() so that there were no confusion about its semantics, which is rather hard to see right away.

Tsz Wo (Nicholas), SZE added a comment - 26/Sep/08 10:57 PM
3614_20080926c.patch: incorporated Konstantin's comments

Konstantin Shvachko added a comment - 26/Sep/08 11:44 PM
+1
Looks like this is a bug and should be committed in all version starting from 0.18

Tsz Wo (Nicholas), SZE added a comment - 26/Sep/08 11:46 PM
The patch fixed a bug which may cause other append-related tests failed.

Tsz Wo (Nicholas), SZE added a comment - 27/Sep/08 06:56 AM
Passed all tests locally. Submitting...

Hadoop QA added a comment - 27/Sep/08 09:12 AM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12391069/3614_20080926c.patch
against trunk revision 699517.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 Eclipse classpath. The patch retains Eclipse classpath integrity.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3387/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3387/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3387/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3387/console

This message is automatically generated.


Tsz Wo (Nicholas), SZE added a comment - 27/Sep/08 04:13 PM
3614_20080926c_0.18.patch: for 0.18

Tsz Wo (Nicholas), SZE added a comment - 27/Sep/08 04:18 PM
I just committed this to 0.18, 0.19 and trunk.

Hudson added a comment - 29/Sep/08 03:27 PM