|
This patch does the following:
1. Upgrades existing clusters to the new disk format. Each block has a generation stamp associated with it. Existing blocks get a generation stamp of 0. The generation stamp is used to create the name of the block metafile on the datanode.
2. The Block object has a new field called "generationStamp" of type "long". The BlocksMap on the namenode is keyed on the blockid and the generation stamp.
3. The datanode sends the generation stamp, block id and size of each block in a block report.
4. All log statements that print the blockid now print the blockid and generation stamp.
5. The client receives the blockid and generation stamp as part of an RPC that receives a block.
6. The DataTransferProtocol sends the blockid and generation stamp with every connection request that the client makes to the datanode(s).

upgradeGenStamp4.patch :
We have to be careful about handling the filenames for blocks and metadata. It would be better if we put all of that code in one place. This patch incorporates the feedback given by Nicholas.
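As a rough illustration of the Block changes listed in the patch summary above, the following sketch shows a block identity made of a block id plus a generation stamp, with pre-upgrade blocks defaulting to stamp 0 and a map keyed on both fields. The names BlockKey, legacy and BlockKeyDemo are hypothetical and are not taken from the patch; the real Block and BlocksMap classes may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a block identity; not the actual Hadoop Block class.
final class BlockKey {
    // Per the patch summary, blocks that existed before the upgrade get stamp 0.
    static final long LEGACY_GENERATION_STAMP = 0L;

    final long blockId;
    final long generationStamp;

    BlockKey(long blockId, long generationStamp) {
        this.blockId = blockId;
        this.generationStamp = generationStamp;
    }

    // Convenience factory for a block written before generation stamps existed.
    static BlockKey legacy(long blockId) {
        return new BlockKey(blockId, LEGACY_GENERATION_STAMP);
    }

    // Equality uses both fields, so a map keyed on BlockKey is keyed on
    // (block id, generation stamp) rather than on the block id alone.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof BlockKey)) {
            return false;
        }
        BlockKey that = (BlockKey) other;
        return blockId == that.blockId && generationStamp == that.generationStamp;
    }

    @Override
    public int hashCode() {
        return Objects.hash(blockId, generationStamp);
    }

    // Log-friendly form that prints both the id and the stamp (format is an assumption).
    @Override
    public String toString() {
        return "blockId=" + blockId + ", generationStamp=" + generationStamp;
    }
}

final class BlockKeyDemo {
    public static void main(String[] args) {
        Map<BlockKey, String> blocksMap = new HashMap<>();
        blocksMap.put(BlockKey.legacy(42L), "replica info");

        // Same id and stamp: found. Same id, different stamp: not found.
        System.out.println(blocksMap.get(new BlockKey(42L, 0L)));
        System.out.println(blocksMap.get(new BlockKey(42L, 1L)));
    }
}
```

In this sketch a lookup with a mismatched generation stamp simply misses, which is one way to read "keyed on the blockid and the generation stamp".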
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380249/upgradeGenStamp5.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 27 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs -1. The patch appears to cause Findbugs to fail.
core tests -1. The patch failed core unit tests.
contrib tests -1. The patch failed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2252/testReport/
This message is automatically generated.

Am I correct that only the last block in a file needs to have a generation stamp?
In that case generationStamp should be a member of INodeFile rather than Block. It should also be a part of LocatedBlock so that clients have the same API to work with blocks.
BlockMetadata.
Datanode
DatanodeBlockInfo
DataBlockScanner
FSDataset
FSEditLog
FSImage
FSNamesystem
UpgradeManager
boolean isUpgradeCompleted() { return currentUpgrade == null; }
GenerationStampUpgrade
ClientProtocol
General
Maybe this is a good time to return to the question of renaming HDFS blocks: stop generating block ids randomly and generate them sequentially instead.
This is related to my previous question of whether the name-node needs to store a block generation stamp for every block or only for the last block of each file. The only problem here is with prehistoric (in Dhruba's terminology) blocks. Reminder: a block is prehistoric if it is reported to the system after its id was reassigned to another physical block.
This leads to data corruption, but it can be avoided if block ids are generated sequentially rather than randomly.

Incorporates most of Konstantin's comments. I left the definition of BlockTwo in place because I think it is cleaner this way. Also, some code is duplicated between BlockCrcUpgrade and BlockGenerationStampUpgrade because the BlockCrcUpgrade code will probably go away in the next release or so.
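To make the sequential-id suggestion raised earlier in this thread a little more concrete, here is a minimal sketch of an allocator that hands out strictly increasing block ids. The class and method names are hypothetical, and the sketch assumes the last allocated id is persisted and restored across namenode restarts, which is not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sequential block id allocator; not part of the patch.
final class SequentialBlockIdGenerator {
    private final AtomicLong lastAllocated;

    // lastAllocatedId is assumed to be restored from persistent state on startup.
    SequentialBlockIdGenerator(long lastAllocatedId) {
        this.lastAllocated = new AtomicLong(lastAllocatedId);
    }

    // Every call returns an id strictly greater than all ids returned before,
    // so an id is never handed out a second time by this allocator.
    long nextBlockId() {
        return lastAllocated.incrementAndGet();
    }

    // Value that would need to be persisted so ids stay unique after a restart.
    long lastAllocatedId() {
        return lastAllocated.get();
    }
}
```

Under that assumption a late-reported block can never share its id with a newer physical block, which is the property the random scheme cannot guarantee.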
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380642/upgradeGenStamp6.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 27 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs -1. The patch appears to introduce 10 new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2296/testReport/
This message is automatically generated.

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380784/upgradeGenStamp7.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 27 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs -1. The patch appears to introduce 1 new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2310/testReport/
This message is automatically generated.

Fixed the last (hopefully) findbugs warnings.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380799/upgradeGenStamp8.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 27 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2313/testReport/
This message is automatically generated.

Merged patch with latest trunk.
Merged patch with latest trunk.
Here is a Test Plan for testing Distributed Upgrades for Block Generation Stamp
Test Cases
T1 Install build version 0.17 and create a new HDFS cluster of size 500 nodes. Run random writer, sort, and then sort validation. Then shut down the cluster. Install the version 0.18 build on the cluster. Start the cluster with the -upgrade option.
T2 Start a DFS upgrade and restart the Namenode during the upgrade.
T3 Start a DFS upgrade and restart some of the Datanodes while the upgrade is occurring.
T4a Start an upgrade and stop two Datanodes.
T4b Start an upgrade and stop half of the Datanodes.
T5a Start a DFS upgrade and let it complete successfully. Then issue the admin command to finalize the upgrade.
T5b Same as T4a. Once the upgrade completes successfully, bring up the two Datanodes that were shut down earlier.
T6 Same as T1 but with a more aggressive periodic block scanner, set up by setting dfs.datanode.scan.period.hours to 1.
T7 Same as T1. Before starting the upgrade, log into a Datanode and delete a particular blk_xxx.meta file. Then start the DFS upgrade.

Merged patch with latest trunk. The latest trunk does not have BlockCrcUpgrade any more.
I ran testcase T1 on 100 nodes and it passed using the patch upgradeGenStamp9.patch
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381224/upgradeGenStamp10.patch against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 27 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests -1. The patch failed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2370/testReport/
This message is automatically generated.

The timeout failure in the TestBalancer test does not seem to be related to this patch at all. I have run this patch multiple times now without seeing this failure again.
I would like to commit this patch earlier rather than later, especially because it is a disk format change needed for supporting "appends" to files. Please let me know (before EOB May 6th) if anybody thinks that I should hold off checking in this patch.
Merged patch with latest trunk.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381480/upgradeGenStamp11.patch against trunk revision 653638.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 27 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2408/testReport/
This message is automatically generated.

Resubmitting. TestEditLog failure does not seem to be related to the patch.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381480/upgradeGenStamp11.patch against trunk revision 654128.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 27 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2416/testReport/
This message is automatically generated.

Merged patch with latest trunk.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12381844/upgradeGenStamp12.patch against trunk revision 655337.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 27 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2447/testReport/
This message is automatically generated.

I plan on committing this patch tonight, May 13th. This means that all existing clusters will have to do an "-upgrade".
Integrated in Hadoop-trunk #491 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/491/)
|
A few other alternatives:
1. Encode the generation stamp into the name of the metafile. Each metafile will look like blkxxxxxx.genstamp.meta. The block file will remain the same.
2. Encode the generation stamp into the name of the block file. Each block file will be of the form blkxxxxxx.genstamp. The metafile will remain the same.
3. Encode the generation stamp into the name of a new zero-size file named blkxxxxx.genstamp. The block file and the metadata file will remain the same.
4. A completely separate file (one per datanode) that records the metadata of all blocks in the datanode.
I propose that we implement option 1.
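Option 1 could be sketched roughly as follows: the generation stamp is placed between the block file name and the ".meta" suffix, and can be parsed back out of the metafile name. The class MetaFileNames and its methods are hypothetical helpers written for illustration. Treating an old-style metafile name with no stamp as generation stamp 0 is an assumption, based on the statement that existing blocks get a generation stamp of 0.

```java
// Hypothetical helper that keeps metafile naming logic in one place; not from the patch.
final class MetaFileNames {
    static final String META_SUFFIX = ".meta";
    // Assumption: a metafile without an embedded stamp belongs to a pre-upgrade block.
    static final long LEGACY_GENERATION_STAMP = 0L;

    private MetaFileNames() {
    }

    // Builds "<blockFileName>.<generationStamp>.meta"; the block file name is unchanged.
    static String metaFileName(String blockFileName, long generationStamp) {
        return blockFileName + "." + generationStamp + META_SUFFIX;
    }

    // Recovers the generation stamp from a metafile name for the given block file.
    static long parseGenerationStamp(String blockFileName, String metaFileName) {
        int minLength = blockFileName.length() + META_SUFFIX.length();
        if (metaFileName.length() < minLength
                || !metaFileName.startsWith(blockFileName)
                || !metaFileName.endsWith(META_SUFFIX)) {
            throw new IllegalArgumentException(
                "Not a metafile of " + blockFileName + ": " + metaFileName);
        }
        // Text between the block file name and ".meta", e.g. ".7" or "".
        String middle = metaFileName.substring(
            blockFileName.length(), metaFileName.length() - META_SUFFIX.length());
        if (middle.isEmpty()) {
            return LEGACY_GENERATION_STAMP;
        }
        if (middle.charAt(0) != '.') {
            throw new IllegalArgumentException("Unexpected metafile name: " + metaFileName);
        }
        // Long.parseLong throws NumberFormatException if the stamp is not a number.
        return Long.parseLong(middle.substring(1));
    }
}
```

For example, `MetaFileNames.metaFileName("blk_123", 7L)` yields a name with the stamp embedded before the suffix, and `parseGenerationStamp` reverses that mapping. Keeping both directions in one helper also fits the earlier request to handle block and metadata filenames in a single place.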