Issue Details (XML | Word | Printable)

Key: HADOOP-3732
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Blocker Blocker
Assignee: Raghu Angadi
Reporter: Konstantin Shvachko
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Block scanner should read block information during initialization.

Created: 09/Jul/08 06:41 PM   Updated: 08/Jul/09 04:43 PM
Return to search
Component/s: None
Affects Version/s: 0.17.0
Fix Version/s: 0.19.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works HADOOP-3732.patch 2008-07-11 12:31 AM Raghu Angadi 3 kB
Text File Licensed for inclusion in ASF works HADOOP-3732.patch 2008-07-09 11:00 PM Raghu Angadi 3 kB

Resolution Date: 16/Jul/08 01:46 AM


 Description  « Hide
We see a lot of warning messages on each data-node during startup and upgrading from 0.17 to 0.18 saying:
2008-07-08 22:22:15,711 WARN org.apache.hadoop.dfs.DataNode: Block /grid/3/hadoop/var/hdfs/data/current/blk_3359714082706415785 does not have a metafile!

The message received twice for each block, because the block information is first read buy the data-node itself and then by the block scanner.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Konstantin Shvachko added a comment - 09/Jul/08 06:43 PM
  1. Right now the DataBlockScanner is initialized during data-node startup, and reads block information from the data-node storage directory at that time. See init() in DataBlockScanner(). A better time to obtain block information would be when the block scanner actually starts scanning and verifying, that is in DataBlockScanner.run(). During regular startup the difference in timing is negligible, but if there is distributed upgrade then it can take a while until the scanner will have a chance to run, besides it may read incorrect information since it will be reading the pre-upgrade block files.
  2. The message itself is confusing for administrators mostly because it is repeated 80,000 times (2 * #blocks). I am not sure how to deal with this. If we downgrade it to DEBUG level we risk to miss it during regular dn operation. May be we should postpone creating the dn block map until the upgrade is done. It is the same thins as with the bscanner, before the upgrade is finished block information might not be correct from the new software point of view.

Raghu Angadi added a comment - 09/Jul/08 08:44 PM
> May be we should postpone creating the dn block map until the upgrade is done. It is the same thins as with the bscanner, before the upgrade is finished block information might not be correct from the new software point of view.

If FSDataset does not scan the directories, datanode won't know what to upgrade. So some where FSDataset should be able to understand prev layout.

For blockScanner I will attach a patch that differs the initialization.


Raghu Angadi added a comment - 09/Jul/08 11:00 PM
Patch for trunk is attached. The fix differs loading block info until the scanner thread is started. If there are any calls to addBlock() or deleteBlock() in between, these are ignored.

Konstantin Shvachko added a comment - 11/Jul/08 12:15 AM
I would use blockMap == null instead of new boolean <initialized>.
+1 on the rest.

Konstantin Shvachko added a comment - 11/Jul/08 12:19 AM
Since you are in this file could you pls "generate serial version ID" for the servlet in order to get rid of the warning.
Eclipse will do it for you.

Raghu Angadi added a comment - 11/Jul/08 12:31 AM
> I would use blockMap == null instead of new boolean <initialized>.

thats right. Updated patch uses "throttler == null" since this condition should be changed only at the end of init().

> "generate serial version ID" for the servlet
I think Hadoop's policy is not to include them.. this warning can be disabled in Eclipse.


Hadoop QA added a comment - 15/Jul/08 11:41 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12385818/HADOOP-3732.patch
against trunk revision 677054.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2863/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2863/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2863/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2863/console

This message is automatically generated.


Raghu Angadi added a comment - 16/Jul/08 01:46 AM
I just committed this.

Hudson added a comment - 22/Aug/08 12:34 PM