Key: HADOOP-1224
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Unassigned
Reporter: Konstantin Shvachko
Votes: 0
Watchers: 0
Hadoop Common

"Browse the filesystem" link pointing to a dead data-node

Created: 06/Apr/07 10:51 PM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: 0.12.3
Fix Version/s: 0.13.0

Time Tracking:
Not Specified

File Attachments (text files, licensed for inclusion in ASF works):
DFSBrowsingDeadNode_v1.0.patch, 2007-04-13 12:59 PM, Enis Soztutar, 1 kB
DFSBrowsingDeadNode_v1.1.patch, 2007-05-02 08:40 AM, Enis Soztutar, 1 kB
DFSBrowsingDeadNode_v1.2.patch, 2007-05-04 08:53 AM, Enis Soztutar, 0.7 kB

Resolution Date: 07/May/07 08:59 PM


Description
On the NameNode status web page, the "Browse the filesystem" link can point to a dead data-node.
The reason is that FSNamesystem.randomDataNode() selects a random node from the
list of all nodes rather than selecting among live nodes only.

Comments
Enis Soztutar added a comment - 13/Apr/07 12:59 PM
This patch
1. changes randomDataNode() so that it skips dead nodes and decommissioned nodes. It starts with a random data node and checks the data nodes sequentially until a live node is found.
2. changes the return type of getDatanodeByIndex() from DatanodeInfo to DatanodeDescriptor.
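
For illustration, the selection strategy described in item 1 might look roughly like the sketch below. This is not the code from the attached patch: the class `LiveNodePicker`, the `Node` record and the method `pickLiveNode` are hypothetical stand-ins, and the real code works on DatanodeDescriptor objects inside FSNamesystem.

```java
import java.util.List;
import java.util.Random;

public class LiveNodePicker {
    /** Hypothetical stand-in for a data-node record; not the real DatanodeDescriptor. */
    public static final class Node {
        final String host;
        final int infoPort;
        final boolean dead;
        final boolean decommissioned;

        public Node(String host, int infoPort, boolean dead, boolean decommissioned) {
            this.host = host;
            this.infoPort = infoPort;
            this.dead = dead;
            this.decommissioned = decommissioned;
        }

        /** A node is usable for browsing only if it is neither dead nor decommissioned. */
        boolean isUsable() {
            return !dead && !decommissioned;
        }
    }

    /**
     * Starts at a random index and probes the list sequentially (wrapping around)
     * until a usable node is found. Returns "host:port", or null if no node is usable.
     */
    public static String pickLiveNode(List<Node> nodes, Random rnd) {
        int n = nodes.size();
        if (n == 0) {
            return null;
        }
        int start = rnd.nextInt(n);
        for (int i = 0; i < n; i++) {
            Node d = nodes.get((start + i) % n);
            if (d != null && d.isUsable()) {
                return d.host + ":" + d.infoPort;
            }
        }
        return null; // every node was dead or decommissioned
    }
}
```

Note that sequential probing from a random start is not uniform over live nodes: a live node that follows a run of dead nodes is chosen more often. For picking a node to browse from, that bias is probably acceptable.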


dhruba borthakur added a comment - 13/Apr/07 06:54 PM
+1. Code looks good.

Tom White added a comment - 17/Apr/07 08:36 AM
I've just committed this. Thanks Enis!

Hadoop QA added a comment - 17/Apr/07 11:21 AM

Konstantin Shvachko added a comment - 02/May/07 07:31 AM
Now it's even worse, we select ONLY! dead nodes.
    if (d != null && !d.isDecommissioned() && isDatanodeDead(d) &&
        !d.isDecommissionInProgress()) {
      return d.getHost() + ":" + d.getInfoPort();
Did anybody ever actually try to click the link?

Enis Soztutar added a comment - 02/May/07 08:40 AM
This patch applies to current trunk (534354). It fixes the bug, a [forgotten!] negation of the isDataNodeDead() check, and introduces a function isDataNodeLive().


Tom White added a comment - 02/May/07 12:55 PM
Thanks Enis. Have you manually tested this latest patch as Konstantin suggests?

Enis Soztutar added a comment - 03/May/07 07:40 AM
Finally, I was able to test the patch.
I manually set up a cluster with two DNs and one NN.
After intentionally killing one DN, or decommissioning one DN, browsing worked as expected. Sorry for the previously untested version.

Konstantin Shvachko added a comment - 03/May/07 05:47 PM
It is confusing if a data-node can be neither dead nor alive. In your patch
isDatanodeDead() =/= ! isDatanodeLive()
The patch should merely add "!" imo.

Tom White added a comment - 03/May/07 10:14 PM
> The patch should merely add "!" imo.

+1


Enis Soztutar added a comment - 04/May/07 07:50 AM
The confusing thing here, IMO, is that the admin state of a datanode can be NORMAL, DECOMMISSIONED or DECOMMISSION_IN_PROGRESS, and if the admin state is NORMAL, the node can be either dead or "not dead". So a data node, from the perspective of the end user, can be in one of four states: live, dead, decommissioned or decommission_in_progress. Thus isDatanodeDead() =/= ! isDatanodeLive().
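
To make the argument concrete, here is a small sketch of the four user-visible states. The enum `UserState` and the helpers `isDead` and `isLive` are hypothetical names for illustration only; they are not the FSNamesystem methods.

```java
public class NodeStateDemo {
    /** Hypothetical user-facing view of a data node, following the four states named above. */
    enum UserState { LIVE, DEAD, DECOMMISSIONED, DECOMMISSION_IN_PROGRESS }

    /** Dead means exactly the DEAD state. */
    static boolean isDead(UserState s) {
        return s == UserState.DEAD;
    }

    /** Live means exactly the LIVE state, which is narrower than "not dead". */
    static boolean isLive(UserState s) {
        return s == UserState.LIVE;
    }

    public static void main(String[] args) {
        // For the two decommission states, isLive and !isDead disagree.
        for (UserState s : UserState.values()) {
            System.out.println(s + ": isLive=" + isLive(s) + ", !isDead=" + !isDead(s));
        }
    }
}
```

Under this model a decommissioned node is neither dead nor live, which is why the two predicates are not simple negations of each other.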

Enis Soztutar added a comment - 04/May/07 08:53 AM
One-line patch that adds "!".

Konstantin Shvachko added a comment - 04/May/07 08:03 PM
+1

Doug Cutting added a comment - 07/May/07 08:59 PM
I just committed this. Thanks, Enis!

Hadoop QA added a comment - 08/May/07 11:24 AM