[HADOOP-14855] Hadoop scripts may errantly believe a daemon is still running, preventing it from starting - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0-alpha4
Fix Version/s: 3.2.0
Component/s: scripts
Labels:
None

Target Version/s: 3.2.0
Hadoop Flags: Reviewed

Description

I recently encountered a case where the NN wouldn't start, failing with the error message "namenode is running as process 16769. Stop it first." In fact the NN was not running at all; a different, long-running process happened to be running with that pid.

It looks to me like our scripts only check whether any process is running with the pid that the NN (or any other Hadoop daemon) most recently ran with. This is clearly not a foolproof way to check whether a particular type of daemon is running, since the OS can assign the same pid to some other process after the daemon in question was shut down.
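
To illustrate the distinction described above, here is a minimal sketch of a stricter check. It is not the attached patch and not the logic of the Hadoop shell scripts. It assumes a POSIX system with a Linux-style /proc filesystem, and the function names, the `proc_root` parameter, and the marker string are hypothetical choices for this example. The idea is to treat the daemon as running only if the recorded pid is alive and that process's command line contains a string identifying the expected daemon.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if some process (any process) currently has this pid."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_cmdline(pid: int, proc_root: str = "/proc") -> str:
    """Return the command line of pid, or "" if it cannot be read."""
    try:
        raw = Path(proc_root, str(pid), "cmdline").read_bytes()
    except OSError:
        return ""
    # /proc/<pid>/cmdline separates arguments with NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def daemon_is_running(pid_file: str, marker: str, proc_root: str = "/proc") -> bool:
    """Stricter check: the pid must be alive AND look like the expected daemon.

    `marker` is a hypothetical identifying substring of the daemon's
    command line; pick whatever uniquely identifies it on your system.
    """
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (OSError, ValueError):
        return False  # missing or corrupt pid file: nothing to find
    if not pid_is_alive(pid):
        return False  # stale pid file, no process has this pid
    return marker in read_cmdline(pid, proc_root)
```

With a check like this, a stale pid file whose pid now belongs to an unrelated process would report "not running" rather than blocking startup. Matching on a command-line substring is itself only a heuristic, so the marker needs to be specific enough to avoid false positives.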

Attachments

HADOOP-14855.001.patch, 21/Mar/18 00:03, 0.8 kB, Robert Kanter
HADOOP-14855.002.patch, 21/Mar/18 17:26, 0.8 kB, Robert Kanter

Issue Links

is duplicated by: HADOOP-9085 start namenode failure,bacause pid of namenode pid file is other process pid or thread id before start namenode (Resolved)

People

Assignee: Robert Kanter
Reporter: Aaron Myers
Votes: 0
Watchers: 8

Dates

Created: 08/Sep/17 18:21
Updated: 04/Apr/18 22:56
Resolved: 04/Apr/18 22:39