Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
3.1.1
None
None
Description
Hit a weird bug where writing a mkdir op to the edit log threw an NPE and the NameNode crashed:
2019-06-26 10:57:27,398 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: write op failed for (journal JournalAndStream(mgr=FileJournalManager(root=/ssd/work/src/upstream/impala/testdata/cluster/cdh6/node-1/data/dfs/nn), stream=EditLogFileOutputStream(/ssd/work/src/upstream/impala/testdata/cluster/cdh6/node-1/data/dfs/nn/current/edits_inprogress_0000000000000598588)))
java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:451)
    at org.apache.hadoop.io.Text.encode(Text.java:431)
    at org.apache.hadoop.io.Text.writeString(Text.java:491)
    at org.apache.hadoop.fs.permission.PermissionStatus.write(PermissionStatus.java:104)
    at org.apache.hadoop.fs.permission.PermissionStatus.write(PermissionStatus.java:84)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$MkdirOp.writeFields(FSEditLogOp.java:1654)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Writer.writeOp(FSEditLogOp.java:4866)
    at org.apache.hadoop.hdfs.server.namenode.EditsDoubleBuffer$TxnBuffer.writeOp(EditsDoubleBuffer.java:157)
    at org.apache.hadoop.hdfs.server.namenode.EditsDoubleBuffer.writeOp(EditsDoubleBuffer.java:60)
    at org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream.write(EditLogFileOutputStream.java:97)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$1.apply(JournalSet.java:444)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:385)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:55)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.write(JournalSet.java:440)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.doEditTransaction(FSEditLog.java:481)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$Edit.logEdit(FSEditLogAsync.java:288)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:232)
The stack trace is similar to SENTRY-555, which is thought to be a bug in Sentry (the authorization provider). This cluster doesn't have Sentry, though, so this could be a genuine HDFS bug.
Filing this JIRA to keep a record.
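For reference, the frames in the trace (MkdirOp.writeFields, then PermissionStatus.write, then Text.writeString, then Text.encode) suggest that one of the strings serialized with the permission status, possibly the user or group name, was null when the op was written. That is an inference from the trace, not a confirmed root cause. Below is a minimal JDK-only sketch of that failure mode. It does not use Hadoop classes: writeString and writePermission are hypothetical stand-ins for the Hadoop methods, and the names "impala" and "supergroup" are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class NullOwnerSketch {

    // Hypothetical stand-in for a length-prefixed UTF-8 string writer.
    // Dereferencing a null string here throws a NullPointerException.
    static void writeString(DataOutputStream out, String s) throws IOException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer bytes = encoder.encode(CharBuffer.wrap(s.toCharArray()));
        out.writeInt(bytes.limit());
        out.write(bytes.array(), 0, bytes.limit());
    }

    // Hypothetical stand-in for serializing a permission status:
    // user name, group name, then the permission bits.
    static void writePermission(DataOutputStream out, String user, String group, short mode)
            throws IOException {
        writeString(out, user);
        writeString(out, group);
        out.writeShort(mode);
    }

    // Returns true if serializing the given owner fields throws an NPE.
    static boolean throwsNpe(String user, String group) {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            writePermission(out, user, group, (short) 0755);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("valid owner: NPE=" + throwsNpe("impala", "supergroup"));
        System.out.println("null user:   NPE=" + throwsNpe(null, "supergroup"));
        System.out.println("null group:  NPE=" + throwsNpe("impala", null));
    }
}
```

If the inference holds, the open question for this issue is how a null user or group name could reach the mkdir op on a cluster without Sentry.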