  Apache Storm / STORM-969

HDFS Bolt can end up in an unrecoverable state

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: storm-hdfs
    • Labels:
      None

      Description

      The body of the HdfsBolt.execute() method is essentially one try-catch block. The catch block reports the error and fails the current tuple. In some cases the bolt's FSDataOutputStream object (named 'out') is left in an unrecoverable state, and no subsequent call to execute() can succeed.

      To reproduce this scenario:

      • process some tuples through HDFS bolt
      • put the underlying HDFS system into safemode
      • process some more tuples and observe the expected ClosedChannelException
      • take the underlying HDFS system out of safemode
      • subsequent tuples continue to fail with the same exception
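
      The safemode toggle can also be driven programmatically for testing. The sketch below is illustrative only; it assumes the Hadoop 2.x MiniDFSCluster test harness (which the PR below also uses for its unit tests) and is not part of this patch:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

public class SafeModeRepro {
    public static void main(String[] args) throws Exception {
        // In-process HDFS, as used by the MiniDFSCluster unit tests mentioned in the PR.
        MiniDFSCluster cluster = new MiniDFSCluster.Builder(new Configuration()).build();
        DistributedFileSystem fs = cluster.getFileSystem();
        try {
            // While the namenode is in safemode, the bolt's writes fail with IOExceptions;
            // before this fix, the bolt's output stream never recovered afterwards.
            fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
            // ... process some tuples through the topology and watch them fail ...
            fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
            // ... before the fix, tuples still fail here with ClosedChannelException ...
        } finally {
            cluster.shutdown();
        }
    }
}
```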

      The three fundamental operations that execute() performs (writing, syncing, and rotating) need to be isolated so that errors from each can be handled specifically.
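
      For context, the pre-fix shape of the method (condensed from the diff quoted in the review comments below) is a single try-catch; once `out` is broken, every later call lands in the same catch block:

```java
// Condensed pre-fix HdfsBolt.execute(): one try-catch around write, sync, and rotate.
@Override
public void execute(Tuple tuple) {
    try {
        byte[] bytes = this.format.format(tuple);
        synchronized (this.writeLock) {
            out.write(bytes);                     // once the stream is dead, this always throws
            this.offset += bytes.length;
            if (this.syncPolicy.mark(tuple, this.offset)) {
                this.out.hsync();                 // (hsync with UPDATE_LENGTH when the stream is an HdfsDataOutputStream)
                this.syncPolicy.reset();
            }
        }
        this.collector.ack(tuple);
        if (this.rotationPolicy.mark(tuple, this.offset)) {
            rotateOutputFile();                   // only reached when everything above succeeded
            this.offset = 0;
            this.rotationPolicy.reset();
        }
    } catch (IOException e) {
        this.collector.reportError(e);            // reports and fails the tuple, but never repairs `out`
        this.collector.fail(tuple);
    }
}
```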

        Issue Links

          Activity

          dossett@gmail.com Aaron Dossett added a comment -

          I should have a PR for this shortly.

          githubbot ASF GitHub Bot added a comment -

          GitHub user dossett opened a pull request:

          https://github.com/apache/storm/pull/664

          STORM-969: HDFS Bolt can end up in an unrecoverable state

          A few notes about this PR:

          • I updated the storm-hdfs pom.xml to align with other external modules. The most significant change was probably going from hdfs version 2.2 to ${hadoop.version} (i.e. currently 2.6).

          • Many errors are recovered from by forcing a file rotation, which opens a new, valid file. So rotation now occurs either according to the rotation policy or when a serious error happens. Work could probably be done to reopen the same file name to reduce the number of rotations.
          • Added unit tests with MiniDFSCluster

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/dossett/storm STORM-969

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/storm/pull/664.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #664


          commit 795aaf93af78bf664727b91c179e0d96f673f674
          Author: Aaron Dossett <aaron.dossett@target.com>
          Date: 2015-08-02T22:22:51Z

          STORM-969: HDFS Bolt can end up in an unrecoverable state


          githubbot ASF GitHub Bot added a comment -

          Github user harshach commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-131588005

          @arunmahadevan since you've made changes to HdfsBolt as part of STORM-837, can you take a look at this PR and check if it's already handled by your changes?

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-131602837

          @harshach the changes I made were in the trident implementation (HDFSState), which is independent of this. Anyway, I reviewed the changes.

          Overall it appears that the exceptions are now handled individually at write, sync and rotate phases. But I see a few issues with the change.

          • the tuples are acked only on sync - if the tuple rate is low, they will never be acked, will keep timing out, and the same tuples will be replayed again and again.
          • there is an attempt to sync even when the write fails with an IOException. Since write already threw an IOException, chances are high that the sync would also fail with IOException.

          I think it might be simpler to keep the existing logic and just rotate the file whenever we see an IOException (or maybe after a few times we repeatedly hit the IOException) and completely fail by propagating the exception up if the situation does not improve after a few rotations.

          Also I see that the existing implementation acks the tuples before actually syncing them to disk, which might result in data loss. I think we should change this to ack only after an hsync and have a sync policy that considers both count and time based thresholds.
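
          As a rough illustration of the count-plus-time idea (not part of this PR; the class name is made up), a policy written against the storm-hdfs SyncPolicy interface used by the bolt code quoted in this thread could trigger on whichever threshold is hit first:

```java
import org.apache.storm.hdfs.bolt.sync.SyncPolicy;
import org.apache.storm.tuple.Tuple;

// Hypothetical policy: sync when either a tuple-count or an elapsed-time threshold is reached,
// so low-rate streams still get synced (and their tuples acked) before the message timeout.
public class CountOrTimeSyncPolicy implements SyncPolicy {
    private final int countThreshold;
    private final long intervalMillis;
    private int count = 0;
    private long lastSync = System.currentTimeMillis();

    public CountOrTimeSyncPolicy(int countThreshold, long intervalMillis) {
        this.countThreshold = countThreshold;
        this.intervalMillis = intervalMillis;
    }

    @Override
    public boolean mark(Tuple tuple, long offset) {
        count++;
        return count >= countThreshold
                || System.currentTimeMillis() - lastSync >= intervalMillis;
    }

    @Override
    public void reset() {
        count = 0;
        lastSync = System.currentTimeMillis();
    }
}
```

          Note that a time-based policy only helps if execute() keeps getting called, which is exactly what the tick tuples discussed below provide.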

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-131648086

          @arunmahadevan Thank you for the feedback! I have added a tick tuple feature to address your first point (I am already using this locally, I forgot to include it in this PR).

          You are right that I attempt to sync even when the write fails. I agree that this is unlikely to succeed, but it seems worthwhile to try to "save" as many tuples as possible.

          With this tick tuple change I think the code is very close to your recommendation. If the bolt is in a bad state the tick tuples will periodically try to rotate the file and all errors will be reported up through ```this.collector.reportError(e)```
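
          For readers unfamiliar with tick tuples: a bolt opts into them through its component configuration and then distinguishes them in execute(). A minimal, generic sketch (the class and method names here are illustrative, not the PR's code):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.utils.TupleUtils;

public abstract class TickingBolt extends BaseRichBolt {
    private final int tickIntervalSecs;

    protected TickingBolt(int tickIntervalSecs) {
        this.tickIntervalSecs = tickIntervalSecs;
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to send this bolt a tick tuple every tickIntervalSecs seconds.
        Map<String, Object> conf = new HashMap<>();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, tickIntervalSecs);
        return conf;
    }

    @Override
    public void execute(Tuple tuple) {
        if (TupleUtils.isTick(tuple)) {
            onTick();            // e.g. force a flush/sync even when no data tuples arrive
        } else {
            onDataTuple(tuple);
        }
    }

    protected abstract void onTick();
    protected abstract void onDataTuple(Tuple tuple);
}
```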

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37160568

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

@@ -80,6 +86,11 @@ public HdfsBolt addRotationAction(RotationAction action)
         return this;
     }

+    public HdfsBolt withTickTupleIntervalSeconds(int interval) {

          — End diff —

          Could give a more meaningful name to convey the actual usage (e.g. `withFlushIntervalSeconds`). I think we need to have a default value for this to ensure a sync is `always` done even if the user doesn't specify this option, and also ensure the option value is within some lower and upper thresholds so that tuples are acked within TOPOLOGY_MESSAGE_TIMEOUT_SECS and a sync doesn't happen too frequently.
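
          A small sketch of the default-plus-clamping idea (the constants and helper name are illustrative, not what the PR implements):

```java
// Clamp a user-supplied flush interval: never disable flushing, never flush absurdly often,
// and stay comfortably under the topology message timeout so tuples are acked in time.
private static final int MIN_FLUSH_INTERVAL_SECS = 1;
private static final int DEFAULT_FLUSH_INTERVAL_SECS = 15;

static int clampFlushIntervalSecs(Integer requested, int messageTimeoutSecs) {
    int interval = (requested == null) ? DEFAULT_FLUSH_INTERVAL_SECS : requested;
    interval = Math.max(interval, MIN_FLUSH_INTERVAL_SECS);
    return Math.min(interval, messageTimeoutSecs / 2);   // ack well within TOPOLOGY_MESSAGE_TIMEOUT_SECS
}
```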

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37160668

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

@@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector

     @Override
     public void execute(Tuple tuple) {
-        try {
-            byte[] bytes = this.format.format(tuple);
-            synchronized (this.writeLock) {
-                out.write(bytes);
-                this.offset += bytes.length;
-
-                if (this.syncPolicy.mark(tuple, this.offset)) {
-                    if (this.out instanceof HdfsDataOutputStream) {
-                        ((HdfsDataOutputStream) this.out).hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));
-                    } else {
-                        this.out.hsync();
-                    }
-                    this.syncPolicy.reset();
-                }
-            }
-            this.collector.ack(tuple);
+        boolean forceRotate = false;
+        synchronized (this.writeLock) {
+            boolean forceSync = false;
+            if (TupleUtils.isTick(tuple)) {
+                LOG.debug("TICK! forcing a file system flush");
+                forceSync = true;
+            } else {
+                try {
+                    writeAndAddTuple(tuple);
+                } catch (IOException e) {
+                    //If the write failed, try to sync anything already written
+                    LOG.info("Tuple failed to write, forcing a flush of existing data.");
+                    this.collector.reportError(e);
+                    forceSync = true;
+                    this.collector.fail(tuple);
+                }
+            }

+            if (this.syncPolicy.mark(tuple, this.offset) || forceSync) {
+                try {
+                    syncAndAckTuples();
+                } catch (IOException e) {
+                    LOG.warn("Data could not be synced to filesystem, failing this batch of tuples");
+                    this.collector.reportError(e);
+                    //Force rotation to get a new file handle
+                    forceRotate = true;
+                    for (Tuple t : tupleBatch)
+                        this.collector.fail(t);
+                    tupleBatch.clear();
+                }
+            }
+        }

-            if(this.rotationPolicy.mark(tuple, this.offset)){
-                rotateOutputFile(); // synchronized
-                this.offset = 0;
-                this.rotationPolicy.reset();
+        if(this.rotationPolicy.mark(tuple, this.offset) || forceRotate) {
+            try {
+                rotateAndReset();
+            } catch (IOException e) {
+                this.collector.reportError(e);
+                LOG.warn("File could not be rotated");
+                //At this point there is nothing to do. In all likelihood any filesystem operations will fail.
+                //The next tuple will almost certainly fail to write and/or sync, which force a rotation. That
+                //will give rotateAndReset() a chance to work which includes creating a fresh file handle.
             }
-        } catch (IOException e) {
-            this.collector.reportError(e);
-            this.collector.fail(tuple);
         }
     }

+    private void rotateAndReset() throws IOException {
+        rotateOutputFile(); // synchronized
+        this.offset = 0;
+        this.rotationPolicy.reset();
+    }
+
+    private void syncAndAckTuples() throws IOException {

          — End diff —

          An optimization could be to do the sync only if some tuples were actually written out since the last sync.

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37161325

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

          @@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector
          (the same execute() hunk quoted in full earlier in this thread; this review comment is anchored at the line below)

+    private void rotateAndReset() throws IOException {

          — End diff —

          What would be the behavior if the file system goes into an irrecoverable state where writes are continuously failing? Won't it end up creating a lot of empty files? You might want to consider cleaning this up and also killing the task by throwing a runtime exception after 'n' failures, in the hope that the new worker might be able to start writing successfully.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37241329

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

@@ -80,6 +86,11 @@ public HdfsBolt addRotationAction(RotationAction action)
         return this;
     }

+    public HdfsBolt withTickTupleIntervalSeconds(int interval) {

          — End diff —

          I like that change since it separates the function from the implementation. A similar change could be made to HiveBolt, to which I also added tick tuple functionality recently. If this PR is otherwise OK, I'd like to open a new JIRA to change this bolt and HiveBolt at the same time.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37241514

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

          @@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector
          (the same execute() hunk quoted in full earlier in this thread; this review comment is anchored at the line below)

+    private void rotateAndReset() throws IOException {

          — End diff —

          I suppose it's possible that empty files would be created, but if a filesystem is in such a bad state I would expect creating new files to fail as well.

          Killing the task after enough failures is an interesting idea... what default value for # of attempts would you consider reasonable?

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37242296

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

          @@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector
          (the same execute() hunk quoted in full earlier in this thread; this review comment is anchored at the line below)

+    private void syncAndAckTuples() throws IOException {

          — End diff —

          Agreed, just pushed a commit to do that.

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37263240

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

@@ -80,6 +86,11 @@ public HdfsBolt addRotationAction(RotationAction action)
         return this;
     }

+    public HdfsBolt withTickTupleIntervalSeconds(int interval) {

          — End diff —

          Assuming that would address the default, min, and max values for the interval, it should be fine.

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r37263411

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

          @@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector
          (the same execute() hunk quoted in full earlier in this thread; this review comment is anchored at the line below)

+    private void rotateAndReset() throws IOException {

          — End diff —

          You could have a low default (say 10 consecutive failures or so) and also add an option for users to change this. @harshach can you recommend a default number of retries after which the task can exit, so that Storm would spawn the task in another worker where it might succeed?

          githubbot ASF GitHub Bot added a comment -

          Github user harshach commented on a diff in the pull request:

          https://github.com/apache/storm/pull/664#discussion_r38272576

          — Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java —

          @@ -88,35 +99,94 @@ public void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector
          (the same execute() hunk quoted in full earlier in this thread; this review comment is anchored at the line below)

+    private void rotateAndReset() throws IOException {

          — End diff —

          @dossett
          "I suppose it's possible that empty files would be created, but if a filesystem is in such a bad state I would expect creating new files to fail as well."
          This might bring down the namenode with too many files.

          As @arunmahadevan pointed out, we should have a retryCount. I would start with 3 failed attempts and a configurable option.
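
          One way the retry-count suggestion could be realized (sketch only; the field and option names are hypothetical and not what the PR implements):

```java
// Track consecutive rotation failures inside the bolt; after maxConsecutiveFailures,
// throw so the worker dies and Storm reschedules the task, instead of endlessly
// creating new (possibly empty) files on a broken filesystem.
private int consecutiveRotationFailures = 0;
private int maxConsecutiveFailures = 3;   // suggested default, user-configurable

private void tryRotate() {
    try {
        rotateAndReset();                 // the PR's rotate-and-reset helper
        consecutiveRotationFailures = 0;
    } catch (IOException e) {
        this.collector.reportError(e);
        if (++consecutiveRotationFailures >= maxConsecutiveFailures) {
            throw new RuntimeException("Giving up after "
                    + consecutiveRotationFailures + " failed file rotations", e);
        }
    }
}
```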

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-136433305

          Thank you @harshach and @arunmahadevan, I will incorporate this feedback soon.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-136499556

          @harshach @arunmahadevan Changes made based on feedback, PR also rebased and squashed to one commit. Additional feedback welcome, thank you!

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-136775368

          +1, overall the patch looks good once the [comment](https://github.com/apache/storm/pull/664#commitcomment-13003095) is addressed.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-136876170

          @arunmahadevan Our approach is to set the tick tuple frequency to be half of the message timeout setting for the topology. Can the bolt get access to that topology setting? prepare() passes a TopologyContext to the bolt, but I don't see a way to get configuration information out of it. However, I am not that familiar with TopologyContext.

          If that's not possible, the bolt could just set a flush frequency of 15 seconds, since the Storm timeout default is 30 seconds, and assume that anyone changing the timeout setting should also adjust the flush frequency.

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-136889084

          The value of `topology.message.timeout.secs` can be read from the `conf` that gets passed as the first parameter to `doPrepare`
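
          A minimal sketch of that lookup (the helper name is illustrative), defaulting to half the timeout as discussed above and falling back to 15 seconds if the key is missing:

```java
import java.util.Map;
import org.apache.storm.Config;

// Derive the flush/tick interval from the topology's message timeout, read out of the
// conf Map that Storm passes to prepare()/doPrepare().
static int defaultFlushIntervalSecs(Map conf) {
    Object timeout = conf.get(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS);
    return (timeout instanceof Number) ? ((Number) timeout).intValue() / 2 : 15;
}
```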

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-137064748

          @arunmahadevan So it can! PR updated.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-138607186

          @arunmahadevan @harshach Do you have any further comments about my updated PR? Thanks!

          githubbot ASF GitHub Bot added a comment -

          Github user arunmahadevan commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-138639789

          Looks good to me.

          githubbot ASF GitHub Bot added a comment -

          Github user dossett commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-143341056

          @harshach (or other committers) do you have feedback about this PR?

          Anecdotally, this has been very useful to us in production. We had an HDFS restart, which created the exact situation I tested with (failed writes due to HDFS safe mode) but the bolt recovered without a topology restart.

          githubbot ASF GitHub Bot added a comment -

          Github user harshach commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-143884369

          +1

          githubbot ASF GitHub Bot added a comment -

          Github user harshach commented on the pull request:

          https://github.com/apache/storm/pull/664#issuecomment-143894338

          Thanks @dossett, merged into master.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/storm/pull/664


            People

            • Assignee: dossett Aaron Dossett
            • Reporter: dossett@gmail.com Aaron Dossett
            • Votes: 0
            • Watchers: 3
