HDFS Sink throws IllegalStateException when flume-daemon shuts down

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: v1.1.0
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      When using the HDFS sink, if you shut down the daemon (sudo /etc/init.d/flume-ng-node stop), an IllegalStateException appears in the log (/var/log/flume-ng/flume.log).

      2012-04-06 10:44:19,912 ERROR hdfs.HDFSEventSink: Error calling org.apache.flume.sink.hdfs.HDFSEventSink$4@32091738
      java.lang.IllegalStateException: Shutdown in progress
      at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:39)
      at java.lang.Runtime.addShutdownHook(Runtime.java:192)
      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1607)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1579)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
      at org.apache.flume.sink.hdfs.BucketWriter.renameBucket(BucketWriter.java:196)
      at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:122)
      at org.apache.flume.sink.hdfs.HDFSEventSink$4.call(HDFSEventSink.java:440)
      at org.apache.flume.sink.hdfs.HDFSEventSink$4.call(HDFSEventSink.java:436)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)
      2012-04-06 10:44:19,927 INFO source.SyslogTcpSource: Syslog TCP Source stopping...
      2012-04-06 10:44:19,927 INFO source.SyslogTcpSource: Metrics:{ name:null counters:{events.success=11002} }
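The top frames of the trace show Runtime.addShutdownHook rejecting a new hook because the JVM is already shutting down. The following is a minimal standalone sketch of that JVM behavior only; it does not use Hadoop or Flume, and the class and method names (ShutdownHookDemo, tryRegisterHook) are made up for illustration.

```java
public class ShutdownHookDemo {

    /**
     * Tries to register a no-op shutdown hook and reports whether the JVM accepted it.
     * Runtime.addShutdownHook throws IllegalStateException once shutdown has begun.
     */
    static boolean tryRegisterHook() {
        Thread hook = new Thread(() -> { });
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            // Registration succeeded, so remove the hook again to leave no residue.
            Runtime.getRuntime().removeShutdownHook(hook);
            return true;
        } catch (IllegalStateException e) {
            System.err.println("Rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // While the JVM is running normally, registration is allowed.
        System.out.println("Before shutdown, accepted: " + tryRegisterHook());

        // This hook runs while the JVM is shutting down, which is roughly the
        // situation the sink's close task is in when it reaches FileSystem.get().
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
                System.out.println("During shutdown, accepted: " + tryRegisterHook())));
    }
}
```

Any code path that registers a shutdown hook lazily, as the FileSystem cache appears to do in the trace, can hit this if it first runs after shutdown has started.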

      1. FLUME-1110.patch
        0.7 kB
        Prasad Mujumdar

          Activity

          Mike Percy added a comment -

          This was fixed a long time ago, related to disabling the shutdown hook.

          Hari Shreedharan added a comment -

          Is this still an issue?

          Brock Noland added a comment -

          I am still getting this with trunk:

          12/04/18 22:16:30 ERROR hdfs.HDFSEventSink: Error calling org.apache.flume.sink.hdfs.HDFSEventSink$4@5dd68001
          java.lang.IllegalStateException: Shutdown in progress
          	at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:39)
          	at java.lang.Runtime.addShutdownHook(Runtime.java:192)
          	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1607)
          	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1579)
          	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
          	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:111)
          	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:212)
          	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
          	at org.apache.flume.sink.hdfs.BucketWriter.renameBucket(BucketWriter.java:201)
          	at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:127)
          	at org.apache.flume.sink.hdfs.HDFSEventSink$4.call(HDFSEventSink.java:442)
          	at org.apache.flume.sink.hdfs.HDFSEventSink$4.call(HDFSEventSink.java:1)
          	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          	at java.lang.Thread.run(Thread.java:662)
          
          Hudson added a comment -

          Integrated in flume-trunk #166 (See https://builds.apache.org/job/flume-trunk/166/)
          FLUME-1110. HDFS Sink throws IllegalStateException when flume shuts down.

          (Prasad Mujumdar via Arvind Prabhakar) (Revision 1311517)

          Result = SUCCESS
          arvind : http://svn.apache.org/viewvc/?view=rev&rev=1311517
          Files :

          • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/SinkRunner.java
          Arvind Prabhakar added a comment -

          Patch committed. Thanks Prasad!

          jiraposter@reviews.apache.org added a comment -

          On 2012-04-09 22:35:44, Arvind Prabhakar wrote:

          > +1

          >

          > @Brock will keep an eye on the issue and if it resurfaces after this fix, will open a new jira.

          Prasad - please attach the patch to the Jira

          • Arvind

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4681/#review6807
          -----------------------------------------------------------

          On 2012-04-09 06:54:55, Prasad Mujumdar wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/4681/

          -----------------------------------------------------------

          (Updated 2012-04-09 06:54:55)

          Review request for Flume and Arvind Prabhakar.

          Summary

          -------

          The sink runner's stop method first calls stop() on the underlying sink and then shuts down the PollingRunner thread. If that thread is in the middle of process(), this leads to a race between the sink's process() and stop().

          Rather than making every sink handle concurrent process() and stop() calls, it's safer to shut down the runner thread first and then stop the sink.

          This addresses bug FLUME-1110.

          https://issues.apache.org/jira/browse/FLUME-1110

          Diffs

          -----

          flume-ng-core/src/main/java/org/apache/flume/SinkRunner.java e73c09b

          Diff: https://reviews.apache.org/r/4681/diff

          Testing

          -------

          full regression test run

          Thanks,

          Prasad

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4681/#review6807
          -----------------------------------------------------------

          Ship it!

          +1

          @Brock will keep an eye on the issue and if it resurfaces after this fix, will open a new jira.

          • Arvind


          jiraposter@reviews.apache.org added a comment -

          On 2012-04-09 11:10:19, Brock Noland wrote:

          > I think the change makes sense, but I am not sure if it solves the problem from the JIRA? From what I can tell about the error, it looks like HDFS is trying to add a shutdown hook after the shutdown has started.

          The roller can close the file during process(), and stop() also closes the file. It looks like two threads are trying to close the same file simultaneously.

          • Prasad
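For comparison, the per-sink alternative to the patch would be to make close() safe to call from two threads at once. The sketch below shows one common way to do that with an AtomicBoolean guard. It is not what the patch does and not Flume's BucketWriter; OnceClosingWriter, renameCount and closeFromTwoThreads are hypothetical names, and the counter merely stands in for the flush-and-rename work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseOnceDemo {

    /** Hypothetical writer whose close() does its work at most once. */
    static class OnceClosingWriter {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        final AtomicInteger renameCount = new AtomicInteger();

        void close() {
            // compareAndSet lets exactly one caller win; later callers return early.
            if (!closed.compareAndSet(false, true)) {
                return;
            }
            renameCount.incrementAndGet(); // stands in for the flush-and-rename work
        }
    }

    /** Closes one writer from two threads at once and returns how often the work ran. */
    static int closeFromTwoThreads() throws InterruptedException {
        OnceClosingWriter writer = new OnceClosingWriter();
        CountDownLatch startGate = new CountDownLatch(1);
        Runnable closer = () -> {
            try {
                startGate.await(); // hold both threads so they call close() together
                writer.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread roller = new Thread(closer, "roller");
        Thread stopper = new Thread(closer, "stopper");
        roller.start();
        stopper.start();
        startGate.countDown();
        roller.join();
        stopper.join();
        return writer.renameCount.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rename ran " + closeFromTwoThreads() + " time(s)");
    }
}
```

The drawback, as the review summary notes, is that every sink would need its own guard of this kind, which is why fixing the ordering in the runner is the less invasive change.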

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4681/#review6789
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4681/#review6789
          -----------------------------------------------------------

          I think the change makes sense, but I am not sure if it solves the problem from the JIRA? From what I can tell about the error, it looks like HDFS is trying to add a shutdown hook after the shutdown has started.

          • Brock


          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4681/
          -----------------------------------------------------------

          Review request for Flume and Arvind Prabhakar.

          Summary
          -------

          The sink runner's stop method first calls stop() on the underlying sink and then shuts down the PollingRunner thread. If that thread is in the middle of process(), this leads to a race between the sink's process() and stop().
          Rather than making every sink handle concurrent process() and stop() calls, it's safer to shut down the runner thread first and then stop the sink.

          This addresses bug FLUME-1110.
          https://issues.apache.org/jira/browse/FLUME-1110

          Diffs


          flume-ng-core/src/main/java/org/apache/flume/SinkRunner.java e73c09b

          Diff: https://reviews.apache.org/r/4681/diff

          Testing
          -------

          full regression test run

          Thanks,

          Prasad
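The ordering described in the review summary can be sketched as follows. The actual SinkRunner diff is not reproduced in this thread, so this is only an illustration of the idea under assumed names: SimpleSink, SimpleRunner and RecordingSink are hypothetical stand-ins, not Flume classes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RunnerStopOrderDemo {

    /** Hypothetical stand-in for a sink; not Flume's Sink interface. */
    interface SimpleSink {
        void process() throws InterruptedException;
        void stop();
    }

    /** Hypothetical runner that polls the sink on its own thread. */
    static class SimpleRunner {
        private final SimpleSink sink;
        private final AtomicBoolean shouldStop = new AtomicBoolean(false);
        private Thread runnerThread;

        SimpleRunner(SimpleSink sink) {
            this.sink = sink;
        }

        void start() {
            runnerThread = new Thread(() -> {
                while (!shouldStop.get()) {
                    try {
                        sink.process();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }, "runner");
            runnerThread.start();
        }

        void stop() throws InterruptedException {
            // 1. Ask the polling thread to finish and wait until it has exited.
            shouldStop.set(true);
            runnerThread.interrupt();
            runnerThread.join();
            // 2. Only now stop the sink; process() can no longer be running.
            sink.stop();
        }
    }

    /** Sink that records whether stop() ever overlapped with process(). */
    static class RecordingSink implements SimpleSink {
        final AtomicBoolean inProcess = new AtomicBoolean(false);
        final AtomicBoolean overlapped = new AtomicBoolean(false);
        final AtomicBoolean stopped = new AtomicBoolean(false);

        public void process() throws InterruptedException {
            inProcess.set(true);
            try {
                Thread.sleep(10); // simulated work
            } finally {
                inProcess.set(false);
            }
        }

        public void stop() {
            if (inProcess.get()) {
                overlapped.set(true);
            }
            stopped.set(true);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RecordingSink sink = new RecordingSink();
        SimpleRunner runner = new SimpleRunner(sink);
        runner.start();
        Thread.sleep(50);
        runner.stop();
        System.out.println("stopped=" + sink.stopped.get()
                + " overlapped=" + sink.overlapped.get());
    }
}
```

Because stop() joins the runner thread before touching the sink, individual sinks do not need their own synchronization between process() and stop().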


            People

            • Assignee:
              Prasad Mujumdar
              Reporter:
              Prasad Mujumdar
            • Votes:
              0
              Watchers:
              3
