FLUME-3190

Flume shutdown hook issue when both HBase and HDFS sinks are in use

Details

    • Bug
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • 1.6.0
    • None
    • None
    • None

    Description

      When both the HDFS and HBase sinks are in use, the HDFS sink cannot rename or close its .tmp HDFS file during shutdown (kill with SIGTERM), because the underlying filesystem may already have been closed while the other component was shutting down:

      2017/10/23 15:34:50,858 ERROR (AbstractHDFSWriter.hflushOrSync:268) - Error while trying to hflushOrSync!
      2017/10/23 15:34:50,859 WARN (BucketWriter.close:400) - failed to close() HDFSWriter for file (/tmp/bothSource/FlumeData.1508744083526.tmp). Exception follows.
      java.io.IOException: Filesystem closed
              at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:860)
              at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2388)
              at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2334)
              at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.flume.sink.hdfs.AbstractHDFSWriter.hflushOrSync(AbstractHDFSWriter.java:265)
              at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:134)
              at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:327)
              at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:323)
              at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
              at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
              at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:698)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      

      The root cause is the HBase client's DynamicClassLoader (see DynamicClassLoader.java in HBase). At some point HBase added a feature that loads JARs from HDFS dynamically into its class loader, and to do this it obtains a DistributedFileSystem object through the standard FileSystem.get(…) call or an equivalent.
      Flume's HDFS BucketWriter also uses FileSystem.get(…), so all of its writers share a single instance that comes from the cache, but it supplies a setting that disables automatic close at shutdown (look for fs.automatic.close in BucketWriter.java).
      When the HBase sink is active, HBase indirectly shares that same FileSystem object through its internal, implicit DynamicClassLoader, but it takes the object from the cache without requesting 'do not auto-close at shutdown', because HBase is not affected by the auto-close. The same FileSystem instance is therefore shared by one component that wants it closed automatically and another that does not, so at shutdown it can be closed before the HDFS sink has flushed and renamed its .tmp file, which produces the error above.
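
      The sharing described above can be illustrated with a small, self-contained toy model. This is only a sketch and does not use Hadoop: ToyFileSystem, ToyFsCache, SharedFsDemo and the URI are hypothetical names invented for illustration. It also assumes that the HBase side asks the cache first and that the cache records the auto-close preference only when it creates the instance; neither assumption is confirmed by this report.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a cached file system handle (hypothetical, not a Hadoop class).
class ToyFileSystem {
    private boolean open = true;

    void close() {
        open = false;
    }

    // Fails the same way the stack trace does once the handle has been closed.
    void hflush() throws IOException {
        if (!open) {
            throw new IOException("Filesystem closed");
        }
    }
}

// Toy cache: one shared instance per URI.
// Assumption of this sketch: only the caller that creates the instance
// decides whether it is closed automatically at shutdown.
class ToyFsCache {
    private final Map<String, ToyFileSystem> cache = new HashMap<>();
    private final Set<String> toAutoClose = new HashSet<>();

    ToyFileSystem get(String uri, boolean automaticClose) {
        ToyFileSystem fs = cache.get(uri);
        if (fs == null) {
            fs = new ToyFileSystem();
            cache.put(uri, fs);
            if (automaticClose) {
                toAutoClose.add(uri);
            }
        }
        // A later caller's automaticClose flag is ignored: the instance already exists.
        return fs;
    }

    // Stands in for the JVM shutdown hook that closes auto-close instances.
    void runShutdownHook() {
        for (String uri : toAutoClose) {
            cache.get(uri).close();
        }
    }
}

class SharedFsDemo {
    // Returns true if the HDFS sink can still flush after the shutdown hook has run.
    static boolean flushSucceedsAtShutdown(boolean hbaseSinkActive) {
        ToyFsCache cache = new ToyFsCache();
        String uri = "hdfs://namenode:8020"; // hypothetical URI

        if (hbaseSinkActive) {
            // HBase side: takes the instance with the default (auto-close) behaviour.
            cache.get(uri, true);
        }
        // Flume HDFS sink side: asks for "do not auto-close at shutdown".
        ToyFileSystem flumeFs = cache.get(uri, false);

        cache.runShutdownHook();
        try {
            flumeFs.hflush();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HDFS sink only: " + flushSucceedsAtShutdown(false));
        System.out.println("HBase and HDFS sinks: " + flushSucceedsAtShutdown(true));
    }
}
```

      In this model the flush succeeds when only the HDFS sink uses the cache and fails with "Filesystem closed" once a second user has caused the shared instance to be registered for automatic close.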


            People

              mcsanady Miklos Csanady
              yxzhang Yuexin Zhang