FLINK-5383

TaskManager fails with SIGBUS when loading RocksDB

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: None
    • Labels: None

    Description

      While trying out Flink 1.2, my TaskManager died with the following error during job deployment:

      2016-12-21 15:57:50,080 INFO  org.apache.flink.runtime.taskmanager.Task                     - Map -> Sink: Unnamed (15/16) (50f527e4445479fb1fc9f34394d86d2f) switched from DEPLOYING to RUNNING.
      2016-12-21 15:57:50,081 INFO  org.apache.flink.runtime.taskmanager.Task                     - Map -> Sink: Unnamed (16/16) (b4b3d3340de587d729fe83d65eac3e10) switched from DEPLOYING to RUNNING.
      2016-12-21 15:57:50,081 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask           - Using user-defined state backend: RocksDB State Backend {isInitialized=false, configuredDbBasePaths=null, initializedDbBasePaths=null, checkpointStreamBackend=File State Backend @ hdfs://nameservice1/shared/checkpoint-dir-rocks}.
      2016-12-21 15:57:50,081 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask           - Using user-defined state backend: RocksDB State Backend {isInitialized=false, configuredDbBasePaths=null, initializedDbBasePaths=null, checkpointStreamBackend=File State Backend @ hdfs://nameservice1/shared/checkpoint-dir-rocks}.
      2016-12-21 15:57:50,223 INFO  org.apache.flink.contrib.streaming.state.RocksDBStateBackend  - Attempting to load RocksDB native library and store it at '/yarn/nm/usercache/longrunning/appcache/application_1482156101125_0016'
      
      LogType:taskmanager.out
      Log Upload Time:Wed Dec 21 16:00:35 +0000 2016
      LogLength:959
      Log Contents:
      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      #  SIGBUS (0x7) at pc=0x00007fe745fd596a, pid=7414, tid=140630801725184
      #
      # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
      # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # C  [ld-linux-x86-64.so.2+0x1a96a]  realloc+0x2bfa
      #
      

      The error report file contained the following frames:

      Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
      j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
      j  java.lang.ClassLoader.loadLibrary1(Ljava/lang/Class;Ljava/io/File;)Z+302
      j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+2
      j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+48
      j  java.lang.Runtime.load0(Ljava/lang/Class;Ljava/lang/String;)V+57
      j  java.lang.System.load(Ljava/lang/String;)V+7
      j  org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(Ljava/lang/String;)V+14
      j  org.rocksdb.NativeLibraryLoader.loadLibrary(Ljava/lang/String;)V+22
      j  org.apache.flink.contrib.streaming.state.RocksDBStateBackend.ensureRocksDBIsLoaded(Ljava/lang/String;)V+62
      j  org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(Lorg/apache/flink/runtime/execution/Environment;Lorg/apache/flink/api/common/JobID;Ljava/lang/String;Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;Lorg/apache/flink/runtime/query/TaskKvStateRegistry;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+16
      j  org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+137
      

      I have seen this error only once so far. I'll report again if it happens more frequently.
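The log above shows the native library being stored directly in the application-level directory, and the crash occurs inside `System.load`. One possible explanation, which this report does not confirm, is that several processes sharing that directory write and load the same library file at the same time. Under that assumption, a minimal sketch of one mitigation is to extract the library into a uniquely named subdirectory per load attempt. The class name `UniqueNativeLibDir`, the method `extractToUniqueDir`, and the `native-lib-` prefix are hypothetical and are not taken from Flink or RocksDB.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Hypothetical helper, for illustration only; not Flink or RocksDB code.
public class UniqueNativeLibDir {

    /**
     * Copies the library bytes into a fresh, uniquely named subdirectory of
     * baseDir and returns the path of the extracted file. Because every call
     * gets its own directory, two callers never write to the same file.
     */
    public static Path extractToUniqueDir(Path baseDir, String fileName, InputStream libraryBytes)
            throws IOException {
        // A random UUID keeps directory names distinct across processes.
        Path uniqueDir = baseDir.resolve("native-lib-" + UUID.randomUUID());
        Files.createDirectories(uniqueDir);
        Path target = uniqueDir.resolve(fileName);
        Files.copy(libraryBytes, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the shared application directory seen in the log.
        Path base = Files.createTempDirectory("appcache-demo");
        byte[] fakeLibrary = {1, 2, 3, 4}; // placeholder bytes, not a real shared object

        Path first = extractToUniqueDir(base, "libexample.so", new ByteArrayInputStream(fakeLibrary));
        Path second = extractToUniqueDir(base, "libexample.so", new ByteArrayInputStream(fakeLibrary));

        // The two extractions land in different directories.
        System.out.println(first);
        System.out.println(second);
        // A real caller would now pass first.toAbsolutePath().toString() to System.load(...).
    }
}
```

The sketch only covers the extraction step. It does not load anything, and whether a shared extraction path is the actual cause of the SIGBUS here remains an open question.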

          People

            Assignee: Stephan Ewen (sewen)
            Reporter: Robert Metzger (rmetzger)