Spark / SPARK-15042

ConnectedComponents fails to compute a graph with 200 vertices (but long paths)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: GraphX
    • Labels: None
    • Environment: Local cluster (1 instance) running on Arch Linux;
      Scala 2.11.7, Java 1.8.0_92

    Description

      ConnectedComponents takes forever and eventually fails with an OutOfMemoryError when computing this graph (the 200 edges form a single path, so the graph's diameter is 200):

      { (i, i+1) | i <- { 1..200 } }

      If you generate the example graph, e.g., with this bash command

      for i in {1..200} ; do echo "$i $(($i+1))" ; done > input.graph
      

      ... you should then be able to reproduce the problem in the spark-shell by running:

      import org.apache.spark.graphx._
      import org.apache.spark.graphx.lib._
      val graph = GraphLoader.edgeListFile(sc, "input.graph").cache()
      
      ConnectedComponents.run(graph)
      
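      For reference, the same chain graph can also be built directly in the shell, with no intermediate file. This is a hedged sketch: Graph.fromEdgeTuples is the public GraphX constructor for raw edge pairs, and the vertex attribute value is arbitrary here, since only the structure matters.

      import org.apache.spark.graphx._
      import org.apache.spark.graphx.lib._

      // Build the 200-edge chain (i, i+1) for i in 1..200 in memory.
      val chainEdges = sc.parallelize((1L to 200L).map(i => (i, i + 1)))
      val chainGraph = Graph.fromEdgeTuples(chainEdges, defaultValue = 1).cache()

      // Expected to hang/fail the same way as the file-based version.
      ConnectedComponents.run(chainGraph)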

      The ConnectedComponents.run call seems to take forever, and from time to time it emits warnings like this:

      16/04/30 20:06:24 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@7af98fbd,BlockManagerId(driver, localhost, 43440))] in 1 attempts
      
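      The timeout itself can be pushed out so the executor is not dropped while the driver is busy: spark.executor.heartbeatInterval is the property named in the warning above, and spark.network.timeout is the related network timeout in 1.6. A hedged sketch for a standalone driver program (in the spark-shell the same properties can be passed with --conf); note this only masks the symptom rather than fixing the underlying blow-up:

      import org.apache.spark.{SparkConf, SparkContext}

      // Values are illustrative; spark.network.timeout should stay well above
      // spark.executor.heartbeatInterval.
      val conf = new SparkConf()
        .setAppName("cc-repro")
        .set("spark.executor.heartbeatInterval", "60s")
        .set("spark.network.timeout", "600s")
      val sc = new SparkContext(conf)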

      For additional information, here is a link to my related question on Stack Overflow:
      http://stackoverflow.com/q/36892272/783510

      One comment so far was that the number of skipped tasks grows exponentially.

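      That observation fits the usual failure mode of iterative GraphX jobs: each Pregel superstep extends the RDD lineage, and on a path of diameter 200 the minimum label needs roughly 200 supersteps to propagate end to end, so the plan keeps growing until the driver stalls. As a hedged workaround sketch (not the library's implementation; the checkpoint directory and the every-10-rounds cadence are assumptions), the min-label propagation can be run by hand on the `graph` defined above, with periodic checkpoints to truncate the lineage:

      import org.apache.spark.graphx._

      sc.setCheckpointDir("/tmp/cc-checkpoints")  // assumed writable path

      // Every vertex starts with its own id as its component label.
      var g = graph.mapVertices((id, _) => id)
      var active = 1L
      var round = 0
      while (active > 0) {
        // Each edge proposes the smaller endpoint label to the larger one.
        val msgs = g.aggregateMessages[VertexId](
          ctx => {
            if (ctx.srcAttr < ctx.dstAttr) ctx.sendToDst(ctx.srcAttr)
            else if (ctx.dstAttr < ctx.srcAttr) ctx.sendToSrc(ctx.dstAttr)
          },
          math.min)
        active = msgs.count()  // zero once all labels have converged
        g = g.joinVertices(msgs)((_, old, proposed) => math.min(old, proposed)).cache()
        if (round % 10 == 0) {
          g.checkpoint()                       // cut the lineage so plans stay small
          g.vertices.count(); g.edges.count()  // force materialization
        }
        round += 1
      }
      g.vertices.collect()  // (vertexId, componentLabel) pairs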
      Here is the complete output of a spark-shell session:

      phil@terra-arch:~/tmp/spark-graph$ spark-shell 
      log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
      log4j:WARN Please initialize the log4j system properly.
      log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
      Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
      To adjust logging level use sc.setLogLevel("INFO")
      Spark context available as sc.
      SQL context available as sqlContext.
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
            /_/
               
      Using Scala version 2.11.7 (OpenJDK 64-Bit Server VM, Java 1.8.0_92)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> import org.apache.spark.graphx._
      import org.apache.spark.graphx._
      
      scala> import org.apache.spark.graphx.lib._
      import org.apache.spark.graphx.lib._
      
      scala> 
      
      scala> val graph = GraphLoader.edgeListFile(sc, "input.graph").cache()
      graph: org.apache.spark.graphx.Graph[Int,Int] = org.apache.spark.graphx.impl.GraphImpl@1fa9692b
      
      scala> ConnectedComponents.run(graph)
      16/04/30 20:05:29 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@50432fd2,BlockManagerId(driver, localhost, 43440))] in 1 attempts
      org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
      	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
      	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
      	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
      	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
      	at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
      	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
      	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
      	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
      	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
      	at scala.concurrent.Await$.result(package.scala:190)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
      	... 14 more
      16/04/30 20:06:24 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@7af98fbd,BlockManagerId(driver, localhost, 43440))] in 1 attempts
      org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
      	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
      	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
      	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
      	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
      	at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
      	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
      	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
      	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
      	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
      	at scala.concurrent.Await$.result(package.scala:190)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
      	... 14 more
      16/04/30 20:13:00 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@7af98fbd,BlockManagerId(driver, localhost, 43440))] in 2 attempts
      org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
      	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
      	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
      	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
      	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
      	at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
      	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
      	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
      	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
      	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
      	at scala.concurrent.Await$.result(package.scala:190)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
      	... 14 more
      16/04/30 20:13:30 WARN HeartbeatReceiver: Removing executor driver with no recent heartbeats: 145068 ms exceeds timeout 120000 ms
      16/04/30 20:24:46 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@7af98fbd,BlockManagerId(driver, localhost, 43440))] in 3 attempts
      org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
      	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
      	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
      	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
      	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
      	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
      	at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
      	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
      	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
      	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
      	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
      	at scala.concurrent.Await$.result(package.scala:190)
      	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
      	... 14 more
      

            People

              Assignee: Unassigned
              Reporter: Philipp Claßen
              Votes: 0
              Watchers: 6
