Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 1.1.0
Fix Version/s: None
Environment: Linux Red Hat 6.4 on Spark 1.1.0
Description
Hi all,
I've built Spark 1.1.0 with sbt, with Ganglia enabled and Hadoop version 2.4.0.
No issues there: Spark works fine on Hadoop 2.4.0, and Ganglia (GraphiteSink) is installed.
I've added the following to metrics.properties:
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=HOSTNAME
*.sink.graphite.port=8649
*.sink.graphite.period=1
*.sink.graphite.prefix=aa
and I get this error message:
14/07/31 05:39:00 WARN graphite.GraphiteReporter: Unable to report to Graphite
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at java.io.BufferedWriter.flush(BufferedWriter.java:254)
at com.codahale.metrics.graphite.Graphite.send(Graphite.java:77)
at com.codahale.metrics.graphite.GraphiteReporter.reportGauge(GraphiteReporter.java:254)
at com.codahale.metrics.graphite.GraphiteReporter.report(GraphiteReporter.java:156)
at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:107)
at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:86)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
From looking at the code, I see the following:
val graphite: Graphite = new Graphite(new InetSocketAddress(host, port))
val reporter: GraphiteReporter = GraphiteReporter.forRegistry(registry)
  .convertDurationsTo(TimeUnit.MILLISECONDS)
  .convertRatesTo(TimeUnit.SECONDS)
  .prefixedWith(prefix)
  .build(graphite)
https://github.com/apache/spark/blob/87bd1f9ef7d547ee54a8a83214b45462e0751efb/core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala#L69
Followed by
override def start() { reporter.start(pollPeriod, pollUnit) }
I noticed that the error occurs when we first try to send a message, but nowhere do I see graphite.connect() being called. It seems to fail in the send function:
https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/Graphite.java#L77
With "this.writer" not initialized, the "writer.write" call will fail.
The GraphiteReporter builder doesn't call connect() either when creating the "reporter" object.
https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/GraphiteReporter.java#L113
Maybe I'm looking in the wrong area or passing in the wrong values, but with so little logging I suspect it is a bug.
EDIT:
Found out where connect() gets called:
https://github.com/dropwizard/metrics/blob/master/metrics-graphite/src/main/java/com/codahale/metrics/graphite/GraphiteReporter.java#L153
and this is called from here
which is called from here
but the issue still stands. :/
Edit 2:
My ports are open and listening:
[root@rtr-dev-spark4 ~]# lsof -i :8649
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
gmond 32173 ganglia 5u IPv4 3480253 0t0 UDP rtr-dev-spark4.ord2012:8649
gmond 32173 ganglia 6u IPv4 3480255 0t0 TCP rtr-dev-spark4.ord2012:8649 (LISTEN)
gmond 32173 ganglia 7u IPv4 3480257 0t0 UDP rtr-dev-spark4.ord2012:55523->rtr-dev-spark4.ord2012:8649
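A listening port only shows that something accepts connections; in the lsof output above, that process is gmond. To check, outside of Spark, whether the endpoint accepts a Graphite plaintext line ("path value timestamp"), here is a minimal sketch using only java.net. GraphiteProbe, formatLine, and sendOnce are hypothetical names made up for this test and are not part of Spark or the metrics library. The connect, write, flush, close sequence is meant to roughly mirror what GraphiteReporter.report appears to do on each poll; that is my reading of the linked source, not something I have confirmed.

```scala
import java.io.{BufferedWriter, OutputStreamWriter}
import java.net.{InetSocketAddress, Socket}
import java.nio.charset.StandardCharsets

// Hypothetical helper for manual testing; not part of Spark or dropwizard metrics.
object GraphiteProbe {

  // Builds one line in Graphite's plaintext format: "<path> <value> <timestamp>\n".
  def formatLine(name: String, value: String, epochSeconds: Long): String =
    s"$name $value $epochSeconds\n"

  // Opens a TCP connection, writes a single line, flushes, and closes.
  // Any IOException (for example "Broken pipe") propagates to the caller.
  def sendOnce(host: String, port: Int, line: String, timeoutMs: Int = 2000): Unit = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      val writer = new BufferedWriter(
        new OutputStreamWriter(socket.getOutputStream, StandardCharsets.UTF_8))
      writer.write(line)
      writer.flush()
    } finally {
      socket.close()
    }
  }
}
```

For example, GraphiteProbe.sendOnce("HOSTNAME", 8649, GraphiteProbe.formatLine("aa.test", "1", System.currentTimeMillis() / 1000)) uses the host, port, and prefix from the metrics.properties above. If this also fails, the problem would more likely be with the endpoint than with Spark's GraphiteSink. A single short write may succeed even when the peer closes the connection right away, so a clean run is not conclusive on its own.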
Regards
Steve