  1. Hadoop Common
  2. HADOOP-11400

GraphiteSink does not reconnect to Graphite after 'broken pipe'

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.1, 2.6.0
    • Fix Version/s: 2.7.0
    • Component/s: metrics
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      After a network error, GraphiteSink does not reconnect to the Graphite server, and as a result metrics are no longer sent.

      Here is the stack trace I see (this is from the NodeManager):

      2014-12-11 16:39:21,655 ERROR org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Got sink exception, retry in 4806ms
      org.apache.hadoop.metrics2.MetricsException: Error flushing metrics
      at org.apache.hadoop.metrics2.sink.GraphiteSinkFixed.flush(GraphiteSinkFixed.java:120)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:184)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:43)
      at org.apache.hadoop.metrics2.impl.SinkQueue.consumeAll(SinkQueue.java:87)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.publishMetricsFromQueue(MetricsSinkAdapter.java:129)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter$1.run(MetricsSinkAdapter.java:88)
      Caused by: java.net.SocketException: Broken pipe
      at java.net.SocketOutputStream.socketWrite0(Native Method)
      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
      at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
      at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
      at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
      at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
      at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
      at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
      at org.apache.hadoop.metrics2.sink.GraphiteSinkFixed.flush(GraphiteSinkFixed.java:118)
      ... 5 more
      2014-12-11 16:39:26,463 ERROR org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Got sink exception and over retry limit, suppressing further error messages
      org.apache.hadoop.metrics2.MetricsException: Error flushing metrics
      at org.apache.hadoop.metrics2.sink.GraphiteSinkFixed.flush(GraphiteSinkFixed.java:120)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:184)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:43)
      at org.apache.hadoop.metrics2.impl.SinkQueue.consumeAll(SinkQueue.java:87)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.publishMetricsFromQueue(MetricsSinkAdapter.java:129)
      at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter$1.run(MetricsSinkAdapter.java:88)
      Caused by: java.net.SocketException: Broken pipe
      at java.net.SocketOutputStream.socketWrite0(Native Method)
      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
      at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
      at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
      at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
      at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
      at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
      at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
      at org.apache.hadoop.metrics2.sink.GraphiteSinkFixed.flush(GraphiteSinkFixed.java:118)
      ... 5 more

      GraphiteSinkFixed.java is simply GraphiteSink.java from Hadoop 2.6.0 with the fix for https://issues.apache.org/jira/browse/HADOOP-11182 applied. I use this copy because I cannot simply upgrade Hadoop (I am using CDH5).

      GraphiteSink uses an OutputStreamWriter that is created only in the init method (which is probably called only once per application run), and there is no reconnection logic. Once the socket breaks, every later flush fails on the same dead connection.
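
      To illustrate the kind of reconnection logic that appears to be missing, here is a minimal sketch. It is not the attached patch and not the actual GraphiteSink code; the class name ReconnectingGraphiteWriter and its methods are hypothetical. The idea is to open the socket lazily and to drop it on any I/O error, so that the next write opens a fresh connection instead of reusing the broken one.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Hadoop): reopens its socket after a failure. */
class ReconnectingGraphiteWriter implements Closeable {
    private final String host;
    private final int port;
    private Socket socket; // null while disconnected
    private Writer writer; // null while disconnected

    ReconnectingGraphiteWriter(String host, int port) {
        this.host = host;
        this.port = port;
    }

    boolean isConnected() {
        return writer != null;
    }

    /** Opens a new connection; leaves the object disconnected if this fails. */
    private void connect() throws IOException {
        Socket s = new Socket(host, port);
        try {
            writer = new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8);
            socket = s;
        } catch (IOException e) {
            s.close();
            throw e;
        }
    }

    /** Connects lazily, so a write after a failure triggers a reconnect. */
    void write(String line) throws IOException {
        if (writer == null) {
            connect();
        }
        try {
            writer.write(line);
        } catch (IOException e) {
            close(); // drop the broken connection; the next write reconnects
            throw e;
        }
    }

    void flush() throws IOException {
        if (writer == null) {
            return; // nothing buffered on a closed connection
        }
        try {
            writer.flush();
        } catch (IOException e) {
            close(); // e.g. "Broken pipe": forget this socket
            throw e;
        }
    }

    /** Closes quietly and resets state so the object can be reused. */
    @Override
    public void close() {
        try {
            if (writer != null) writer.close();
        } catch (IOException ignored) {
            // the stream is already unusable
        }
        try {
            if (socket != null) socket.close();
        } catch (IOException ignored) {
            // nothing more to do
        }
        writer = null;
        socket = null;
    }
}
```

      With something like this, the sink could still rethrow the failure as a MetricsException so that MetricsSinkAdapter schedules its retry, but the retry would get a new socket. Metrics buffered at the time of the failure are lost in this sketch.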

      Attachments

        1. HADOOP-11400.patch
          15 kB
          Kamil Gorlo

          People

            Assignee: Kamil Gorlo (kgs)
            Reporter: Kamil Gorlo (kgs)