WAVE-302

Server is unstable due to OutOfMemoryError in Socket.IO

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Server
    • Labels:
      None

      Description

      A turbulence message is displayed after login.
      The logs show the following exception:

      java.lang.OutOfMemoryError: Direct buffer memory
      at java.nio.Bits.reserveMemory(Bits.java:656)
      at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
      at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
      at org.eclipse.jetty.io.nio.DirectNIOBuffer.<init>(DirectNIOBuffer.java:46)
      at org.eclipse.jetty.io.AbstractBuffers.newBuffer(AbstractBuffers.java:94)
      at org.eclipse.jetty.io.PooledBuffers.getBuffer(PooledBuffers.java:70)
      at org.eclipse.jetty.http.AbstractGenerator.increaseContentBufferSize(AbstractGenerator.java:187)
      at org.eclipse.jetty.server.Response.setBufferSize(Response.java:986)
      at com.glines.socketio.server.transport.jetty.JettyContinuationTransportHandler.connect(JettyContinuationTransportHandler.java:340)
      at com.glines.socketio.server.transport.AbstractHttpTransport.connect(AbstractHttpTransport.java:101)
      at com.glines.socketio.server.transport.AbstractHttpTransport.handle(AbstractHttpTransport.java:75)
      at org.waveprotocol.box.server.rpc.AbstractWaveSocketIOServlet.serve(AbstractWaveSocketIOServlet.java:164)
      at org.waveprotocol.box.server.rpc.AbstractWaveSocketIOServlet.doGet(AbstractWaveSocketIOServlet.java:117)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
      at org.waveprotocol.box.server.rpc.ServerRpcProvider$WaveSocketIOServlet.service(ServerRpcProvider.java:566)
      at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
      at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
      at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
      at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
      at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
      at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1333)
      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:487)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:520)
      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:972)
      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:417)
      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:906)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
      at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
      at org.eclipse.jetty.server.Server.handle(Server.java:346)
      at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:442)
      at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:910)
      at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:565)
      at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:217)
      at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:46)
      at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:545)
      at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:43)
      at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:598)
      at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:533)
      at java.lang.Thread.run(Thread.java:679)
      2011-11-12 03:51:05.663:DBUG:oejs.ServletHandler:[GET /socket.io/xhr-multipart/null]@1488755615 org.eclipse.jetty.server.Request@58bc9b9f
      2011-11-12 03:51:05.663:DBUG:oejs.Server:RESPONSE /socket.io/xhr-multipart/null 500
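
The top frames of the trace show the failure inside `java.nio.Bits.reserveMemory`, reached through `ByteBuffer.allocateDirect`, so the allocation that failed was for direct (off-heap) buffer memory rather than for an ordinary heap object. The following is a minimal sketch, assuming Java 7 or later, of how direct buffer usage could be observed from inside a running JVM with the standard `BufferPoolMXBean`. The class name `DirectBufferUsage` and the 1 MiB buffer size are illustrative assumptions and are not part of Wave in a Box.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of Wave in a Box.
final class DirectBufferUsage {

  /** Returns the bytes currently used by the "direct" buffer pool, or -1 if no such pool is reported. */
  static long directMemoryUsed() {
    for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) {
        return pool.getMemoryUsed();
      }
    }
    return -1L;
  }

  public static void main(String[] args) {
    long before = directMemoryUsed();
    // The same JDK call that appears in the stack trace above.
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
    long after = directMemoryUsed();
    System.out.println("direct pool before: " + before + " bytes");
    System.out.println("direct pool after:  " + after + " bytes");
    System.out.println("buffer capacity:    " + buffer.capacity() + " bytes");
  }
}
```

Logging this value periodically on a server that later fails might show whether the direct pool grows steadily or only spikes at the moment of failure. On HotSpot JVMs the size of this pool can be capped with `-XX:MaxDirectMemorySize`.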

      1. selenium-test-creation.html
        2 kB
        Vicente J. Ruiz Jurado

        Activity

        Vicente J. Ruiz Jurado added a comment -

        Thanks indeed Yuri...

        Yuri Zelikov added a comment -

        Here are the contents of run-server.sh:

        #!/bin/bash

        # This script will start the Wave in a Box server.
        #

        # Make sure the config file exists.
        if [ ! -e server.config ]; then
          echo "You need to copy server.config.example to server.config and edit it. Or run: 'ant -f server-config.xml' to generate the file automatically."
          exit 1
        fi

        # The version of Wave in a Box, extracted from the build.properties file
        WAVEINABOX_VERSION=`sed "s/\\t=\\t/=/g" build.properties | grep ^waveinabox.version= | cut -f2 -d=`

        . process-script-args.sh

        exec java $DEBUG_FLAGS \
          -server \
          -XX:ErrorFile=/var/wave/fatalerror.log \
          -XX:+HeapDumpOnOutOfMemoryError \
          -Dorg.eclipse.jetty.util.log.DEBUG=true \
          -Djava.security.auth.login.config=jaas.config \
          -Dwave.server.config=server.config \
          -Xms5096M -Xmx5096M \
          -Xss256K \
          -jar dist/waveinabox-server-$WAVEINABOX_VERSION.jar
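
Since this thread attributes the failure to incorrect JVM arguments, it may help to log what the running JVM actually received, rather than relying on what the start script is believed to pass. The sketch below uses only standard JDK calls; the class name `JvmSettingsReport` is a hypothetical helper and not part of Wave in a Box.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration; not part of Wave in a Box.
final class JvmSettingsReport {

  /** JVM options as seen by the running process (for example -Xmx, -Xss, -XX:... flags). */
  static List<String> inputArguments() {
    return ManagementFactory.getRuntimeMXBean().getInputArguments();
  }

  /** Maximum heap the JVM will attempt to use, in bytes. */
  static long maxHeapBytes() {
    return Runtime.getRuntime().maxMemory();
  }

  /** Builds a short, human-readable report suitable for a startup log line. */
  static String describe() {
    StringBuilder report = new StringBuilder();
    report.append("max heap: ").append(maxHeapBytes() / (1024 * 1024)).append(" MiB\n");
    for (String argument : inputArguments()) {
      report.append("jvm arg: ").append(argument).append('\n');
    }
    return report.toString();
  }

  public static void main(String[] args) {
    System.out.print(describe());
  }
}
```

Printing this report once at server startup would make it easier to compare settings between servers when one of them runs out of memory and another does not.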

        Vicente J. Ruiz Jurado added a comment -

        Hi Yuri,

        I'm getting this same error these days. Which JVM args are you using in waveinabox?

        TIA,

        Vicente J. Ruiz Jurado added a comment -

        It would probably be useful to share this kind of tuning information in a wiki page (like the ulimits info, etc.), in a FAQ, or in a troubleshooting page.

        Yuri Zelikov added a comment -

        It was actually caused by incorrect JVM arguments I used to run waveinabox.net.

        Vicente J. Ruiz Jurado added a comment -

        Anyway, I have found WIAB very unstable these days (more than usual).

        I think that websocket/continuation is a source of problems, which is why I'm trying to upgrade it. But there is something more.

        I also suspect that adding links to a wave causes problems. For instance, try to reproduce this:
        1) Create a wave (in waveinabox.net)
        2) Add a link to it
        3) Try to open that wave again, with the same client or with another one

        This gives me "No conversations in this wave", or sometimes
        Client delta expressed against non-server version. Server version: 28, client delta: 23

        I have to investigate more... but if I add waves with plain text (as I did via Selenium), I have no problems. The problem arises with special content such as links. Maybe the client throws an exception and we don't see it in the error panel.

        Yuri Zelikov added a comment -

        Thanks for investigating the issue. Well, if the issue is with WaveMap, then it's expected, as it holds all the wavelet snapshots in memory (and until recently also all the deltas). Also, I think there's some issue with Continuation: it seems that the thread that should return the search response with the index of waves constantly gets suspended, which causes an additional memory leak. As for me, it turned out that by tuning the JVM memory settings for running waveinabox.net I managed to increase the period it can work without running out of memory.

        Vicente J. Ruiz Jurado added a comment -

        More info about the memory issue. I just got an out of memory error on one server, and the dump shows only one suspect (WaveMap, with 74% of the memory).

        Problem Suspect 1

        One instance of "org.waveprotocol.box.server.waveserver.WaveMap" loaded by "java.net.URLClassLoader @ 0x7faeb2e92328" occupies 318.489.336 (74,49%) bytes. The memory is accumulated in one instance of "com.google.common.collect.CustomConcurrentHashMap$Segment[]" loaded by "java.net.URLClassLoader @ 0x7faeb2e92328".

        Keywords
        java.net.URLClassLoader @ 0x7faeb2e92328
        com.google.common.collect.CustomConcurrentHashMap$Segment[]
        org.waveprotocol.box.server.waveserver.WaveMap

        Vicente J. Ruiz Jurado added a comment -

        A simple Selenium IDE test to add waves and replies in a loop.

        If anyone is interested in using it, I use this for the loop:
        http://51elliot.blogspot.com/2008/02/selenium-ide-goto.html

        Vicente J. Ruiz Jurado added a comment -

        I've been running tests since Friday and I find WIAB very unstable. I think it is (probably) related to websocket communication, or maybe to our latest changes.

        Some logs:

        Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcher.write0(Native Method)

        Caused by: org.eclipse.jetty.io.EofException
        at org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:919)

        20-nov-2011 23:34:12 org.waveprotocol.box.server.rpc.WebSocketServerChannel sendMessageString
        WARNING: Websocket is not connected

        and I also lose my session frequently. Maybe it's just a Jetty continuation bug.

        But it is strange, because when testing with Selenium (I'll attach a simple test that creates a hundred waves via Selenium IDE with a loop extension) I can create waves without problems. It's during normal use (editing, using gadgets, links, etc.) that problems arise.

        I was also testing socket.io (sending thousands of messages with Selenium) without any problem.

        Does anyone have similar problems?

        I'll try another Jetty version to see if it's more stable.

        Vicente J. Ruiz Jurado added a comment -

        Thanks Michael for your clarification.

        I did some more tests using Selenium. Some screenshots:
        http://homes.ourproject.org/~vjrj/otros/wiab-leaks/
        and in HTML:
        http://homes.ourproject.org/~vjrj/otros/wiab-leaks/zips/
        I only see an increase in memory when I create a lot of waves (several hundred), and this is normal. After leaving the server running since last night, I don't see an increase in memory usage (see images 4 and 5).

        So my proposal is to try to get a heap dump when the out of memory error arises on one of our servers:
        http://wiki.eclipse.org/index.php/MemoryAnalyzer#Getting_a_Heap_Dump

        Or if you have other proposals, they are welcome...
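
Besides waiting for the failure itself, a heap dump can also be requested on demand from inside the JVM. The following is a minimal sketch, assuming a HotSpot-based JVM (Java 7 or later) where `com.sun.management.HotSpotDiagnosticMXBean` is available. The class name `HeapDumper` and the file naming are hypothetical and not part of Wave in a Box.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; assumes a HotSpot-based JVM.
final class HeapDumper {

  /**
   * Writes a heap dump to the given file, which must not exist yet.
   * With liveOnly set to true, only reachable objects are dumped.
   */
  static File dump(File target, boolean liveOnly) {
    HotSpotDiagnosticMXBean diagnostics =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    try {
      diagnostics.dumpHeap(target.getAbsolutePath(), liveOnly);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return target;
  }

  public static void main(String[] args) {
    // The .hprof extension is what the Eclipse Memory Analyzer expects to open.
    File target = new File(System.getProperty("java.io.tmpdir"),
        "wiab-heap-" + System.nanoTime() + ".hprof");
    System.out.println("heap dump written to " + dump(target, true));
  }
}
```

Taking one dump shortly after startup and another after the server has been running for a day would allow the two to be compared in the Memory Analyzer, which may show what is growing before the out of memory error occurs.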

        Michael MacFadden added a comment -

        I think this bug might be a bit misleading. Just because the exception mentions Socket IO doesn't mean the memory leak is in Socket IO. The only thing the exception says is that the code was in Socket IO when the JVM ran out of memory.

        The memory leak could be anywhere. For example, suppose you have ClassA and ClassB. ClassA has a memory leak in it. ClassB doesn't have a memory leak, but creates a new object and properly releases it. Now let's say that ClassA is being called every so often and is slowly eating up all the memory. Now assume that ClassA has eaten up ALMOST all of the JVM's memory, but there is some left. Now we call ClassB and it tries to grab memory for an object, but there isn't enough left. The OutOfMemoryError would now be thrown from within ClassB, even though that is not where the leak is.

        Socket IO is a part of the code that is executed pretty much all the time, so it's possible that it is the most frequent allocator of memory. This could raise the likelihood that, if there were a leak somewhere else, it would manifest itself in the Socket IO code.

        Of course the leak could be in Socket IO itself, but the exception doesn't directly give you any information on where the leak might be.
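
The ClassA/ClassB scenario above can be sketched as a small simulation. To keep it deterministic, the sketch does not exhaust real JVM memory; instead a hypothetical `Budget` class with a fixed capacity stands in for the JVM's memory, and an `IllegalStateException` stands in for the `OutOfMemoryError`. All names and sizes are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only; Budget stands in for JVM memory.
final class LeakBlameDemo {

  /** A fixed-size budget; reserving beyond the capacity fails in whoever asked last. */
  static final class Budget {
    private final long capacity;
    private long used;

    Budget(long capacity) {
      this.capacity = capacity;
    }

    void reserve(long bytes, String caller) {
      if (used + bytes > capacity) {
        throw new IllegalStateException("out of budget in " + caller);
      }
      used += bytes;
    }

    void release(long bytes) {
      used -= bytes;
    }
  }

  /** ClassA reserves a little on every call and never releases it: this is the leak. */
  static final class ClassA {
    private final List<Long> retained = new ArrayList<>();

    void call(Budget budget) {
      budget.reserve(10, "ClassA");
      retained.add(10L);
    }
  }

  /** ClassB reserves a larger amount but always releases it: no leak here. */
  static final class ClassB {
    void call(Budget budget) {
      budget.reserve(25, "ClassB");
      try {
        // ... use the temporary object ...
      } finally {
        budget.release(25);
      }
    }
  }

  /** Alternates calls to ClassA and ClassB until a reservation fails; returns the failure message. */
  static String runUntilFailure() {
    Budget budget = new Budget(100);
    ClassA leaking = new ClassA();
    ClassB wellBehaved = new ClassB();
    try {
      while (true) {
        leaking.call(budget);
        wellBehaved.call(budget);
      }
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(runUntilFailure()); // prints "out of budget in ClassB"
  }
}
```

After eight rounds ClassA holds 80 of the 100 units, so the next 25-unit request from ClassB is the one that fails, even though ClassB releases everything it takes.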

        Vicente J. Ruiz Jurado added a comment -

        I've been testing socket-io-java with the Eclipse Memory Analyzer + Selenium IDE (with a loop of hundreds of requests) but I didn't find any leaks.

        I'll try the same with WIAB.

        Vicente J. Ruiz Jurado added a comment -

        OK, I'll try to play with the Eclipse Memory Analyzer to find the memory leak
        http://www.eclipse.org/mat/
        Any other recommendations are welcome...

        Yuri Zelikov added a comment -

        It's not something related to a specific wave. It happens only after a while, usually after a day without a restart. Basically the error means that the JVM is out of heap memory due to a memory leak.

        Vicente J. Ruiz Jurado added a comment -

        Strange. I don't find this exception on any of the (dev/testing/semi-prod) servers that we use.

        Is it a big wave? Because on our servers all the waves are small.

        Do you have a wave URL that gives you that error, or is it something that happens with all your waves?

        Some info, maybe related:
        http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4879883

        Yuri Zelikov added a comment -

        Hi Vicente, the exception seems to be related to Socket.IO. Maybe you can take a look at it?


          People

          • Assignee:
            Yuri Zelikov
            Reporter:
            Yuri Zelikov