Apache Jena / JENA-181

Fuseki starts producing 500 errors if rapidly sent a sequence of queries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: Fuseki 0.2.1
    • Fix Version/s: None
    • Component/s: Fuseki
    • Labels:
      None
    • Environment:

      Mac OS X Lion

      Description

      It is fairly trivial to make Fuseki respond to queries with a "500 : Direct buffer memory" error: simply send it a sequence of queries with no delay between them. Even with a short delay between queries (e.g. 0.5 seconds), Fuseki typically gets into this state at a similar point.

      Attached is a simple test case which fires SELECT * WHERE { } queries at a local Fuseki instance. For me this reliably fails on the 25th iteration. Turning on --debug and --verbose for Fuseki and modifying the log4j.properties file to set DEBUG level for everything didn't show anything particularly useful on the command line, so I have no idea what the cause may be beyond something related to java.nio.HeapByteBuffer.
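For readers without the attachment, the general shape of such a query loop can be sketched with the JDK's own HTTP client. This is not the attached test case: the endpoint URL `http://localhost:3030/ds/query`, the iteration count, and the class and method names are assumptions for illustration. The sketch drains and closes each response before sending the next query.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SelectLoop {
    // Assumed endpoint; adjust to the dataset your Fuseki instance serves.
    static final String ENDPOINT = "http://localhost:3030/ds/query";

    /** Builds a SPARQL protocol GET URL with the query text form-encoded. */
    static String queryUrl(String endpoint, String sparql) {
        return endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }

    /** Sends one query, drains the response body, and returns the HTTP status. */
    static int runOnce(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept", "application/sparql-results+xml");
        int status = conn.getResponseCode();
        InputStream body = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (body != null) {
            try {
                byte[] buf = new byte[4096];
                while (body.read(buf) != -1) {
                    // discard the payload; only the status matters here
                }
            } finally {
                body.close(); // release the connection before the next query
            }
        }
        return status;
    }

    public static void main(String[] args) throws IOException {
        String url = queryUrl(ENDPOINT, "SELECT * WHERE { }");
        for (int i = 1; i <= 100; i++) {
            System.out.println(i + ": HTTP " + runOnce(url));
        }
    }
}
```

Running `main` against a local server prints one status line per iteration, which makes the iteration at which a 500 first appears easy to spot.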

        Activity

        Rob Vesse added a comment -

        Simple test case which reliably reproduces this issue

        Rob Vesse added a comment -

        Just to add a bit of mystery to this - running the same code with an ASK/CONSTRUCT/DESCRIBE query will not hit this problem, so this must be a bug somewhere in the code related to SELECT queries (or at least in the code that serializes SPARQL Result Sets)

        Andy Seaborne added a comment - edited

        Have you tried setting "-XX:MaxDirectMemorySize" to a value larger than 64K (the default?)? I don't think this can be done in the Jetty config file.

        Googling for Jetty and "Direct buffer memory" turns up various reports.
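As background to the suggestion above: "Direct buffer memory" refers to off-heap buffers allocated with `ByteBuffer.allocateDirect`, and `-XX:MaxDirectMemorySize` is passed on the `java` command line (for example `-XX:MaxDirectMemorySize=256m`, where the value is purely illustrative). The following is a minimal sketch, not Fuseki or Jetty code, of how one might inspect direct buffer usage from inside a JVM using the standard `BufferPoolMXBean`; the class and method names are hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectMemoryProbe {
    /** Returns the total capacity in bytes of the JVM's "direct" buffer pool, or -1 if not found. */
    static long directPoolCapacity() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getTotalCapacity();
            }
        }
        return -1;
    }

    /** Allocates one direct buffer and reports the pool capacity while it is still reachable. */
    static long capacityWhileHolding(int bytes) {
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes); // off-heap allocation
        long capacity = directPoolCapacity();
        direct.put(0, (byte) 1); // keep the buffer reachable until after the measurement
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println("heap buffer is direct?   " + ByteBuffer.allocate(1024).isDirect());
        System.out.println("direct buffer is direct? " + ByteBuffer.allocateDirect(1024).isDirect());
        System.out.println("direct pool bytes while holding 1 MiB: " + capacityWhileHolding(1 << 20));
    }
}
```

Polling `directPoolCapacity()` in the server process while the failing test runs would show whether direct buffer usage climbs with each query.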

        Rob Vesse added a comment -

        Hmm, looks like I made a dumb error: there is no close() call on the query executor in my test code, so each connection is left open until the executor gets garbage-collected on the client. It looks like the server has no way of telling that a client has finished with the connection, and so doesn't free it up without that explicit close() call.

        The fact that this only happens for SELECT queries when close() is not called suggests to me that only SELECT results get parsed in a streaming fashion. Looking at XMLInputStAX and XMLInputSAX, it looks like it might be possible to automatically call close() on the source QueryExecution at the point where no further results are found.

        There would need to be some refactoring to allow an optional QueryExecution to be passed in, on which close() would be called at the appropriate juncture. Would this be a patch you guys would be willing to incorporate?
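The idea of closing the owner once no further results are found can be sketched in plain Java. This is not Jena code and not the proposed patch: `AutoClosingIterator` and `AutoClosingDemo` are hypothetical names, and a generic `Closeable` stands in for the QueryExecution. The wrapper closes its owner the first time `hasNext()` reports the end of the rows, and `close()` is idempotent so a caller's explicit close still works.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Iterator that closes an owning resource once hasNext() first reports no more rows. */
class AutoClosingIterator<T> implements Iterator<T>, Closeable {
    private final Iterator<T> rows;
    private final Closeable owner; // stand-in for the QueryExecution; may be null
    private boolean closed = false;

    AutoClosingIterator(Iterator<T> rows, Closeable owner) {
        this.rows = rows;
        this.owner = owner;
    }

    @Override
    public boolean hasNext() {
        if (closed) return false;
        if (rows.hasNext()) return true;
        close(); // logical end of results: release the owner now
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return rows.next();
    }

    /** Idempotent, so callers can still close explicitly in a finally block. */
    @Override
    public void close() {
        if (closed) return;
        closed = true;
        if (owner != null) {
            try {
                owner.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

class AutoClosingDemo {
    /** Drains a three-row result and returns how many times the owner was closed. */
    static int closesAfterDrain() {
        final int[] closes = {0};
        AutoClosingIterator<String> it =
            new AutoClosingIterator<>(Arrays.asList("a", "b", "c").iterator(), () -> closes[0]++);
        while (it.hasNext()) {
            it.next();
        }
        it.close(); // explicit close after exhaustion is a no-op
        return closes[0];
    }

    public static void main(String[] args) {
        System.out.println("owner closed " + closesAfterDrain() + " time(s)");
    }
}
```

Automatic closing only helps when the results are read to the end. A client that abandons a result set early still needs the explicit close() call, ideally in a finally block.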

        Andy Seaborne added a comment -

        A patch would be great - for all input forms, once the logical end is seen, the code can and should call .close().

        Even for a stream passed in to be read, the state of the stream is undefined after the parser code has done its stuff (read-ahead buffering if nothing else). I guess that keeping the HTTP/TCP connection open delays recycling resources in the Jetty server.

        Andy Seaborne added a comment -

        ("not a problem" isn't the ideal label but the best of the bunch)

        Andy Seaborne added a comment -

        I've looked at the Fuseki side. The code flushes the servlet output stream. I'm unclear as to whether the server code should call .close(), leave the stream alone, or whether it makes no difference. Jetty is responsible for managing the HTTP connection, sending the headers and then the body. It may then do some connection caching.

        Andy Seaborne added a comment -

        Checking the Jetty source code, it looks like closing the implementation of ServletOutputStream is not going to affect any HTTP connection resources like direct memory in any way. .close() flushes the stream and sets a flag, nothing else.

        package org.eclipse.jetty.server

        HttpConnection$Output extends HttpOutput


          People

          • Assignee: Andy Seaborne
          • Reporter: Rob Vesse
          • Votes: 0
          • Watchers: 0
