Description
When running concurrent POST updates and queries against a Fuseki/TDB server, the server appears to leak memory until the heap is exhausted and the JVM dies with:
java.lang.OutOfMemoryError: GC overhead limit exceeded
Using the included TDB config file, sample data file, and Groovy script, the Fuseki/TDB server can consistently be knocked down. The script runs four concurrent threads: one that repeatedly POSTs data (in separate contexts/graphs) and three that query the server for triple counts.
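For readers without the attachment, the load pattern described above (one writer, three readers) can be sketched roughly as follows. This is not the attached Groovy script, only an approximation in Java 11+. The dataset name "ds", the service names "data" and "query", the graph URI scheme, the file name sample-data.ttl, and the upper bound of 50 contexts are all assumptions made for illustration; the real values come from config-test.ttl and the attached files.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class FusekiLoadSketch {
    // ASSUMPTIONS: dataset "ds", services "data"/"query", and the data file
    // name are hypothetical; config-test.ttl defines the real ones.
    static final String BASE = "http://localhost:3030/ds";
    static final Path DATA = Path.of("sample-data.ttl");

    interface Post { void run(int context) throws Exception; }
    interface Query { void run() throws Exception; }

    /** Runs one writer and N readers; returns {contexts added, successful queries}. */
    static int[] runWorkload(int contexts, int queryThreads, Post post, Query query) {
        AtomicInteger added = new AtomicInteger();
        AtomicInteger queried = new AtomicInteger();
        AtomicBoolean stop = new AtomicBoolean(false);
        ExecutorService pool = Executors.newFixedThreadPool(queryThreads + 1);

        // Writer: adds one context (named graph) after another.
        pool.execute(() -> {
            try {
                for (int i = 1; i <= contexts && !stop.get(); i++) {
                    post.run(i);
                    System.out.println("Added context #" + added.incrementAndGet());
                }
            } catch (Exception e) {
                System.out.println("Update thread dying: " + e);
            } finally {
                stop.set(true); // finished or failed: let the readers exit
            }
        });

        // Readers: query in a loop until the writer is done or any thread fails.
        for (int t = 0; t < queryThreads; t++) {
            pool.execute(() -> {
                try {
                    while (!stop.get()) {
                        query.run();
                        queried.incrementAndGet();
                    }
                } catch (Exception e) {
                    System.out.println("Query thread dying: " + e);
                    stop.set(true);
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS); // generous cap for a manual test run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { added.get(), queried.get() };
    }

    // Sends a request and treats any non-2xx status as a failure.
    static void send(HttpClient http, HttpRequest req) throws Exception {
        HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
        if (resp.statusCode() >= 300) throw new IOException("HTTP " + resp.statusCode());
    }

    public static void main(String[] args) {
        HttpClient http = HttpClient.newHttpClient();
        String count = URLEncoder.encode(
            "SELECT (COUNT(*) AS ?c) WHERE { GRAPH ?g { ?s ?p ?o } }", StandardCharsets.UTF_8);
        int[] totals = runWorkload(50, 3, // 50 is an arbitrary upper bound
            i -> send(http, HttpRequest.newBuilder(URI.create(BASE + "/data?graph="
                    + URLEncoder.encode("http://example.org/context/" + i, StandardCharsets.UTF_8)))
                .header("Content-Type", "text/turtle")
                .POST(HttpRequest.BodyPublishers.ofFile(DATA)).build()),
            () -> send(http, HttpRequest.newBuilder(URI.create(BASE + "/query?query=" + count))
                .header("Accept", "application/sparql-results+json").GET().build()));
        System.out.println("Total contexts added: " + totals[0]);
        System.out.println("Total successful queries: " + totals[1]);
    }
}
```

The writer and the readers share a single stop flag, so the first failure on either side ends the run and the totals are printed, which roughly mirrors the stats the script reports.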
To execute the script, do the following:
- Install Groovy
- Download and install jena-fuseki-1.0.1
- Download the attached file FusekiTest.tar.gz and untar it in the jena-fuseki directory
- Edit the fuseki-server script and set the max heap size to 2G (-Xmx2G)
- Start the server with: ./fuseki-server --config=config-test.ttl
- In a separate window/shell, execute: groovy query.groovy
- Wait a few minutes for the OutOfMemoryError to occur. The script will then output some stats.
A typical run of the script will result in:
Added context #1
Added context #2
Added context #3
Added context #4
Added context #5
Added context #6
Added context #7
Added context #8
Added context #9
Query thread dying
Total contexts added: 9
Total triples added: 4500000
Total successful queries: 155
While this simple test fails consistently on OS X against a Fuseki/TDB server with a 2G heap, we've also observed the same failure on CentOS with a 16GB max heap, monitored with NewRelic. It took a lot longer, but the end result was the same: all the heap figures (overall heap, eden, survivor, and old gen) eventually converge on their maximums and the JVM fails.
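For anyone trying to observe the same convergence without NewRelic, per-pool heap usage can be read through the standard java.lang.management API. The sketch below is a minimal illustration, not part of the attached test. The class name HeapPoolReport is made up here, and the code reports on the JVM it runs in, so it would have to run inside the server process (or be adapted to a remote JMX connection) to show Fuseki's pools. Pool names vary by garbage collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPoolReport {
    /** Returns one line per heap pool (eden, survivor, old gen, ...) of the current JVM. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) continue; // skip metaspace, code cache, etc.
            MemoryUsage u = pool.getUsage();
            if (u == null) continue; // pool is no longer valid
            long max = u.getMax();   // -1 when the maximum is undefined
            String pct = max > 0
                ? String.format("%.1f%%", 100.0 * u.getUsed() / max)
                : "n/a";
            lines.add(String.format("%-28s used=%dMB max=%s (%s)",
                pool.getName(),
                u.getUsed() / (1024 * 1024),
                max > 0 ? (max / (1024 * 1024)) + "MB" : "undefined",
                pct));
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

For a server that is already running, the JDK's jstat tool (for example jstat -gcutil <pid> 5000) reports similar per-generation figures from outside the process.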
It's interesting to note that if all the contexts/graphs are added FIRST (with no concurrent queries), everything works just fine.