[JENA-178] SPARQL Results serialization is slow for some formats with large result sets - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: ARQ 2.9.0
Fix Version/s: ARQ 2.9.1
Component/s: ARQ
Labels: None
Environment: Windows 7 Enterprise 64 bit

Description

The SPARQL XML and JSON Result formats are very slow when the result set is large. This is surprising to me since both formats are relatively simple and should lend themselves to fairly fast streaming serialization and parsing.

The following are observed performance figures comparing the SPARQL XML, SPARQL JSON and SPARQL TSV results formats. Each figure is the time, averaged over 5 runs, to retrieve the first 50,000 triples from the dataset with a simple SELECT * WHERE { ?s ?p ?o } LIMIT 50000 via an HTTP request to Fuseki and iterate over the results on the client.

SPARQL XML = 15.25 seconds
SPARQL JSON = 10.9 seconds
SPARQL TSV = 0.54 seconds

Now obviously TSV is far simpler to serialize and parse than XML or JSON, but these serializers and parsers should not be 20-30 times slower, IMO.

Also, for comparison, note that an equivalent CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 50000 takes only about 2 seconds, and that is using RDF/XML serialization, which I would have expected to be slower because RDF/XML is more complex to generate than either SPARQL XML or JSON results. I haven't yet dived into the code in detail to investigate why this is slow, but do the Jena team have any thoughts on this?
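For anyone wanting to reproduce the comparison, a rough sketch of the measurement might look like the following. It is not the attached TestArqSerializerPerformance.java; the class name, the helper names, and the default endpoint http://localhost:3030/ds/query (a hypothetical dataset called "ds") are assumptions for illustration. It uses only the JDK HTTP client (Java 11+) and drains the response without parsing it, so it covers server-side serialization and transfer but not the client-side parsing that the figures above also include.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SparqlFormatTimer {
    // Query taken from the report above.
    static final String QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 50000";
    // Registered media types for the three result formats being compared.
    static final String[] FORMATS = {
        "application/sparql-results+xml",
        "application/sparql-results+json",
        "text/tab-separated-values"
    };
    static final int RUNS = 5;

    // Builds a SPARQL protocol GET URL with the query form-encoded.
    public static URI queryUri(String endpoint, String query) {
        return URI.create(endpoint + "?query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8));
    }

    // Sends one request and returns the nanoseconds taken to read the whole body.
    public static long timeOnce(HttpClient client, URI uri, String accept)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", accept)
                .GET()
                .build();
        long start = System.nanoTime();
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        try (InputStream in = response.body()) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Discard the bytes; only the elapsed time matters here.
            }
        }
        return System.nanoTime() - start;
    }

    // Averages nanosecond samples and converts the result to milliseconds.
    public static double averageMillis(long[] nanos) {
        long total = 0;
        for (long n : nanos) {
            total += n;
        }
        return total / (double) nanos.length / 1_000_000.0;
    }

    public static void main(String[] args) throws Exception {
        // Assumed endpoint; pass the real query URL as the first argument.
        String endpoint = args.length > 0 ? args[0] : "http://localhost:3030/ds/query";
        HttpClient client = HttpClient.newHttpClient();
        URI uri = queryUri(endpoint, QUERY);
        for (String accept : FORMATS) {
            long[] samples = new long[RUNS];
            for (int i = 0; i < RUNS; i++) {
                samples[i] = timeOnce(client, uri, accept);
            }
            System.out.printf("%s: %.1f ms%n", accept, averageMillis(samples));
        }
    }
}
```

If the gap between formats shows up here as well, the server-side writers are the likely bottleneck; if it only appears once the results are parsed on the client, the readers are the place to look.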

Attachments

- TestArqSerializerPerformance.java, 14/Dec/11 00:03, 5 kB, Rob Vesse
- Jena178.java, 14/Dec/11 14:38, 6 kB, Damian Steer
- XMLOutputSAX.java, 16/Dec/11 00:57, 9 kB, Stephen Allen
- XMLOutputStAX.java, 16/Dec/11 00:57, 7 kB, Stephen Allen
- Jena178.patch, 21/Dec/11 18:48, 4 kB, Damian Steer

People

Assignee: Damian Steer

Reporter: Rob Vesse

Votes: 0

Watchers: 0

Dates

Created: 13/Dec/11 23:12

Updated: 27/Dec/11 12:40

Resolved: 23/Dec/11 21:26