  1. Batik
  2. BATIK-1183

Performance of <use> and <symbol>

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.13
    • Fix Version/s: None
    • Component/s: Bridge

    Description

      In ELKI, we use Batik for scatterplots.
      Marker symbols are defined once as <symbol> elements and then referenced by a <use> element at each individual location. This is nice for post-editing (because a symbol can be changed in a single place), but the performance of this approach is pretty bad, to the point where I am considering dropping Batik and trying something else.
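
      To illustrate the document structure described above, here is a minimal sketch that builds one <symbol> and one <use> per data point. It uses the plain JDK DOM rather than Batik's DOM or ELKI's actual code; the class name, the marker id "marker" and the circle shape are assumptions made for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Batik or ELKI.
final class SymbolUseScatter {
    static final String SVG_NS = "http://www.w3.org/2000/svg";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    /** Builds an SVG document with a single <symbol> and one <use> per point. */
    static Document build(double[][] points) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element svg = doc.createElementNS(SVG_NS, "svg");
        doc.appendChild(svg);

        // The marker is defined exactly once, so it can be edited in one place.
        Element defs = doc.createElementNS(SVG_NS, "defs");
        Element symbol = doc.createElementNS(SVG_NS, "symbol");
        symbol.setAttribute("id", "marker");
        symbol.setAttribute("viewBox", "-1 -1 2 2");
        Element circle = doc.createElementNS(SVG_NS, "circle");
        circle.setAttribute("r", "1");
        symbol.appendChild(circle);
        defs.appendChild(symbol);
        svg.appendChild(defs);

        // Each data point only references the marker.
        for (double[] p : points) {
            Element use = doc.createElementNS(SVG_NS, "use");
            use.setAttributeNS(XLINK_NS, "xlink:href", "#marker");
            use.setAttribute("x", String.valueOf(p[0]));
            use.setAttribute("y", String.valueOf(p[1]));
            use.setAttribute("width", "2");
            use.setAttribute("height", "2");
            svg.appendChild(use);
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = build(new double[][] {{1, 2}, {3, 4}, {5, 6}});
        int uses = doc.getElementsByTagNameNS(SVG_NS, "use").getLength();
        int symbols = doc.getElementsByTagNameNS(SVG_NS, "symbol").getLength();
        System.out.println(uses + " <use> elements share " + symbols + " <symbol>");
    }
}
```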

      When analyzing performance bottlenecks, I noticed the following:
      1. A disproportionate amount of time goes into listener list management. (Yes, I want support for dynamic changes, so I do need listeners.) It seems that several listeners are added for every <use>?
      2. String.intern is a major performance factor. I understand that strings need to be interned, but we should avoid repeating it so often.
      3. When a <symbol> is used, it gets cloned (org.apache.batik.bridge.SVGUseElementBridge#buildCompositeGraphicsNode calls 'importNode'). With thousands of <use> tags, this adds up to a substantial cost, in particular because every string is interned again for every usage.
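
      As a rough illustration of point 3, the sketch below deep-copies a small symbol once per use and counts the nodes created. It uses the JDK's DOM implementation and the standard Document.importNode(node, true) call, not Batik's bridge code; the class and method names are hypothetical, and it only shows how the number of copies scales, not actual timings.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration, not Batik code.
final class UseCloneCost {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts a node plus all of its descendant child nodes (attributes excluded). */
    static int countNodes(Node node) {
        int count = 1;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countNodes(c);
        }
        return count;
    }

    /** Deep-copies a symbol with the given number of child shapes once per use. */
    static int nodesCreated(int uses, int shapes) {
        Document source = newDocument();
        Element symbol = source.createElementNS(SVG_NS, "symbol");
        for (int i = 0; i < shapes; i++) {
            symbol.appendChild(source.createElementNS(SVG_NS, "path"));
        }
        Document target = newDocument();
        int created = 0;
        for (int i = 0; i < uses; i++) {
            // A deep import copies the whole subtree again for every use.
            Node copy = target.importNode(symbol, true);
            created += countNodes(copy);
        }
        return created;
    }

    public static void main(String[] args) {
        // 10000 uses x (1 symbol + 3 paths) = 40000 nodes
        System.out.println(nodesCreated(10000, 3));
    }
}
```

      The copy count grows with (number of uses) x (size of the symbol), and any per-node work such as interning names is repeated for each copy.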

      Attached is a file that demonstrates the performance bottleneck, in particular when interactions are enabled.

      I have tried to improve some of these things in my speedup branch:
      https://github.com/kno10/batik/tree/fixesAndSpeed

      In this branch:

      • The namespace SVGConstants.SVG_NAMESPACE_URI is recognized, and the call to String.intern() is avoided for it. This is the default namespace for SVG, and the constant already points to the interned version.
      • The custom "Hashtable" has been removed and replaced with a type-safe HashMap<> (which should actually be faster).
      • The listener list management is now much simpler (and more efficient, as some of the functionality was never used anywhere).
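
      The idea behind the first item can be sketched roughly as follows. This is not the code from the branch; the class and the method internNamespace are hypothetical, and the constant is redeclared locally instead of using SVGConstants so that the example is self-contained. It relies on the fact that compile-time string constants are already interned in Java.

```java
// Hypothetical illustration, not the actual change in the branch.
final class NamespaceInterning {
    // Stand-in for SVGConstants.SVG_NAMESPACE_URI; a compile-time constant,
    // so this reference is already the interned instance.
    static final String SVG_NAMESPACE_URI = "http://www.w3.org/2000/svg";

    /** Returns the canonical instance of a namespace URI, skipping intern() for SVG. */
    static String internNamespace(String uri) {
        if (uri == null) {
            return null;
        }
        // Fast path: same instance, or equal content, as the SVG namespace.
        if (uri == SVG_NAMESPACE_URI || SVG_NAMESPACE_URI.equals(uri)) {
            return SVG_NAMESPACE_URI;
        }
        // Any other namespace still goes through the string pool.
        return uri.intern();
    }

    public static void main(String[] args) {
        String copy = new String("http://www.w3.org/2000/svg"); // distinct instance
        System.out.println(internNamespace(copy) == SVG_NAMESPACE_URI);
    }
}
```

      Callers can keep comparing namespaces by identity, since the returned reference is the same one intern() would have produced.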

      However, I could not tackle reducing the number of listeners or the cloning, as I am not familiar enough with Batik internals. I understand that the listeners are meant to propagate changes of the symbol to all of its copies, but maybe we could instead have one shared listener on the <symbol> tag for all the <use> tags, rather than one listener per <use> tag?
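
      A minimal sketch of the shared-listener idea, assuming a simplified stand-in for a DOM event target. None of these types exist in Batik; Target, SharedSymbolListener and registerUse are hypothetical names used only to show the fan-out pattern.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration, not Batik's listener API.
final class SharedSymbolListener {
    /** Simplified stand-in for a DOM node that accepts change listeners. */
    static final class Target {
        final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }

        void fireChange(String change) {
            for (Consumer<String> listener : listeners) {
                listener.accept(change);
            }
        }
    }

    // One list of dependent <use> callbacks per symbol.
    private final Map<Target, List<Consumer<String>>> dependents = new HashMap<>();

    /** Registers a <use> callback; the symbol itself gets at most one listener. */
    void registerUse(Target symbol, Consumer<String> onSymbolChanged) {
        List<Consumer<String>> uses = dependents.get(symbol);
        if (uses == null) {
            final List<Consumer<String>> fanOut = new ArrayList<>();
            dependents.put(symbol, fanOut);
            // The single shared listener forwards each change to all uses.
            symbol.addListener(change -> {
                for (Consumer<String> use : fanOut) {
                    use.accept(change);
                }
            });
            uses = fanOut;
        }
        uses.add(onSymbolChanged);
    }

    public static void main(String[] args) {
        Target symbol = new Target();
        SharedSymbolListener registry = new SharedSymbolListener();
        int[] notified = {0};
        for (int i = 0; i < 1000; i++) {
            registry.registerUse(symbol, change -> notified[0]++);
        }
        symbol.fireChange("fill");
        System.out.println(symbol.listeners.size() + " listener, " + notified[0] + " uses notified");
    }
}
```

      With this layout the listener list on the symbol stays at length one regardless of how many <use> tags reference it; whether this maps cleanly onto Batik's bridge update handling is something I cannot judge.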

      Without '<symbol>' and '<use>', performance is much better, but the file becomes harder to edit and twice as large.

      Attachments

        1. scatter.svg.gz
          575 kB
          Erich Schubert

          People

            Assignee: Unassigned
            Reporter: Erich Schubert (erich.schubert)
            Votes: 0
            Watchers: 1
