FOP
  1. FOP
  2. FOP-1316

[PATCH] Proposal: some refactoring in the fo tree

    Details

    • Type: Bug Bug
    • Status: Closed
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: None
    • Component/s: fo/unqualified
    • Labels:
      None
    • Environment:
      Operating System: other
      Platform: All
    • External issue ID:
      41656

      Description

      The proposed patch refactors the fo tree in the following ways:
      -> removal of the childNodes instance member in fop.fo.FObj
      -> addition of a firstChild instance member in fop.fo.FObj
      -> addition of a siblings instance member in fop.fo.FONode
      -> addition of a FONodeIterator interface in FONode + corresponding implementation in FObj
      -> changed implementations of FObj.addChildNode(), .removeChild() and .getChildNodes()

      Possible benefits:
      Reduction of the number of ArrayLists used by the fo tree.
      Before the patch, for every FObj that has at least one child, an ArrayList is instantiated with a default
      capacity of 10 elements (a backing Object[10], effectively wasting space on nulls in a substantial
      amount of cases)
      After the patch, every FObj has a maximum of 3 references: one first-child (possibly null), a preceding
      and a following sibling. If the node only has children, it will have only two references: one child, one
      uninitialized array of siblings.
      Downside is that this means an added reference for any FONode, even FOText or a single child, but this
      may be negligeable, considering that most FObj subtypes in general have a quite a few references
      already (mostly properties).

      Another upside: this refactoring made it possible to remove fo:character nodes during the refinement
      white-space-handling /as they were encountered/, so there was no more need to wait for the
      RecursiveCharIterator to go out of scope. The FObjIterator does not throw a
      ConcurrentModificationException, since there is no real backing list anymore.
      More generally, while extremely dangerous if used carelessly, this does open up the possibility of
      starting to process the virtual list of child nodes while another thread keeps appending nodes to the
      end.

      Since this is only a proof-of-concept, I'd appreciate any feedback, especially from the profiler-geeks
      out there (Richard? I'd be very interested to see how many more cells you can process before running
      into an OOMError...)
      I'd expect it to save us 'some' heap space, no idea how much. OTOH, I'd also expect a slowdown –
      since an add() means traversing the following-sibling axis until the last one is reached. I have not yet
      performed any optimizations where it comes to FOs with a lot of children. Maybe a solution would be to
      use an iterator for the addition, and append the new child to its previous element...?

      In the (very) long run, parts of the layout algorithm may already be triggered sooner.
      For example: layoutmanager initialization, base element list generation, providing new pages...

      This latter would, however, take quite a bit of refactoring, so don't get any wild ideas.

      1. fotree_refactoring.diff
        39 kB
        Andreas L. Delmelle

        Activity

        Hide
        Andreas L. Delmelle added a comment -

        Attachment fotree_refactoring.diff has been added with description: patch containing the proposed changes

        Show
        Andreas L. Delmelle added a comment - Attachment fotree_refactoring.diff has been added with description: patch containing the proposed changes
        Hide
        Richard Wheeldon added a comment -

        See http://issues.apache.org/bugzilla/show_bug.cgi?id=41044 for details of other
        memory usage patches.

        Having done some tests, this patch takes the number of processable table cells
        on my given example from 30406 up to 32442. The patches in 41044 take the number
        from 30406 up to 83197. Applying both patches allows me to process 100803 cells.
        Hence, I'd say both patches look like a good improvement - assuming they don't
        break anything else I haven't looked at.

        Show
        Richard Wheeldon added a comment - See http://issues.apache.org/bugzilla/show_bug.cgi?id=41044 for details of other memory usage patches. Having done some tests, this patch takes the number of processable table cells on my given example from 30406 up to 32442. The patches in 41044 take the number from 30406 up to 83197. Applying both patches allows me to process 100803 cells. Hence, I'd say both patches look like a good improvement - assuming they don't break anything else I haven't looked at.
        Hide
        Andreas L. Delmelle added a comment -

        Looking again at this patch, and Richard's observations, I'd say it is worth it. Breaks nothing, and the
        benefit seems to become bigger with the number of FOs.

        Patch applied. see: http://svn.apache.org/viewvc?view=rev&rev=554104

        Show
        Andreas L. Delmelle added a comment - Looking again at this patch, and Richard's observations, I'd say it is worth it. Breaks nothing, and the benefit seems to become bigger with the number of FOs. Patch applied. see: http://svn.apache.org/viewvc?view=rev&rev=554104
        Hide
        Glenn Adams added a comment -

        batch transition pre-FOP1.0 resolved+fixed bugs to closed+fixed

        Show
        Glenn Adams added a comment - batch transition pre-FOP1.0 resolved+fixed bugs to closed+fixed

          People

          • Assignee:
            fop-dev
            Reporter:
            Andreas L. Delmelle
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development