Lucene - Core
  1. Lucene - Core
  2. LUCENE-5959

Optimized memory management in Automaton.Builder.finish()

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 4.10
    • Fix Version/s: 5.0, 6.0
    • Component/s: core/other
    • Labels:
    • Lucene Fields:
      New, Patch Available

      Description

      Reworked Automaton.Builder.finish() to not allocate memory stepwise. Added growTransitions(int numTransitions) to be able to resize the transistions array just once.

      1. Automaton.diff
        5 kB
        Markus Heiden
      2. finish.patch
        2 kB
        Markus Heiden

        Activity

        Hide
        Markus Heiden added a comment -

        Patch with suggested changes.

        Show
        Markus Heiden added a comment - Patch with suggested changes.
        Hide
        Michael McCandless added a comment -

        Thanks, this is a nice fix! Maybe instead of using ArrayUtil.grow in that new private method, we could just alloc to that exact size? (Since the Builder knows it will only add exactly that many transitions...).

        Show
        Michael McCandless added a comment - Thanks, this is a nice fix! Maybe instead of using ArrayUtil.grow in that new private method, we could just alloc to that exact size? (Since the Builder knows it will only add exactly that many transitions...).
        Hide
        Markus Heiden added a comment - - edited

        A better approach would be to create the Automaton with the correct size in Builder.finish() instead of creating it in the constructor with a default size. But I can't do that, because the states have already been added to the automaton at that point.

        I may directly allocate
        a.transitions = new int[numTransitions * 3];
        in Builder.finish(), but this looks a bit dirty to me. The overhead of the approach in the patch should be negligible.

        Show
        Markus Heiden added a comment - - edited A better approach would be to create the Automaton with the correct size in Builder.finish() instead of creating it in the constructor with a default size. But I can't do that, because the states have already been added to the automaton at that point. I may directly allocate a.transitions = new int [numTransitions * 3] ; in Builder.finish(), but this looks a bit dirty to me. The overhead of the approach in the patch should be negligible.
        Hide
        Markus Heiden added a comment -

        To avoid the initial creation of the transitions array, I could set it to "new int[0]" by using a static constant for that. I don't know, if that affects the performance of other use cases?

        Show
        Markus Heiden added a comment - To avoid the initial creation of the transitions array, I could set it to "new int [0] " by using a static constant for that. I don't know, if that affects the performance of other use cases?
        Hide
        Michael McCandless added a comment -

        I may directly allocate a.transitions = new int[numTransitions * 3]; in Builder.finish(), but this looks a bit dirty to me.

        I think that solution is OK?

        It's nice not to tie up extra RAM (from ArrayUtil.oversize) since we don't know how long the automaton will be around...

        Show
        Michael McCandless added a comment - I may directly allocate a.transitions = new int [numTransitions * 3] ; in Builder.finish(), but this looks a bit dirty to me. I think that solution is OK? It's nice not to tie up extra RAM (from ArrayUtil.oversize) since we don't know how long the automaton will be around...
        Hide
        Markus Heiden added a comment - - edited

        I reworked the Builder completely, see Automaton.diff. Now no unneeded memory will be allocated in finish(). This looks for me like a clean and (memory) efficient solution.

        Show
        Markus Heiden added a comment - - edited I reworked the Builder completely, see Automaton.diff. Now no unneeded memory will be allocated in finish(). This looks for me like a clean and (memory) efficient solution.
        Hide
        Markus Heiden added a comment -

        Major rework of Automaton.Builder.

        Show
        Markus Heiden added a comment - Major rework of Automaton.Builder.
        Hide
        Michael McCandless added a comment -

        I like this, I'll commit shortly. Thanks Markus!

        Show
        Michael McCandless added a comment - I like this, I'll commit shortly. Thanks Markus!
        Hide
        ASF subversion and git services added a comment -

        Commit 1626001 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1626001 ]

        LUCENE-5959: add CHANGES entry

        Show
        ASF subversion and git services added a comment - Commit 1626001 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1626001 ] LUCENE-5959 : add CHANGES entry
        Hide
        ASF subversion and git services added a comment -

        Commit 1626002 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1626002 ]

        LUCENE-5959: add CHANGES entry

        Show
        ASF subversion and git services added a comment - Commit 1626002 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1626002 ] LUCENE-5959 : add CHANGES entry
        Hide
        Michael McCandless added a comment -

        Thanks Markus!

        Show
        Michael McCandless added a comment - Thanks Markus!
        Hide
        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Show
        Anshum Gupta added a comment - Bulk close after 5.0 release.

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Markus Heiden
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development