UIMA-1068

Use of the JCas cache should be configurable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2
    • Fix Version/s: 2.3
    • Component/s: Core Java Framework
    • Labels:
      None

      Description

      The JCas caches all CAS objects that are accessed through it. This means that JCas objects that are no longer used can't be garbage collected. If only part of the processing chain uses the JCas, or the caching is redundant for some other reason, this produces a severe memory overhead.

      I repeated the experiment from UIMA-1067: I doubled the size of Moby Dick and ran the POS tagger from the sandbox. I used the improved version from UIMA-1067 as the base case and simply commented out the line that adds JCas objects to the cache. This reduced the required heap size from 115MB to 105MB. It also reduced the run time from around 10s for the base case to consistently under 9s for the version without any caching. I looked at the tagger source code and saw that it keeps its own list of tokens around, so the savings come only from the caching data structure itself.

      There may be cases where the JCas cache is a performance win, though I'd be curious to see the benchmarks. So we should not just turn it off, but make it configurable.
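To make the proposal concrete, here is a minimal sketch of a wrapper cache with an on/off switch. It is not the JCas implementation: `WrapperCache`, `WrapperCacheSketch`, `Token`, and the `cachingEnabled` flag are hypothetical names used only for illustration. It shows why a strongly referenced cache keeps every wrapper reachable, and how disabling it lets unused wrappers be collected.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/**
 * Toy stand-in for a wrapper cache keyed by an integer address.
 * Hypothetical sketch only; this is NOT the UIMA JCasHashMap.
 */
final class WrapperCache<T> {
    private final boolean cachingEnabled;
    // Strong references: a cached wrapper stays reachable until clear() is called.
    private final Map<Integer, T> cache = new HashMap<>();

    WrapperCache(boolean cachingEnabled) {
        this.cachingEnabled = cachingEnabled;
    }

    /** Returns the wrapper for the address, creating it with the factory if needed. */
    T get(int address, IntFunction<T> factory) {
        if (!cachingEnabled) {
            // No bookkeeping: the wrapper is collectable once the caller drops it.
            return factory.apply(address);
        }
        return cache.computeIfAbsent(address, factory::apply);
    }

    /** Number of wrappers currently held by the cache. */
    int size() {
        return cache.size();
    }

    void clear() {
        cache.clear();
    }
}

final class WrapperCacheSketch {
    /** Hypothetical wrapper object, standing in for a cached annotation. */
    static final class Token {
        final int address;

        Token(int address) {
            this.address = address;
        }
    }

    public static void main(String[] args) {
        WrapperCache<Token> cached = new WrapperCache<>(true);
        WrapperCache<Token> uncached = new WrapperCache<>(false);
        for (int i = 0; i < 1000; i++) {
            cached.get(i, Token::new);
            uncached.get(i, Token::new);
        }
        System.out.println("wrappers held with caching:    " + cached.size());
        System.out.println("wrappers held without caching: " + uncached.size());
    }
}
```

With caching enabled, repeated lookups of the same address return the same wrapper object, at the cost of holding every wrapper ever created. With caching disabled, each lookup creates a fresh wrapper, so any code that relies on object identity would need to keep its own references.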

        Activity

        Thilo Goetz created issue -
        Thilo Goetz made changes -
        Field Original Value New Value
        Status Open [ 1 ] In Progress [ 3 ]
        Thilo Goetz added a comment -

        Done. I also added documentation on the other performance tuning settings.

        Thilo Goetz made changes -
        Status In Progress [ 3 ] Closed [ 6 ]
        Resolution Fixed [ 1 ]
        Thilo Goetz added a comment -

        Fix in 2.2.2 hotfix 1.

        Thilo Goetz made changes -
        Resolution Fixed [ 1 ]
        Status Closed [ 6 ] Reopened [ 4 ]
        Michael Baessler made changes -
        Assignee Thilo Goetz [ twgoetz ] Michael Baessler [ mbaessler ]
        Michael Baessler added a comment -

        Thilo, please review my changes.

        Unfortunately I didn't find a way to check the JCas cache size in a test case, because I couldn't get at the JCasHashMap object from the CAS/JCas to call its size() method.

        Michael Baessler made changes -
        Assignee Michael Baessler [ mbaessler ] Thilo Goetz [ twgoetz ]
        Thilo Goetz added a comment -

        Reviewed, thanks.

        Thilo Goetz made changes -
        Resolution Fixed [ 1 ]
        Status Reopened [ 4 ] Closed [ 6 ]

          People

          • Assignee:
            Thilo Goetz
            Reporter:
            Thilo Goetz