Sling / SLING-913

Add a cache for pre-compiled scripts

    Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Scripting Core 2.0.2
    • Fix Version/s: None
    • Component/s: Scripting
    • Labels:
      None

      Description

      The Java Scripting API provides support for scripting languages which may precompile script source and reuse the precompiled scripts:

      • javax.script.Compilable: May be implemented by a ScriptEngine if precompilation is supported
      • javax.script.CompiledScript: Result of calling the Compilable.compile method

      A CompiledScript can be evaluated repeatedly without recompiling the script, which improves performance.

      The Sling Core Scripting support should make use of this functionality by maintaining a cache of compiled scripts with the following properties:

      • indexed by script path
      • size limited (using a LinkedHashMap and overriding the removeEldestEntry method)
      • entries are weak or soft references to cache entries
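
The properties above could be sketched as follows. This is a minimal illustration, not the implementation from the patch: the class name ScriptCache and its methods are hypothetical, soft references are picked over weak ones arbitrarily, and thread safety is handled with plain synchronized methods.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-limited cache keyed by script path; values are softly referenced. */
class ScriptCache<V> {
    private final Map<String, SoftReference<V>> entries;

    ScriptCache(final int maxSize) {
        // accessOrder=true: iteration order is least recently used first,
        // so the "eldest" entry is the one that was used longest ago.
        this.entries = new LinkedHashMap<String, SoftReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, SoftReference<V>> eldest) {
                return size() > maxSize;
            }
        };
    }

    synchronized void put(String scriptPath, V value) {
        entries.put(scriptPath, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if absent, evicted or garbage collected. */
    synchronized V get(String scriptPath) {
        SoftReference<V> ref = entries.get(scriptPath);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(scriptPath); // the referent was collected
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a limit of 2, adding a third path evicts whichever of the first two was used least recently.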

      A cache entry consists of the following information:

      • The CompiledScript instance
      • The time of last compilation. This is compared to the last modification time of the script to decide whether to recompile

      We might probably also add a reference to the script engine implementation bundle, so that the cache entry is only used if the bundle has not been stopped since the entry was added.

      Executing the script would then consist of the following steps:

      1. Check the cache of precompiled scripts. If an entry exists and can be used, use it
      2. Otherwise:
      2a. If the ScriptEngine is Compilable, compile the script, add it to the cache and use it
      2b. If not, have the script engine execute the script
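
The steps above can be sketched roughly as follows. Only the javax.script types are real API; CachingScriptRunner, Entry and the execute signature are hypothetical names for illustration, and the caller is assumed to supply the script path, its source and its last modification time. The map here is unbounded to keep the sketch short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptEngine;
import javax.script.ScriptException;

class CachingScriptRunner {
    /** Cache entry: the compiled script plus the time it was compiled. */
    static final class Entry {
        final CompiledScript script;
        final long compiledAt;

        Entry(CompiledScript script, long compiledAt) {
            this.script = script;
            this.compiledAt = compiledAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    Object execute(ScriptEngine engine, String path, String source, long lastModified)
            throws ScriptException {
        // Step 1: reuse the cached entry if the script has not changed since compilation.
        Entry entry = cache.get(path);
        if (entry != null && entry.compiledAt >= lastModified) {
            return entry.script.eval();
        }
        // Step 2a: compile, cache and run if the engine supports precompilation.
        if (engine instanceof Compilable) {
            CompiledScript compiled = ((Compilable) engine).compile(source);
            cache.put(path, new Entry(compiled, System.currentTimeMillis()));
            return compiled.eval();
        }
        // Step 2b: otherwise let the engine evaluate the source directly.
        return engine.eval(source);
    }
}
```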

      1. SLING-913.patch
        16 kB
        Chetan Mehrotra

        Activity

        Radu Cotescu added a comment -

        Wouldn't this still need a read operation to be performed, which in this case would be more expensive than reading the last modified property?

        Bertrand Delacretaz added a comment -

        I haven't looked at the patch in detail but maybe using a digest of the actual script that's being compiled, instead of last-modified, would be useful? Scripts are usually small so I suspect computing the digest wouldn't add much to the compile time.
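
A minimal sketch of the digest idea, assuming SHA-256 over the UTF-8 script source (the comment does not name an algorithm, and ScriptDigest is a hypothetical helper). A cache entry would store the digest computed at compile time and reuse the compiled script only while the digest of the current source matches.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ScriptDigest {
    /** Returns the SHA-256 digest of the script source as a lowercase hex string. */
    static String of(String source) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(source.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```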

        Carsten Ziegeler added a comment -

        With an event-based cache, there is no need to distinguish between production and development: on production there is usually no event, so you get the cache forever for free.
        But I agree that the current approach of checking the last-modified time is better than not caching at all, so we could just start with this and see how it works.

        Chetan Mehrotra added a comment -

        "I'm not 100% sure if the last modified approach is the best."

        Agreed. However, compared to the current behavior, where we read the whole script content for compilation, the overhead would be small. Further, we can have different cache policies depending on need and deployment scenario:

        • cache disabled - No caching at all
        • cache until modified - Check the last-modified time to invalidate the cache
        • cache forever - For production deployments where we do not expect the script content to change, we can switch to caching forever. If script content changes, say via a content package deployment, then we either expect a system restart or expose a JMX operation to invalidate the cache
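
The three policies could be expressed along these lines. This is only a sketch: CachePolicies, Policy and canReuse are hypothetical names, and how the policy would be configured is not covered here.

```java
class CachePolicies {
    enum Policy { DISABLED, UNTIL_MODIFIED, FOREVER }

    /** Decides whether an entry compiled at compiledAt may be reused for a script last modified at lastModified. */
    static boolean canReuse(Policy policy, long compiledAt, long lastModified) {
        switch (policy) {
            case DISABLED:
                return false;                      // always recompile or interpret
            case UNTIL_MODIFIED:
                return compiledAt >= lastModified; // needs a last-modified read per request
            case FOREVER:
                return true;                       // only explicit invalidation clears the entry
            default:
                throw new AssertionError(policy);
        }
    }
}
```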
        Carsten Ziegeler added a comment -

        I think the patch in general is fine. I'm not sure about the new method; however, if it's required (as you indicate), that's the only way to pass it from one part to the other.
        While this definitely seems to improve things, I'm not 100% sure if the last modified approach is the best. We started with that in the JSP engine but found out later that it is actually a bottleneck, as each and every script execution needs to check the last-modified time, which is a resource access (read). In addition, this needed to be synchronized to avoid concurrent compilation. Under high load this slowed things down. Therefore we switched to event/observation-based compilation. We might not need syncing in this case, though, as compiling the same JavaScript script in parallel might not hurt that much.
        Therefore I guess we can start with this approach and see whether we run into trouble or not.
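
An event-based variant might look like the sketch below. It is not the JSP engine's actual code: EventInvalidatedCache and onScriptChanged are hypothetical names, and the listener that would call onScriptChanged (for example a resource observation handler) is left out. ConcurrentHashMap.computeIfAbsent invokes the compiler at most once per missing key, so concurrent requests for the same script do not compile it twice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class EventInvalidatedCache<V> {
    private final Map<String, V> compiled = new ConcurrentHashMap<>();

    /** Request path: no last-modified read; the compiler runs only on a cache miss. */
    V get(String scriptPath, Function<String, V> compiler) {
        return compiled.computeIfAbsent(scriptPath, compiler);
    }

    /** To be called by a change listener when a script is modified or removed. */
    void onScriptChanged(String scriptPath) {
        compiled.remove(scriptPath);
    }
}
```

One caveat with this design: the compiler function must not touch the same cache while it runs, since computeIfAbsent does not allow the mapping function to update the map.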

        Chetan Mehrotra added a comment -

        Felix Meschberger / Carsten Ziegeler: can you review the patch and the approach taken?

        Chetan Mehrotra added a comment -

        One observation regarding the cache implementation.

        It works fine if the script is self-contained, as it only checks the modified timestamp of that script. However, I am not sure about the case where a script refers to another script, say via an include in ESP. If the include request is again processed via the Sling Scripting flow, then things should work fine. However, if the script engine internally resolves the included scripts and compiles them, that might cause issues. From what I can make out, in most cases the include request is currently handled via the Sling Script support itself. Something to be aware of.

        Chetan Mehrotra added a comment -

        Ran ApacheBench (ab) [1] after installing the espblog sample. The following command was used:

        ab -k -n 100 -c 20 -A admin:admin http://localhost:8080/content/espblog/*.html

        Default

        Benchmarking localhost (be patient).....done
        
        
        Server Software:        Jetty(7.x.y-SNAPSHOT)
        Server Hostname:        localhost
        Server Port:            8080
        
        Document Path:          /content/espblog/*.html
        Document Length:        1317 bytes
        
        Concurrency Level:      20
        Time taken for tests:   0.945 seconds
        Complete requests:      100
        Failed requests:        0
        Write errors:           0
        Keep-Alive requests:    100
        Total transferred:      148900 bytes
        HTML transferred:       131700 bytes
        Requests per second:    105.87 [#/sec] (mean)
        Time per request:       188.916 [ms] (mean)
        Time per request:       9.446 [ms] (mean, across all concurrent requests)
        Transfer rate:          153.94 [Kbytes/sec] received
        
        Connection Times (ms)
                      min  mean[+/-sd] median   max
        Connect:        0    0   0.3      0       1
        Processing:    61  183 105.4    164     899
        Waiting:       61  183 105.4    164     899
        Total:         61  183 105.5    164     900
        
        Percentage of the requests served within a certain time (ms)
          50%    164
          66%    191
          75%    211
          80%    237
          90%    254
          95%    283
          98%    713
          99%    900
         100%    900 (longest request)
        
        

        With Script Cache

        Server Software:        Jetty(7.x.y-SNAPSHOT)
        Server Hostname:        localhost
        Server Port:            8080
        
        Document Path:          /content/espblog/*.html
        Document Length:        1317 bytes
        
        Concurrency Level:      20
        Time taken for tests:   0.350 seconds
        Complete requests:      100
        Failed requests:        0
        Write errors:           0
        Keep-Alive requests:    100
        Total transferred:      148900 bytes
        HTML transferred:       131700 bytes
        Requests per second:    285.61 [#/sec] (mean)
        Time per request:       70.025 [ms] (mean)
        Time per request:       3.501 [ms] (mean, across all concurrent requests)
        Transfer rate:          415.31 [Kbytes/sec] received
        
        Connection Times (ms)
                      min  mean[+/-sd] median   max
        Connect:        0    0   0.3      0       1
        Processing:    13   67  36.7     57     202
        Waiting:       13   67  36.7     57     202
        Total:         13   67  36.9     57     203
        
        Percentage of the requests served within a certain time (ms)
          50%     57
          66%     76
          75%     86
          80%     99
          90%    125
          95%    137
          98%    168
          99%    203
         100%    203 (longest request)
        
        

        So requests per second jump from 105.87 to 285.61 (roughly 2.7x), indicating that caching compiled scripts should improve performance.

        [1] http://httpd.apache.org/docs/2.2/programs/ab.html

        Chetan Mehrotra added a comment - edited

        Attached a patch which implements the above logic.

        Key points

        • Introduced a new ScriptNameAware interface to enable passing the scriptName via the Reader instance. This was required in the JavaScript implementation, where the reader needs to be wrapped in an ESPReader if the script is an ESP script
        • Compiled scripts would be cached if possible
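
To illustrate the first point, here is a hypothetical shape for such an interface and a reader that implements it. The single getScriptName method, NamedScriptReader and ScriptNames are assumptions made for this sketch; see the patch for the actual definitions.

```java
import java.io.FilterReader;
import java.io.Reader;

/** Assumed shape of the interface: exposes the name of the script being read. */
interface ScriptNameAware {
    String getScriptName();
}

/** Reader that carries the script name alongside the script content. */
class NamedScriptReader extends FilterReader implements ScriptNameAware {
    private final String scriptName;

    NamedScriptReader(Reader in, String scriptName) {
        super(in);
        this.scriptName = scriptName;
    }

    @Override
    public String getScriptName() {
        return scriptName;
    }
}

class ScriptNames {
    /** Returns true if the reader identifies itself as an ESP script by its name. */
    static boolean isEsp(Reader reader) {
        return reader instanceof ScriptNameAware
                && ((ScriptNameAware) reader).getScriptName().endsWith(".esp");
    }
}
```

A script engine that only receives a Reader could use a check like isEsp to decide whether the content needs ESP-specific wrapping before compilation.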

        Changes are also available at https://github.com/chetanmeh/sling/commits/SLING-913


          People

          • Assignee:
            Radu Cotescu
            Reporter:
            Felix Meschberger
          • Votes:
            0
            Watchers:
            4
