Tapestry 5
  1. Tapestry 5
  2. TAP5-769

JavaScript aggregation can be inefficient across multiple pages with different JS requirements

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.1.0.5
    • Fix Version/s: 5.2.0
    • Component/s: tapestry-core
    • Labels:
      None

      Description

      I think Tapestry's JavaScript combination functionality is flawed. Each page & component specifies which JS files it needs, which means that JS can be split into functional units (good for development & maintenance) and only the JS that's actually needed for that page is added for the client to download. The consequence of this is that pages can have lots of JS files to download, all of which has to be downloaded before the page is loaded/rendered now that the script link tags are enforced to be back in the head section. Our search results page has 34 JS files for instance.

      Yahoo's YSlow tool recommends that these files are combined and minified, and Tapestry includes functionality to do the first (minifying is on the TODO list I believe) probably as a response to this recommendation which is good. Unfortunately the implementation based on only having the JS files required for a page means that the combined JS can easily be unique for most pages of a site. This means that the client browser has to download & cache lots of large JS multiple times (prototype, scriptaculous, tapestry etc) as part of bigger combined files, which I think is probably worse than requesting them separately, but only downloading stuff once and using that for all pages.

      To solve this issue, Tapestry script combination could combine all of the scripts needed for the site, and not just the unique set for each page. That way only a single JS file needs to be downloaded and cached by the client browser. I'm aware that this may not be that easy given the existing way only scripts needed for the page are put on it, so an alternative solution that may be easier to implement would be to combine scripts into two files for each page. The first file would contain all of the commonly Tapestry provided JS such as prototype.js, scriptaculous.js, effects.js, tapestry.js, etc in one file that's the same for every page, and have the rest in a second file that is unique for the page but that is not likely to include very large JS files, just many little ones.

      A second flaw that the combination has is that if an external JS file is requested, script combination is aborted rather than just excluding the external file from the combination.

      One other thing that surprised me about Tapestry's script combination is the length of the generated filename, for example it's 919 characters long for a page on our site.

      1. 0004-TAP-769.patch
        1 kB
        Ben Gidley
      2. 0003-TAP-769.patch
        5 kB
        Ben Gidley
      3. 0002-TAP-769.patch
        15 kB
        Ben Gidley

        Activity

        Hide
        Ben Gidley added a comment -

        I agree with this problem but regarding the solution - I think if you are that concerned you have to put it into the common JS loaded on each page. (There is an service for this - clientInfrastructure). I have built a library to help with this in ioko-tapestry-commons.

        The solution you advocate would be ideal - but I can't see any way of achieving it.

        Show
        Ben Gidley added a comment - I agree with this problem but regarding the solution - I think if you are that concerned you have to put it into the common JS loaded on each page. (There is an service for this - clientInfrastructure). I have built a library to help with this in ioko-tapestry-commons. The solution you advocate would be ideal - but I can't see any way of achieving it.
        Hide
        Howard M. Lewis Ship added a comment -

        I'm not sure if there's an issue for this, but a suggestion was floating around that the core JS stack be combined separately from the per-page JS. Thus you'd have one request for prototype/scripaculous/events/tapestry/etc. and a second request anything specific per-page. An advantage here is that the core stack could have a shorter url, something like:

        /assets/virtual/x.y.z/core.js

        Since the core JS library paths would not need to be encoded into the name, the per-page virtual script would also be shorter.

        I can also see an easy way to extend the base stack to include additional JS files; this would pre-emptively include the JS library as part of the core; so if you are using, say, Modalbox across many pages, it would be in core (even for pages that don't actually use it).

        5.1 introduces a service responsible for encoding binary streams into MIME (for inclusion in the URL); that's where your 919 characters come from. The intent is that the service could be overridden to store the data on the server side (in the main database, or some kind of embedded database) and send just a token or id to the client side. That has a lot of implications for a clustered application, which is why the default behavior is to encode the data into the MIME stream and let it live on the client.

        Show
        Howard M. Lewis Ship added a comment - I'm not sure if there's an issue for this, but a suggestion was floating around that the core JS stack be combined separately from the per-page JS. Thus you'd have one request for prototype/scripaculous/events/tapestry/etc. and a second request anything specific per-page. An advantage here is that the core stack could have a shorter url, something like: /assets/virtual/x.y.z/core.js Since the core JS library paths would not need to be encoded into the name, the per-page virtual script would also be shorter. I can also see an easy way to extend the base stack to include additional JS files; this would pre-emptively include the JS library as part of the core; so if you are using, say, Modalbox across many pages, it would be in core (even for pages that don't actually use it). 5.1 introduces a service responsible for encoding binary streams into MIME (for inclusion in the URL); that's where your 919 characters come from. The intent is that the service could be overridden to store the data on the server side (in the main database, or some kind of embedded database) and send just a token or id to the client side. That has a lot of implications for a clustered application, which is why the default behavior is to encode the data into the MIME stream and let it live on the client.
        Hide
        Andy Blower added a comment -

        That sounds like a good solution to me (hardly surprising since I suggested it as an alternative solution above) and extending base stack for additional files sounds very good.

        What data is encoded into the filename? What's it used for?

        Show
        Andy Blower added a comment - That sounds like a good solution to me (hardly surprising since I suggested it as an alternative solution above) and extending base stack for additional files sounds very good. What data is encoded into the filename? What's it used for?
        Hide
        Howard M. Lewis Ship added a comment -

        The file name is really a MIME encoded stream of all the JS library paths; thus it gets bigger as the number of JS files increases.

        Show
        Howard M. Lewis Ship added a comment - The file name is really a MIME encoded stream of all the JS library paths; thus it gets bigger as the number of JS files increases.
        Hide
        Fernando Padilla added a comment -

        I also would love it for someone to enhance that class to properly support external js files better. I have particular pages/components that depend on YUI js files, as well a facebook js files. But because they are included, means that none of my pages will use js combining..

        I am very confident that someone could make the combiner smart enough to combine stretches of files.. so that a series of js files, so that the following will work!

        1) internal.js
        2) internal2.js
        3) http://external.js
        4) internal3.js
        5) internal4.js
        6) http://external2.js

        combined to:

        1) combo.js?internal.js&internal2.js
        2) http://external.js
        3) combo.js?internal2.js&internal4.js
        4) http://external2.js

        Show
        Fernando Padilla added a comment - I also would love it for someone to enhance that class to properly support external js files better. I have particular pages/components that depend on YUI js files, as well a facebook js files. But because they are included, means that none of my pages will use js combining.. I am very confident that someone could make the combiner smart enough to combine stretches of files.. so that a series of js files, so that the following will work! 1) internal.js 2) internal2.js 3) http://external.js 4) internal3.js 5) internal4.js 6) http://external2.js combined to: 1) combo.js?internal.js&internal2.js 2) http://external.js 3) combo.js?internal2.js&internal4.js 4) http://external2.js
        Hide
        Fernando Padilla added a comment -

        and while you're at it, adding the option to make the combiner smart enough to protect against any combined url getting too large..

        if the url is bigger than 2K of 4K then it should start a new combiner url.. before it generates an invalid url..

        this makes sense right?

        Show
        Fernando Padilla added a comment - and while you're at it, adding the option to make the combiner smart enough to protect against any combined url getting too large.. if the url is bigger than 2K of 4K then it should start a new combiner url.. before it generates an invalid url.. this makes sense right?
        Hide
        Ben Gidley added a comment -

        Here is a patch that implements a solution for this.

        This changes the DocumentLinker and RenderSupportImpl so that scripts can be 'grouped' prior to assembly into a single script.

        The default behaviour now is to create one 'group' from the ClientInfrastructure stack and a second group from everything else.

        This should address a good percentage of this problem raised in this issue.

        Further enhancements that could be made

        This patch does not address this issue of the length of the filename for the 'stack'. I can't see an easy, generic and cluster safe way to implement that (except via the contribution mentioned above)

        Show
        Ben Gidley added a comment - Here is a patch that implements a solution for this. This changes the DocumentLinker and RenderSupportImpl so that scripts can be 'grouped' prior to assembly into a single script. The default behaviour now is to create one 'group' from the ClientInfrastructure stack and a second group from everything else. This should address a good percentage of this problem raised in this issue. Further enhancements that could be made Add method to render support to users to create their own groups Add a contribution point (similar to that provided by http://tapestry.formos.com/projects/ioko-tapestry-commons/tapestry-javascript/ ) could be provided to allow inclusion of stacks on certain pages A further contribution to allow replacement for a stack rolled up javascript with a 'production' one e.g. from a CDN. This patch does not address this issue of the length of the filename for the 'stack'. I can't see an easy, generic and cluster safe way to implement that (except via the contribution mentioned above)
        Hide
        Howard M. Lewis Ship added a comment -

        Wow! That was quite a bit of work. The concept of "virtual" assets is now gone, replaced with "stack" assets. The URL is short because the server side knows which assets go into the stack, they are fixed. That means that instead of one virtual asset URL, you get one asset URL for the stack, and additional asset URLs for any other JavaScript libraries used on the page. That means that for most pages, it will just be the stack, and for more involved pages, additional (smaller) JS libraries will be downloaded (and cached on the client side).

        You can now define your own stacks, and use JavascriptSupport.importStack().

        In a seperate issue, I'll be creating new versions of @IncludeJavascriptLibrary and @IncludeStylesheet that will expose new features.

        Show
        Howard M. Lewis Ship added a comment - Wow! That was quite a bit of work. The concept of "virtual" assets is now gone, replaced with "stack" assets. The URL is short because the server side knows which assets go into the stack, they are fixed. That means that instead of one virtual asset URL, you get one asset URL for the stack, and additional asset URLs for any other JavaScript libraries used on the page. That means that for most pages, it will just be the stack, and for more involved pages, additional (smaller) JS libraries will be downloaded (and cached on the client side). You can now define your own stacks, and use JavascriptSupport.importStack(). In a seperate issue, I'll be creating new versions of @IncludeJavascriptLibrary and @IncludeStylesheet that will expose new features.

          People

          • Assignee:
            Howard M. Lewis Ship
            Reporter:
            Andy Blower
          • Votes:
            4 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development