TAPESTRY-2028

Minimize whitespace in the output markup

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.7
    • Fix Version/s: 5.0.8
    • Component/s: tapestry-core
    • Labels:
      None

      Description

      Tapestry (4 and 5) has traditionally honored all the white space in a template. That only matters in a few specific cases, such as text inside a <pre> element (who uses those?).
      The result is output documents that contain large amounts of whitespace, because of the extra whitespace that often surrounds Tapestry components.

      In most cases, interior white space (whitespace between text characters) can be reduced to a single space, and white space just after a tag or just before a tag can be eliminated entirely.

      The Tapestry template parser should honor the xml:space attribute and use it to determine what template whitespace is relevant, and what whitespace may be minimized or eliminated.
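
The rule proposed above can be sketched in a few lines of Java. This is only an illustration, not Tapestry's template parser: `WhitespaceMinimizer` and `minimize` are hypothetical names, and the input is assumed to be the character data found between two tags.

```java
public class WhitespaceMinimizer {

    // Hypothetical helper: 'text' is the character data between two tags.
    // Interior runs of whitespace become a single space; whitespace that
    // touches a tag (the start or end of the text) is removed entirely.
    public static String minimize(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // prints "[some text]"
        System.out.println("[" + minimize("\n   some   text\n  ") + "]");
        // A whitespace-only run between two tags disappears: prints "[]"
        System.out.println("[" + minimize("\n        ") + "]");
    }
}
```

Under this rule a text run that contains only whitespace shrinks to the empty string, so it can be dropped altogether.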

        Activity

        Howard M. Lewis Ship added a comment -

        The stripped output is now really hard to read, which is fine, because there's FireBug to show you a neatly formatted version of the downloaded markup.

        Howard M. Lewis Ship added a comment -

        A side benefit of this is that not only will the response be smaller, but the number of tokens in the DOM tree, once rendered, will also be smaller, since many text tokens that contain only whitespace are entirely culled out.

        Massimo Lusetti added a comment -

        This is actually nice, but in some use cases the whitespace characters are used by web designers (at least the ones I work with) to control the layout to some degree, in particular when CSS margins are not honored correctly by the browser.

        We have one application which will render completely differently with these two templates:

        First
        <ul id="ulId" class="ulClass">
        <li class="liClass">some text</li>
        </ul>

        Second
        <ul id="ulId" class="ulClass"><li class="liClass">some text</li></ul>

        So we end up using this kind of indentation to keep the template readable:
        <ul id="ulId" class="ulClass"
        ><li class="liClass">some text</li
        ></ul>

        BTW, I don't actually know what is causing this behaviour...

        Howard M. Lewis Ship added a comment -

        Not to worry, you can use xml:space="preserve" where whitespace counts.
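
For reference, the XML specification defines xml:space as applying to the element it is declared on and to everything inside it, until a nested element declares its own value. The sketch below resolves that for a node using the standard Java DOM API. `XmlSpaceLookup` and its methods are hypothetical names, and this is not a description of how Tapestry's parser does it, which the thread does not show.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlSpaceLookup {

    // Walks from the node up to the root; the nearest xml:space wins.
    // With no declaration anywhere, whitespace is treated as not preserved.
    public static boolean preservesWhitespace(Node node) {
        for (Node n = node; n != null; n = n.getParentNode()) {
            if (n instanceof Element) {
                String value = ((Element) n).getAttribute("xml:space");
                if ("preserve".equals(value)) return true;
                if ("default".equals(value)) return false;
            }
        }
        return false;
    }

    // Parses a small XML fragment; checked exceptions are wrapped for brevity.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<ul xml:space=\"preserve\"><li>some text</li></ul>");
        Node li = doc.getElementsByTagName("li").item(0);
        System.out.println(preservesWhitespace(li)); // prints "true"
    }
}
```

For the list in the previous comment, that would mean putting xml:space="preserve" on the <ul> element so that the line breaks around each <li> are kept.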

        Kevin Menard added a comment - - edited

        What actually prompted this? I don't require the whitespace for rendering anywhere (at least that I know of – this isn't going to be a fun way to find out), but just about any other debugging issue is going to be terrible. Looking at the source before it's processed doesn't always help because one may not know how any components or their IDs will render until they do so.

        I can empathize with the performance of the parser, but this really seems to be too invasive of an operation for a framework to be performing. As for client speed, the whitespace issue was solved a long time ago via gzip compression. I'd rather see Tapestry include a gzip filter that's enabled by default than this approach.
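
One way to check the gzip argument on real pages is to compare the compressed size of the same markup with and without its indentation. The sketch below uses `java.util.zip.GZIPOutputStream` from the JDK; `GzipSizeCheck`, `sampleList`, and `stripBetweenTags` are hypothetical names, and the generated list is made-up sample markup rather than output from any real application.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipSizeCheck {

    // Returns the number of bytes the markup occupies after gzip compression.
    public static int gzippedSize(String markup) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(markup.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Builds an indented list, similar in shape to typical template output.
    public static String sampleList(int items) {
        StringBuilder sb = new StringBuilder("<ul>\n");
        for (int i = 0; i < items; i++) {
            sb.append("        <li>item</li>\n");
        }
        return sb.append("</ul>\n").toString();
    }

    // Crude stand-in for whitespace stripping: drops whitespace between tags.
    public static String stripBetweenTags(String markup) {
        return markup.replaceAll(">\\s+<", "><");
    }

    public static void main(String[] args) {
        String indented = sampleList(200);
        String stripped = stripBetweenTags(indented);
        System.out.println("indented: raw=" + indented.length()
                + " gzipped=" + gzippedSize(indented));
        System.out.println("stripped: raw=" + stripped.length()
                + " gzipped=" + gzippedSize(stripped));
    }
}
```

Running this against captured page output instead of the sample list would show how much of the saving from stripping survives once compression is applied.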

        Howard M. Lewis Ship added a comment -

        I think there's an existing issue to include GZIP filtering.

        This has been a complaint I've received from clients going a ways back.

        It actually does affect server-side performance (though I'm at a loss as to how to measure it). Basically, there are a lot fewer tokens in the parsed templates, since many whitespace-only text tokens drop out ... that means fewer nodes in the rendered DOM and less work to convert that to an output stream.

        Chris Lewis added a comment -

        I'll join the dissenters by saying that such a feature should be optional. I too love Firebug, but that doesn't preclude the occasional desire to "see for myself" what's being rendered. I really like the gzip idea as well, but I'm not flat out against this stripping feature if it's controllable.

        Howard M. Lewis Ship added a comment -

        Let's see how it works out in 5.0.8. If it causes a lot of problems I can address it by making compression optional instead of default, or by adding a pretty-printer (that still honors xml:space="preserve") when in development mode.

        Kevin Menard added a comment -

        Then I guess I'm with Chris on this. If it's configurable on an application level, I don't think having it will hurt any. xml:space="preserve" is probably not sufficient though.

        I guess I'm still a bit skeptical about the whole thing because I don't know of any other framework that does this. Like you, I'm not even sure how I would profile for this to see that it's really my biggest bottleneck. Now, obviously the parser itself could be profiled and it could be determined that under a large number of nodes it starts to suffer. In that case, I'd rather investigate how to make it more robust than to just start mucking around with the DOM. I'd suspect it'd be something that would have to be run into once users want to start rendering large, complex pages anyway.

        Chris Lewis added a comment -

        Let me also state my primary source of ammo here: reality. I have dealt with some boggling problems from IE 5-7, and I'm talking entire pages not rendering (or only partially so) because of a seemingly neurotic dependency on formatting in CSS files - and that's just one family of browsers (read one family of bugs)! On top of that, comments are in the XML (and HTML) specs for a reason. Even if T5 leaves comments, what good are they in such purified pages? Again I'm not wholesale against the idea, but there must be a way to disable it. It's an ambitious feature and that's good, but it may be too idealistic in the assumption that browsers behave the way they should.

        Kevin Menard added a comment -

        Yeap. I ran into problems. Nothing catastrophic, but a lot of our pages look crappy now.

        We have something like the following in our templates:

        <strong>Order Placed:</strong> <t:outputdate value="order.date" format="MMMM d, y"/>

        OutputDate is a component we have for rendering dates. It doesn't add whitespace itself because it may not always be appropriate. For example, if used as a value in a table cell. So, the whitespace in the template is important. Now, that space after the ending strong tag is eaten up, so rather than see:

        Order Placed: January 22, 2008

        we get:

        Order Placed:January 22, 2008

        Incidentally, I forgot about this issue and spent time scrubbing the templates trying to figure out what was going on. Shame on me, I guess, but I'm likely not to be the only one.

        Perhaps preserving one space between tags would help.
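
The difference between the two rules can be sketched as follows. Both methods are hypothetical illustrations (`KeepOneSpace`, `stripAtEdges`, and `keepOneSpace` are made-up names), not Tapestry code, and the single space stands in for the text between the closing strong tag and the OutputDate component.

```java
public class KeepOneSpace {

    // Rule from the issue description: collapse runs of whitespace and
    // remove whitespace that sits directly next to a tag.
    public static String stripAtEdges(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    // Suggested alternative: collapse runs of whitespace, but keep a
    // single space where the text touches a tag.
    public static String keepOneSpace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String between = " "; // text between </strong> and <t:outputdate>
        // prints "Order Placed:January 22, 2008"
        System.out.println("Order Placed:" + stripAtEdges(between) + "January 22, 2008");
        // prints "Order Placed: January 22, 2008"
        System.out.println("Order Placed:" + keepOneSpace(between) + "January 22, 2008");
    }
}
```

The alternative still shrinks long runs of indentation to one character, at the cost of keeping a text node between adjacent tags.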

        Kevin Menard added a comment - - edited

        As a follow up, these two similar fragments render differently:

        <strong>Account:</strong> <a href="mailto:${order.billingInfo.customer.email}">${order.billingInfo.customer.email}</a><br/>

        <strong>Account:</strong> ${order.billingInfo.customer.email}

        In the former case, the space will not be preserved because the expansion is wrapped in an anchor tag. In the latter case, the space will be preserved.

        Angelo Turetta added a comment -

        Yes, I've always considered this sentence from Howard's original description suspicious:

        > and white space just after a tag or just before a tag can be eliminated entirely.

        I don't think so:

        <p>My beautyful <span class="greentext">dog</span>.</p>

        If you remove the space just before <span> the result is not equivalent to the original.


          People

          • Assignee: Howard M. Lewis Ship
          • Reporter: Howard M. Lewis Ship
          • Votes: 0
          • Watchers: 3