Tom Pasierb added a comment - 25/Sep/06 05:07 PM
Both files are encoded in UTF-8.
I like the idea of being able to specify the encoding per document template. Many templates can be included in a single page. We might be able to build on a similar idea that is built into the clay markup parser.
The Clay HTML template parser has a couple of special tokens that it uses to block out markup that should be excluded from the document. These tokens are in the form of comments.

<!-- ### clay:remove ### -->
<html> exclude this text
<!-- ### /clay:remove ### -->

These special comments and the markup between them are ignored and dropped from the document.

What if we used another special comment token that operates like a page directive?

<!-- ### clay:page charset="UTF-8" / -->

This comment would have to be in the first few bytes of the template document. The top of each template could be sniffed for this token. If it exists, extract the charset and open the target template with the specified encoding. Reading each template file would be broken down into two steps.

1) Look at the top of the template for the token comment containing the charset. If not found, use the VM's default "file.encoding".
2) Read the template in with the determined encoding.

Is this a sound plan? Any thoughts?

This sounds like a good idea.
However, it would be nice to have an extra application-wide config option for loading HTML templates. The processing would look like this:

1) Look at the top of the template for the token comment containing the charset.
2) If not found, look for the app-wide config option for template encoding. If found, use that encoding for reading the template. If not found, use the VM's default "file.encoding".
3) Read the template in with the determined encoding.

This way one could have all the templates in a given encoding, and unless there was a <!-- ### clay:page charset="UTF-8" / --> directive at the top of the file, they would be read with the default encoding set in web.xml. One wouldn't have to define this config in each and every template file. If no default encoding was defined in web.xml (null), then Clay would fall back to the VM's default file.encoding.

How about this?

This is the first try at resolving this issue. It will be available in the shale-framework-20060928 nightly build. You can find it here: http://people.apache.org/builds/shale/nightly/.
To summarize the changes based on your notes, the encoding is now determined with the following steps:

1) Look at the top of the template for the token comment containing the charset. For example:

<!-- ### clay:page charset="UTF-8" /### -->

2) If not found, look for the app-wide config option for template encoding. If found, use that encoding for reading the template. For example:

<context-param>
  <param-name>org.apache.shale.clay.HTML_TEMPLATE_CHARSET</param-name>
  <param-value>UTF-8</param-value>
</context-param>

3) If not found, use the VM's default "file.encoding".
4) Read the template in with the determined encoding.

Tom, I'm going to leave this open until you have a chance to verify.

I have tried the updated Clay version with HTML templates.
I experimented with -Dfile.encoding (the system encoding setting), the org.apache.shale.clay.HTML_TEMPLATE_CHARSET context init parameter, and the <!-- ### clay:page charset="UTF-8" /### --> directive. Everything works as expected, so I guess this issue can be closed. Thanks Gary :-)

The examples and your input really made the difference on this one. Thanks for the help, Tom.
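
For anyone reading this issue later, the lookup order verified above can be sketched in Java roughly as follows. This is only an illustration of the steps discussed in this thread, not the actual Clay implementation: the class name, the method names, and the 512-byte sniff limit are all hypothetical, and the app-wide value is passed in as a plain string that would in practice come from the org.apache.shale.clay.HTML_TEMPLATE_CHARSET context parameter. The sketch also assumes the template encoding is ASCII-compatible, so that the directive itself can be recognized before the real charset is known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the charset lookup order discussed in this issue.
final class TemplateCharsetSketch {

    // Matches the directive form quoted above:
    // <!-- ### clay:page charset="UTF-8" /### -->
    private static final Pattern PAGE_DIRECTIVE = Pattern.compile(
            "<!--\\s*###\\s*clay:page\\s+charset=\"([^\"]+)\"\\s*/\\s*###\\s*-->");

    // Assumption: the thread only says "the first few bytes"; 512 is arbitrary.
    private static final int SNIFF_LIMIT = 512;

    // Step 1: look at the top of the template for the directive.
    // Returns the charset name, or null if no directive is found.
    static String sniffCharset(byte[] templateBytes) {
        int length = Math.min(templateBytes.length, SNIFF_LIMIT);
        // ISO-8859-1 maps every byte to a char, so this decode cannot fail;
        // it is enough to find the ASCII-only directive.
        String head = new String(templateBytes, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = PAGE_DIRECTIVE.matcher(head);
        return matcher.find() ? matcher.group(1) : null;
    }

    // Steps 1-3: directive first, then the app-wide option, then the VM default.
    static Charset resolveCharset(byte[] templateBytes, String appWideCharset) {
        String name = sniffCharset(templateBytes);
        if (name == null) {
            name = appWideCharset; // may itself be null when not configured
        }
        if (name == null) {
            // Stands in for the VM's "file.encoding" default.
            return Charset.defaultCharset();
        }
        return Charset.forName(name);
    }

    // Step 4: read the template with the determined encoding.
    static String readTemplate(byte[] templateBytes, String appWideCharset) {
        return new String(templateBytes, resolveCharset(templateBytes, appWideCharset));
    }
}
```

With this shape, a template that starts with the directive is decoded with its own charset even when a different app-wide value is configured, and a template with neither setting falls through to the VM default.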