Apache OpenOffice (AOO) Bugzilla – Issue 81597
[MWEx] Export filter "MediaWiki" fails to export Non-breaking space
Last modified: 2013-08-07 14:42:46 UTC
Hi.
- Please open the attached document. It contains a non-breaking space.
- Export it as MediaWiki.
- View the exported result in a text editor: the exported code is "1 2", but it should be "1&nbsp;2".
Created attachment 48242 [details] Document containing a Non-breaking space
Reassigned to ES.
To be precise, the exported wiki file contains the following contents:
> hexdump -C src/test/fixtures/nbsp.txt
00000000  31 c2 a0 32 0a  |1..2.|
00000005
This means that the "1" and "2" are separated by the Unicode character U+00A0 (UTF-8 bytes c2 a0), which is exactly the non-breaking space from the ODT document. This Unicode character is perfectly legal for MediaWiki. Escaping all non-ASCII characters with HTML/XML entities would make the result hard to read (German umlauts, Chinese text...).
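The byte sequence in the dump can be checked independently. The following is only a minimal sketch in Python (not part of the export filter); the variable names are made up for illustration.

```python
# U+00A0 (NO-BREAK SPACE) occupies two bytes in UTF-8: c2 a0.
NBSP = "\u00a0"

# Same content as the exported file: "1", NBSP, "2", newline.
exported = "1" + NBSP + "2\n"
encoded = exported.encode("utf-8")

# bytes.hex() with a separator needs Python 3.8 or later.
print(encoded.hex(" "))  # 31 c2 a0 32 0a
```

So the file really does contain the non-breaking space; it is simply not visible as such in a text editor.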
To sum up @haui: do you mean WONTFIX?
I have opened the output of your filter in Firefox (UTF-8) and copied it into our MediaWiki (UTF-8). The space breaks when the browser window is shrunk! (But umlauts are preserved.) If I change the space to &nbsp;, it is non-breaking. So I think you should change this in your filter.
Ok, I can reproduce the issue. Although there is an NBSP character in the filter output, it is either lost while cutting and pasting into the wiki edit field or swallowed by the MediaWiki software. I'll fix this in the filter by encoding exactly this character as &nbsp;, as Norbert suggested.
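The intended behaviour can be sketched as follows. This is an illustration in Python only; the actual filter is a transformation attached to this issue, and the function name `encode_nbsp` is hypothetical.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def encode_nbsp(text: str) -> str:
    """Replace only the non-breaking space with its HTML entity.

    All other non-ASCII characters (umlauts, Chinese text, ...) are
    left untouched so the wiki source stays readable.
    """
    return text.replace(NBSP, "&nbsp;")


print(encode_nbsp("1\u00a02"))       # 1&nbsp;2
print(encode_nbsp("Umlaute: äöü"))   # Umlaute: äöü
```

Encoding just this one character keeps the output readable while surviving copy and paste into the wiki edit field.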
Created attachment 48383 [details] Updated transformation (revision 2711)
Besides the fix for the NBSP issue, the uploaded update contains the following changes to the original version (revision 2639) attached to issue 48409.
------------------------------------------------------------------------
r2642 | hauma | 2007-05-30 22:41:20 +0200 (Mi, 30 Mai 2007) | 4 lines

- Allow using tabs for indentation of preformatted text.
- Prevent double newlines for separating paragraphs in preformatted text by default.
- Suppress lists for implementing section numbering.
- Made the transformation user-configurable through user info variables.
------------------------------------------------------------------------
r2708 | hauma | 2007-09-20 23:02:05 +0200 (Do, 20 Sep 2007) | 1 line

Added encoding of non-breakable spaces as HTML entity.
------------------------------------------------------------------------
r2711 | hauma | 2007-09-20 23:14:25 +0200 (Do, 20 Sep 2007) | 1 line

Bugfix: All '<' characters must be escaped during rendering a text block containing </nowiki> markup.
------------------------------------------------------------------------
@MAV: reassign to you. I guess we can make a CWS for this?
Added link to original spec. Added mmp to cc list. This seems to me to be an OOo 2.3.1 PATCH/bugfix issue.
Then -> 2.3.1
Eric, do you want to keep the 2.3.1 target even though the patch contains more than "just the fix"? As you will have to test it, it should be your decision. For me it would be OK. haui, thanks for your great support!
The patch will be integrated into one of the next cws.
The patch is committed to cws fwk76.
MAV->ES: Please verify the issue.
@haui: please explain how to test those features:
- Suppress lists for implementing section numbering.
- Made the transformation user-configurable through user info variables.
- Bugfix: All '<' characters must be escaped during rendering a text block containing </nowiki> markup.
What was the state before/after implementing it? Please give a step-by-step description of how to test it. Thank you!
PS: the other fixes are working well.
Hi,
for "Suppress lists for implementing section numbering", transform the OpenDocument Spec to MediaWiki with and without the patch. Without the patch, headings are rendered as
# ## ### '''Heading 3''' Text.
I was not yet able to construct a document in OOo that produces the same internal XML structure as the OD spec.
"Made the transformation user-configurable through user info variables." is a feature you may want to silently ignore. Otherwise, you have to test the following:
1. Create a user-defined document info field (in document properties) named "CODE_TAB_REPLACEMENT" (instead of e.g. "Info 1"). Set the value of the field to some string (e.g. 6 space characters). Create a paragraph with a fixed-width font and start the paragraph with tab characters. In the transformed result, the tab characters are replaced with the value of the user-defined field.
2. Create another user-defined field named "CODE_JOIN_PARAGRAPHS" and set the value either to "true" or to "false". Create a sequence of paragraphs with a fixed-width font (e.g. a code snippet), where each line is in a separate paragraph. If the document variable is set to "true", paragraph breaks are treated as simple newlines instead of being translated to a wiki paragraph break (a double newline). If the value is "false", the transformation transforms paragraph breaks in code sections as if they were normal text.
3. Create another user-defined field named "CODE_STYLES" and enter the name of a paragraph style whose font is not fixed-width. Create a paragraph with this style and transform the document. The paragraph should be rendered as preformatted text with a fixed-width font in the wiki output. This customization can be used as a workaround if the font of a code paragraph is de facto fixed-width but is not marked as such in the ODT file. An example is again the OpenDocument specification.
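The effect of the first two variables can be sketched like this. It is a Python illustration of the described behaviour, not the filter's code; the function name, the parameter names and the 4-space default are assumptions, and only leading tabs are replaced here, as in the test description above.

```python
from typing import List


def render_code_paragraphs(
    paragraphs: List[str],
    tab_replacement: str = "    ",   # stands in for CODE_TAB_REPLACEMENT (default assumed)
    join_paragraphs: bool = True,    # stands in for CODE_JOIN_PARAGRAPHS
) -> str:
    """Render a run of fixed-width paragraphs as wiki source text."""
    lines = []
    for paragraph in paragraphs:
        stripped = paragraph.lstrip("\t")
        leading_tabs = len(paragraph) - len(stripped)
        # Each leading tab becomes one copy of the replacement string.
        lines.append(tab_replacement * leading_tabs + stripped)
    # "true": single newline between code lines; "false": wiki paragraph break.
    separator = "\n" if join_paragraphs else "\n\n"
    return separator.join(lines)


snippet = ["if (x) {", "\treturn;", "}"]
print(render_code_paragraphs(snippet, tab_replacement="      "))
print(render_code_paragraphs(snippet, join_paragraphs=False))
```

With joining enabled, a code snippet typed as one paragraph per line stays a compact block in the wiki output.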
For "Bugfix: All '<' characters must be escaped during rendering a text block containing </nowiki> markup", transform the following text:
< This < is < a < paragraph & with & multiple & xml > specials. >
< This < is < another < paragraph & with </nowiki> & </nowiki> & multiple & xml > specials. >
Best regards
Bernhard Haumacher
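A rough sketch of the idea behind that bugfix, in Python. It assumes that a text block is normally wrapped in <nowiki> tags and that a block which itself contains "</nowiki>" is escaped with entities instead. How the real transformation decides this is not stated in this issue, and `render_text_block` is a hypothetical name.

```python
NOWIKI_CLOSE = "</nowiki>"


def render_text_block(text: str) -> str:
    """Protect XML specials in a text block from wiki interpretation."""
    if NOWIKI_CLOSE in text:
        # Wrapping would end at the embedded closing tag, so escape
        # '&' first and then every '<' in the whole block.
        return text.replace("&", "&amp;").replace("<", "&lt;")
    # Assumed normal case: wrap the block unchanged.
    return "<nowiki>" + text + "</nowiki>"


print(render_text_block("< This < is < a < paragraph"))
print(render_text_block("with </nowiki> & multiple < specials"))
```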
@Bernhard: Thank you for the explanations! It works as described.
@mmp: can you please include those new features in the MediaWikiExport Spec?
1) Allow using tabs for indentation of preformatted text. Default: tab replaced with spaces. Can be customized using the CODE_TAB_REPLACEMENT variable as an info field.
2) Prevent double newlines for separating paragraphs in preformatted text by default. Can be turned off by setting the CODE_JOIN_PARAGRAPHS variable to "false" as an info field.
3) Suppress lists for implementing section numbering.
Verified in CWS fwk76.
Ok in m235
Issue 81833 contains a newer version of the export filter (revision 2723). Please include this one in the next OOo release.