Apache OpenOffice (AOO) Bugzilla – Issue 81373
[MWEx] Export filter "MediaWiki" fails to export formulas in TeX-syntax, put all Wiki stuff into extension
Last modified: 2017-05-20 08:57:37 UTC
When exporting documents containing formulas, the string "[[Image:]]" is exported. But MediaWikis where formulas are often used (for example Wikipedia or our internal MediaWiki here in our company) use the extension "Blahtex" that enables formula functionality in the MediaWiki. If Blahtex is installed, the MediaWiki can understand TeX syntax between <math></math> tags, which is nearly identical to OOo's formula syntax. There should be at least an option to use TeX syntax for formulas instead of images. This should be very easy to implement, since this code already exists in OOo: OOo's LaTeX export already exports formulas as TeX code!
I'm not sure if the filter is able to do that (it is based on an XSLT). Bernhard, any comment from your side?
Reassigned to ES.
Looking at: http://wiki.services.openoffice.org/wiki/Odt2Wiki/Features it seems that Haui has (maybe) thought about this (there is an empty section "WikiMath") but didn't implement it. Therefore it's a feature.
Hi, the filter does not support exporting native OOo Formulas. Don't know, whether this is possible with reasonable effort in XSLT. Instead, one can include native TeX/WikiMedia formulas and mark them with a character style called "WikiMath". An example can be found attached to issue 48409 (oopedia-example.odt, or a more complex one in rytzsche-achsenkonstruktion-herleitung.odt). E.g. one can include the text "\overline{M V_n}" into a document and mark this text with the character style "WikiMath". During export, this formula is converted into a WikiMedia <math>...</math> section. This is really no magic, but I feel that this is a convenient way to compose/edit WikiMedia documents with formulas in OpenOffice.org. Of course, this does not help, if an existing OOo document with native formulas should be exported to WikiMedia format. There is room for future work... :-) Regards Bernhard
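The WikiMath workaround described in the comment above can be illustrated outside the filter. The following is only a minimal Python sketch, not the XSLT that odt2wiki actually uses: it assumes the marked text arrives as a text:span whose text:style-name is "WikiMath", and the function name paragraph_to_wiki and the sample paragraph are hypothetical.

```python
import xml.etree.ElementTree as ET

# OpenDocument text namespace.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def paragraph_to_wiki(p):
    """Render one text:p element as wiki text.

    Spans carrying the character style "WikiMath" are wrapped in
    <math>...</math>; all other content is copied as plain text.
    (Hypothetical helper, for illustration only.)
    """
    parts = [p.text or ""]
    for child in p:
        style = child.get(f"{{{TEXT_NS}}}style-name")
        content = "".join(child.itertext())
        if child.tag == f"{{{TEXT_NS}}}span" and style == "WikiMath":
            parts.append(f"<math>{content}</math>")
        else:
            parts.append(content)
        parts.append(child.tail or "")
    return "".join(parts)


# Hypothetical sample paragraph using the formula from the comment above.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}">The segment '
    '<text:span text:style-name="WikiMath">\\overline{M V_n}</text:span>'
    ' is shown.</text:p>'
)

print(paragraph_to_wiki(ET.fromstring(SAMPLE)))
```

The TeX source passes through untouched, which is why this route works only for formulas typed as text and not for native OOo formula objects.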
@MMP: passing over to you for evaluation.
Hi Bernhard, "... but I feel that this is a convenient way to compose/edit WikiMedia documents with formulas in OpenOffice.org ..." I think the whole point of generating MediaWiki syntax with OOo Writer is the comfort. And this comfort is currently not available for formulas. "There is room for future work..." Sure! Thanks for your work. The filter is really good for converting tables in Writer documents (or MS-Word documents, that can be imported into Writer) to MediaWiki syntax.
Hi Norbert, extending the converter to native ODT formulas is an interesting issue. Since formulas are embedded as MathML documents, the problem "reduces" to MathML to LaTeX conversion. There is a SourceForge project http://xsltml.sourceforge.net/, which seems to provide XSLT scripts for doing exactly this. However, my impression is that the inverse transformation (from direct LaTeX input into native OOo formulas) would be far better for the "comfort" while editing formulas in OOo. Having more than one or two formulas in a document - who can really use the built-in "formula editor" - especially for a lot of simple formulas? Admitted, this has nothing to do with this issue... :-)
Hi Bernhard, about the thing that has nothing to do with this issue: It is possible to enter formulas in OOo's LaTeX-related syntax as a plain text string in Writer, select it and choose the menu item for inserting formulas: the string is then converted into a formula without the need to open the formula editor. What I'm missing here is a keyboard shortcut; this would make it possible to insert formulas without using the mouse. ------------------ Now to this issue: It would be great if you could implement this missing feature by using "XSLT MathML Library".
The transformations from http://xsltml.sourceforge.net/ produce quite good results on MathML produced by OOo. Unfortunately, there is a difficulty integrating this transformation, because formulas are not stored in the main contents.xml of a document. Therefore, the formula markup is not at all presented to the XSLT filter during export. I'm not sure, if one can use the XSLT function document(object, node-set?) from within a filter to access other parts of the exported document. Maybe somebody from the core OOo team can clarify this?
Thanks for the information. I will try to clarify this.
There is good news: The formulas are only stored into a separate file in a saved ODT document. During export, the MathML markup is embedded within the draw:object tag as specified in "9.3.3 Objects" of the ODT specification. This makes integrating the MathML transformation really easy. I'll attach a demo of that.
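To illustrate the observation above, here is a rough Python sketch of locating MathML that is embedded inline in a draw:object element. It is not part of the attached transformation; the sample markup is a hypothetical, heavily simplified stand-in for what the filter receives during export, and embedded_formulas is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the lookup below.
NS = {
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "math": "http://www.w3.org/1998/Math/MathML",
}


def embedded_formulas(root):
    """Return the MathML root elements found directly inside draw:object."""
    return root.findall(".//draw:object/math:math", NS)


# Hypothetical, simplified export-time markup for the formula "a + b".
SAMPLE = """
<draw:frame xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
            xmlns:math="http://www.w3.org/1998/Math/MathML">
  <draw:object>
    <math:math>
      <math:mrow><math:mi>a</math:mi><math:mo>+</math:mo><math:mi>b</math:mi></math:mrow>
    </math:math>
  </draw:object>
</draw:frame>
""".strip()

formulas = embedded_formulas(ET.fromstring(SAMPLE))
# Collect the leaf tokens of the first formula, skipping whitespace-only text.
tokens = [el.text for el in formulas[0].iter() if el.text and el.text.strip()]
print(len(formulas), tokens)
```

Once the math:math subtree is reachable like this, it can be handed to a MathML to TeX transformation without opening any other part of the document.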
Created attachment 48416 [details] Revision 2718 of the odt2wiki transformation including xslml for formulas
The attachment above consists of several files. To install it in OOo, one must point the "Export XSLT" setting to the file odt2wiki.xslt in the src/ directory. The other files are included from this main transformation.
enhancement > patch target 2.4 I will check the spec according to this discussion and then forward the issue to Mikhail /MAV
Great work, Bernhard! Thank you. Isn't it possible to set target to OOo 2.3.1?
2.3.1 is a bug fix release for 2.3 only. At this point I consider the Patch for formulas a new feature. Therefore I suggest the target 2.4. BTW -Feature Freeze for OOo 2.4 is in November.
The credit goes to Vasil I. Yaroshevich for developing the MathML -> TeX transformation. :-)
Matthias, I think it doesn't make sense that you keep that issue. Mikhail, we should find a way to get that into 2.4
MAV->ES: How should we treat the issue. Is it a real new feature ( then it is not allowed into OOo2.4 )? Or is it an enhancement that is allowed to be integrated in OOo2.4?
From my point of view, this is an enhancement to the feature "MediaWiki Export". Formulas are part of both, OpenDocument and MediaWiki, but the trafo for formulas was simply unimplemented due to the high complexity in the first versions of MWEx. Regards, Bernhard
I'm sorry, but we now have feature freeze for 2.4. Another point is that the feature/enhancement is not specified (are all Writer Math expressions really exported?). Thus, targeting 3.0. This feature will need a detailed sample which we can add to the general specification.
Mikhail, please create a new CWS for this wonderful patch as soon as possible. Thanks.
@haui: There is still no specification of what is supported and what is not. @mav: please, after implementation, reassign the task to MRU. Thax!
@mru: I will attach a zip file containing Math test docs. In this archive you will find:
- "embedded.odt": a Writer doc containing the Math OLE object with all Math commands and another Math OLE sample.
- "all_commands.odf": the standalone Math document included in "embedded.odt"
- 20 Math sample docs
- 9 Math docs, sub-sections of "all_commands.odf"
To test:
- load "embedded.odt"
- File - Send - To MediaWiki
-> on the Wiki, everything must be there! ;)
I'll mail you the details concerning the internal Wiki account. Hope this helps!
Created attachment 51417 [details] math_test_docs
@es regarding test suite: Thanks a lot for this extensive test suite. I'll include it in my regression test setup. @es regarding spec: Should be "natural" - all kinds of formulas should be supported. Unfortunately, I don't know how complete the MathML library is that I used in the filter. Therefore, we should check the results from your test suite.
First tests show that there are several symbols that seem to get translated to valid LaTeX, but that have to be output differently for the Mediawiki TeX syntax.
Integrated in fwk81.
MAV->MRU: Could you please verify the issue? Please remember, the fix does not allow exporting OOo embedded math objects. It only provides a workaround: formulas inserted in wiki format (they should be marked with the "WikiMath" character style) will be correctly exported by the MediaWiki filter. Please see the comment from HAUI from Sep 10 2007 for details. This comment also includes the link to the documents that could be used to test the issue.
MAV: I think you are wrong. With the new filter, true OOo math formulas are converted to mediawiki syntax!
You are completely right. This is a problem with integration, sorry.
.
The change has to be integrated in other cws than fwk81.
The OOo Wiki does not have the "math package" installed? Therefore, testing with the math test docs from es cannot be done there. Is there a chance to get the math things installed into the OOo Wiki? I don't like uploading test stuff to the official WikiPedia site. Regards Bernhard
The attached Math test docs from es pointed out a lot of problems in the transformation. The main reasons were that the MathML used by OOo is not MathML and the LaTeX interpreted by MediaWiki is not LaTeX... :-) I've compiled an update of the transformation that tries to work around most of the problems. I'll upload the sources soon. The result of the test summary can be checked here: http://de.wikipedia.org/wiki/Benutzer:Hauix/Odt2Wiki/Math
Created attachment 51970 [details] Revision 2756 of the ODT to Wiki transformation.
The attached transformation in revision 2756 is able to translate the kernel tests 1-9 of the math test docs into wiki format that can be successfully rendered by MediaWiki. Results of the other tests still have to be checked. The symbols \oiint, \oiiint and \adots are not supported by MediaWiki. Text strike-through is not supported either. The transformation does not support changing fonts in math formulas. The rendered transformation result can be checked at http://de.wikipedia.org/wiki/Benutzer:Hauix/Odt2Wiki/Math.
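As a small aid when going through the test documents, exported formulas could be scanned for the three commands listed above. This is only a Python sketch under the assumption that the exported TeX is available as a string; the helper name unsupported_commands is hypothetical and the check is not part of the attached filter.

```python
import re

# Commands reported in this issue as not rendered by MediaWiki.
UNSUPPORTED = {"\\oiint", "\\oiiint", "\\adots"}


def unsupported_commands(tex):
    """Return the unsupported TeX commands that occur in tex, sorted."""
    found = set(re.findall(r"\\[A-Za-z]+", tex))
    return sorted(found & UNSUPPORTED)


print(unsupported_commands(r"\oiint_S f \, dA + \adots"))
```

Matching whole command names keeps \oiint and \oiiint apart, so a formula using only one of them is reported correctly.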
As we still haven't finished the legal review it's possible that we must move the target to 3.1
I am setting the bug to 3.1 since the legal review is still in progress. mav->mh: Please set it back to 3.0 if the review process ends early enough to integrate the patch in 3.0.
What is the problem? The Sun Contributor Agreement (SCA)?
http://xsltml.sourceforge.net/ has been approved.
No, the problem just was that we haven't been notified in the meantime that the legal review was done. :-/ Perhaps we should make the transformation a part of the Wiki extension so that we don't need to wait until 3.1? Extension releases can be done at any time.
> Perhaps we should make the transformation a part of the Wiki extension... Do you mean the "Sun Wiki Publisher". I never have tried it. But since a wiki extension already exists it sounds for me just logical to bundle all wiki stuff in this extension.
Does this step (putting odt2wiki into the extension) require a separate issue?
Yes, exactly. The wiki publisher makes using the great export filter even easier as it uploads the code for you and also handles encoding problems (some browsers don't accept the generated files as they don't understand/use input with utf-8 encoding on all operating systems). We also could make a separate extension for the filter but I think it makes a lot of sense to bundle it with the wiki publisher.
I don't think that this needs a separate issue. Martin, what's your opinion? Shall we do it?
> ...but I think it makes a lot of sense to bundle it with the wiki publisher. Yes. Exporting to wiki is a function that is NOT used by the average user.
@mba: I'm fine with this.
We will add the transformation to the extension and release a new version of it when we are done. For bookkeeping purposes we keep the 3.1 target as this denotes the code line where the fix will be applied. Sorry to haui for the delay - this shouldn't have happened.
I have made an overview of the available revisions of the wiki export filter. My results:
------------------------
Issue         Revision
Issue 48409   2639
Issue 81597   2711
Issue 81833   2723
Issue 82978   2730
Issue 81373   2756
------------------------
So I conclude that revision 2756, which is attached to this issue, is the newest version of the filter and the one that should be included in the extension. Is this correct? Are there newer builds available somewhere?
I have set Target milestone OOo 3.1. I hope this is ok.
Ooops. Last comment is wrong issue, sorry.
When updating the "Sun Wiki Publisher" it would be nice if issue 82760 could be fixed.
Created attachment 56964 [details] Patch that updates r2756 to the latest development revision 2870.
Revision 2870 adds two minor bugfixes to the transformation regarding the WikiMath style, which allows entering Wiki-TeX syntax directly within an OpenOffice.org document. From the logs: Bugfix: Do not apply any transformation to the contents marked as WikiMath. Bugfix: Join sibling nodes marked as WikiMath.
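The second bugfix above (joining sibling WikiMath nodes) matters because Writer may split one formula string into several adjacent spans, and each span would otherwise get its own <math> element. The idea can be sketched in Python as follows; this is an illustration only, assuming the paragraph has already been flattened into (style name, text) pairs, and runs_to_wiki is a hypothetical name, not the filter's XSLT.

```python
from itertools import groupby


def runs_to_wiki(runs):
    """Render (style_name, text) runs, given in document order, as wiki text.

    Adjacent runs that share a style are joined first, so consecutive
    WikiMath runs end up in a single <math> element. The text of
    WikiMath runs is passed through unchanged.
    """
    out = []
    for style, group in groupby(runs, key=lambda run: run[0]):
        text = "".join(t for _, t in group)
        out.append(f"<math>{text}</math>" if style == "WikiMath" else text)
    return "".join(out)


# Hypothetical paragraph in which one formula was split into two spans.
runs = [
    (None, "Let "),
    ("WikiMath", "\\frac{a}"),
    ("WikiMath", "{b}"),
    (None, " hold."),
]
print(runs_to_wiki(runs))
```

Without the grouping step the two halves would be emitted as separate <math> elements, neither of which is a complete formula.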
Thanks for the patch, looking forward to our new version of the Wiki Publisher. :-)
The filter is now integrated into the extension including the transformation in mav46 cws. Shouldn't we remove the wiki-filter from the office installation then? I would say yes. I will do it later if nobody complains.
Yes, the filter has now to be removed from the OOo build.
The filter will be removed from OOo installation for issue 99462.
mav->es: Please verify the issue.
@haui: My first try to send my document (http://www.openoffice.org/nonav/issues/showattachment.cgi/51417/math_test_docs.zip -> "all_commands" ) to wiki is a mess! Well, I think every single command exports fine but the *whole* Math shows a parsing error. I'm currently trying to install the Math extension on my internal Wiki (what a PITA to tune it! :( ) but I made first tests on my test account on mediawiki.org (http://www.mediawiki.org/wiki/User:Test4wiki). It looks like the Mediawiki or any other extension component gets confused by '\\' or so... Can you please try to debug it? Thanx!
@haui/mav: this is the last issue needing verification in this CWS. Please help tracking what's wrong with the test document.
The problem here is the construction \begin{array}{} ... \end{array}. The empty brackets should specify the format of the array and are not allowed to be empty in wiki.
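One conceivable workaround for the empty format argument, shown here only as a Python sketch and not as the fix that was eventually integrated, is to fill in a column specification derived from the first row. Using "c" (centered) for every column is an assumption, nested arrays and escaped ampersands are not handled, and fill_array_spec is a hypothetical name.

```python
import re

# Matches an array environment whose format argument is empty.
EMPTY_ARRAY = re.compile(r"\\begin\{array\}\{\}(.*?)\\end\{array\}", re.DOTALL)


def fill_array_spec(tex):
    """Replace \\begin{array}{} with a spec of one 'c' per column.

    The column count is taken from the number of '&' separators in the
    first row. Assumption: centered columns are acceptable.
    """
    def repl(match):
        body = match.group(1)
        first_row = body.split(r"\\")[0]
        cols = first_row.count("&") + 1
        return r"\begin{array}{" + "c" * cols + "}" + body + r"\end{array}"

    return EMPTY_ARRAY.sub(repl, tex)


print(fill_array_spec(r"\begin{array}{} a & b \\ c & d \end{array}"))
```

Arrays that already carry a format argument do not match the pattern and are left untouched.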
So that's a "fixed but failed"?
I would say that it is just a new enhancement that has a bug. This is a special case, when array construction is used. I would handle it as a standalone issue.
mav->es: Probably you are right. It is fixed, but failed. The 'a°b' and 'a÷b' for example are also exported wrongly. Please send the bug back to me.
Reassigned
Reopen
The problem was that I have introduced the original code from the transformation site, but the code attached here contains additional changes that are necessary. The original code is now patched with the changes. The document containing "all_commands" math object can be exported successfully now.
It looks great! Thax! :) http://www.mediawiki.org/wiki/User:Test4wiki