Forrest / FOR-125

produce formatted plain text output

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.7
    • Component/s: Core operations
    • Labels:
      None

      Description

      Please provide the option of generating formatted plain text output.
      All requests ending in .txt should generate plain text output.

      Word wrapping should default to 80 characters.
      It should allow for as much formatting as possible, including:
      - lists
      - indents
      - tables
      - footnotes to hyperlinks
      - strong and em
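
      As an illustration of two of the requested behaviours (wrapping at 80 columns and turning hyperlinks into footnotes), here is a minimal Python sketch. It is not Forrest code: the function name `to_plain_text` and the assumption that links arrive as `<a href="URL">label</a>` are hypothetical, chosen only to keep the example self-contained.

```python
import re
import textwrap

# Hypothetical input convention: links appear as <a href="URL">label</a>.
LINK = re.compile(r'<a href="([^"]+)">([^<]*)</a>')


def to_plain_text(paragraph, width=80):
    """Wrap a paragraph to `width` columns, turning links into footnotes."""
    urls = []

    def number_link(match):
        # Remember the URL and leave "label [n]" in the running text.
        urls.append(match.group(1))
        return f"{match.group(2)} [{len(urls)}]"

    body = textwrap.fill(LINK.sub(number_link, paragraph), width=width)
    notes = [f"[{i}] {url}" for i, url in enumerate(urls, start=1)]
    # Footnotes, if any, follow the wrapped text after one blank line.
    return "\n".join([body, ""] + notes) if notes else body
```

      Calling `to_plain_text(paragraph)` uses the 80-column default from the request; a different width can be passed where a narrower layout is wanted.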



        Activity

        Ross Gardler added a comment -
        I already did this once but then deleted all my files (doh!). Will need to redo it again soon in order to regenerate the site.
        Nicola Ken Barozzi added a comment -
        FOP can produce text output too. If we use that, it would be easy to keep it in sync with DTD changes, as we would only need to change the FO conversion.
        Ross Gardler added a comment -
        I should have checked that. I just finished re-writing the XDoc to Text stylesheet. I'll convert to FO when I can. In the meantime I have to get working with SVN.
        Juan Jose Pablos added a comment -
        I think that this issue is resolved. Can anyone close it?
        Nick Chalko added a comment -
        Using SVN HEAD (is HEAD still the correct term for SVN?),
        I just tried getting an index.txt with no luck.
        Juan Jose Pablos added a comment -
        :-( You are right, I don't know why I got that impression.

        I guess that the proper name for HEAD is trunk, but everyone will understand it anyway.
        Ross Gardler added a comment -
        I do have a semi-working style sheet (table formatting is bad). I'll attach to this bug. I intend to implement it and fix the remaining bugs but right now time is not on my side. Feel free to finish off what I started, otherwise I'll come back to this soon(ish).
        Dave Brondsema added a comment -
        (Since this is a new feature, I'm removing it from the 0.6 target.)

        Using the FOPSerializer to text ends up being pretty ugly. http://xml.apache.org/fop/output.html#txt suggests some improvements, but I haven't tried yet because it'll require significant modifications to document2fo.xsl
        Dave Brondsema added a comment -
        Here's a simple patch to enable .txt rendering via FOPSerializer, if anybody wants to try to make it look better. Without improvement, it's not worth using this method.
        Nicola Ken Barozzi added a comment -
        If we put this as-is in 0.7, eventually it may be a seed for someone to make it better.
        Ross Gardler added a comment -
        This patch is what I have working minus some code I took from elsewhere that simply gave me a string consisting of a character repeated x times. I used it to manage layout, draw underlines on titles etc.

        In this patch I've stripped this code (as who actually owns it is unclear at this time) and inserted fixme's in all the places I used it. The resulting stylesheet is usable, but needs something to highlight headings etc.

        There are still some things that need ironing out. The ones I am aware of are listed below; some of these are easy to fix (numbered lists), others less so (table layout):

        - numbered lists aren't numbered
        - lists within lists don't work
        - table layout is not even attempted
        - there is no neat wrapping of long lines of text
        - headings are no longer emphasised (need the script mentioned above)

        Be warned: I have not had time to test this on a wide range of documents; it works for the few pages I need text output on. Please have a go at improving things.
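
        The missing helper described above (a string made of one character repeated x times, used to underline titles) and the unnumbered-lists item are both small. The following is a hedged Python sketch of the idea only; it is not the stylesheet code that was removed from the patch, and the names `underline` and `numbered` are hypothetical.

```python
def underline(title, char="="):
    """Return the title followed by a rule of `char` of the same length."""
    return f"{title}\n{char * len(title)}"


def numbered(items):
    """Render items as a numbered list, one item per line, starting at 1."""
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))
```

        In a text renderer, `underline` would be applied to each heading (perhaps with a different `char` per heading level) and `numbered` to each ordered list.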
        Diwaker Gupta added a comment -
        I would like to point out an alternate solution that seems to me much easier to implement, and works pretty well. This is also what is used by most DocBook DSSSL/XSL stylesheets.

        What we can do is first render a "clean" HTML version of the page. This should be pretty easy, since the HTML conversion infrastructure is already in place (basically, just remove the menu and the tabs).

        After this, we just run it through a text browser (like lynx, w3m, or elinks) and take a text dump. This way, all the formatting issues -- lists, lists inside lists, headings, borders, tables, images -- all of it is taken care of automatically, and we don't need to reinvent the wheel doing that.

        I've done a LOT of text outputting with docbook and this method seems to work perfectly. IMFO, formatting directly to text might be more difficult, and perhaps redundant given that the text based browsers can already do a really good job.
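
        The text-browser approach can be sketched as a small Python wrapper around `lynx -dump` or `w3m -dump`. This is only an illustration under assumptions: the function names are hypothetical, the chosen browser must be installed and on the PATH, and the "clean" HTML file is assumed to exist already.

```python
import subprocess


def dump_command(html_path, width=80, browser="lynx"):
    """Build the command line that dumps an HTML file as plain text."""
    if browser == "lynx":
        # lynx -dump also appends a numbered list of link targets.
        return ["lynx", "-dump", f"-width={width}", html_path]
    if browser == "w3m":
        return ["w3m", "-dump", "-cols", str(width), html_path]
    raise ValueError(f"unsupported browser: {browser}")


def html_to_text(html_path, width=80, browser="lynx"):
    """Run the text browser and return its plain-text rendering."""
    result = subprocess.run(
        dump_command(html_path, width, browser),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the browser fails
    )
    return result.stdout
```

        Keeping the command construction separate from the subprocess call makes it possible to check the arguments without a browser installed.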
        Nick Chalko added a comment -
        I have used the text browser solution when I need one or two files.
        However, I want a solution that will easily handle all the documents in a site, and one that will also work for a dynamic Forrest install.
        David Crossley added a comment -
        See some email discussion: http://marc.theaimsgroup.com/?t=107512563400001
        Ross Gardler added a comment -
        Rick has built the org.apache.forrest.plugin.text-output plugin for this.

          People

          • Assignee: Rick Tessner
          • Reporter: Nick Chalko
          • Votes: 1
          • Watchers: 1
