This patch re-implements the full internal link functionality in the PDF output. Bookmarks, basic-links and named destinations again point to the exact spot instead of the top of the page. The intermediate area tree XML format is also supported (except for named destinations). The target PageViewport is still determined during the layout phase; the on-page position is calculated in PDFRenderer. Note that the code could be greatly simplified if we moved everything to the renderer. This wouldn't break anything in FOP, but it would prevent custom renderers from getting the target PageViewport key. The patch itself also breaks renderers who pick up that key (as the type of the InternalLink trait has changed), but not beyond repair: all that's needed is a minor code change wherever that trait is read and the caller expects a String. The vast majority of changes are in PDFRenderer, PDFFactory, XMLRenderer and AreaTreeParser. But some other modifications were also required (or at least highly desired). The most important of them are: - PageViewport needs to know for what IDs it is the first PV, independently of ATH. This info is needed in the renderer. - The dummy area generated by a fo:wrapper must be added to the area tree, at least if it has an ID. (DocBook generates empty wrappers for <indexterm>s.) [Note: I kept this a "generic" InlineArea. But maybe it would be better to create a subclass, e.g. DummyInlineArea.] - Two extra constructors for BookmarkData, for stuff read back from the area tree XML. Also made the tagging of subdata as "unresolved" conditional, because bookmarks read from AT XML are already resolved. - To avoid code duplication (and divergence), the actual adding of an INTERNAL_LINK trait to an area is now done exclusively in the LinkResolver. - Created an additional constructor for PDFDestination. - New method PDFState.getBaseTransform() returns the transformation to be used for fixed-positioned areas. I also added some helper functions here and there to move duplicated code into (esp. in PDFFactory). And on at least two occasions I changed existing comments because they puzzled me rather than helping me understand the purpose of the commented-upon object. Whether all these changes are acceptable is of course up to the team. If any changes or extensions are required, I'm willing to provide them.
Created attachment 19922 [details] Proposed patch file
First of all, thanks a lot for working on this patch, Paul. I've looked at it during the last two days since I realized I couldn't really apply my own changes [1] in this area without processing your patch first because of possible conflicts. Anyway, in my own tests and your patch locally applied, none of the GoTo Actions get properly positioned since PageViewport.setFirstWithID() is never called (except from the AreaTreeParser which is only active in special environments. PDFRenderer.getTargetableID() never returns an ID because those "first IDs" are not filled. Can you comment on that? I'm wondering if it weren't simpler just to track "first IDs" inside PDFRenderer only, i.e. whenever an area with an ID is encountered its position is recorded. If later the same ID is recorded but from a page logically before the earlier page (out-of-order rendering), the position would be overwritten. It could possibly also remove the necessity to extend the area tree XML format with the "idfirsts" trait. Just a thought. [1] Just FYI, my changes I mentioned above are targeted at restoring proper package dependencies, as the pdf package currently relies on the area package which is bad if we plan to extract the PDF library to XML Graphics Commons at some point.
Hi Jeremias, > Anyway, in my own tests and your patch locally applied, none of the > GoTo Actions get properly positioned since PageViewport.setFirstWithID() > is never called (except from the AreaTreeParser which is only active in > special environments. It's also called from AreaTreeHandler.associateIDWithPageViewport (near the top of the patch file). Maybe this didn't make it into your source because there are several conflicts with a more recent commit. They're all trivial though; easy to resolve. > I'm wondering if it weren't simpler just to track "first IDs" inside > PDFRenderer only, i.e. whenever an area with an ID is encountered its > position is recorded. If later the same ID is recorded but from a page > logically before the earlier page (out-of-order rendering), the position > would be overwritten. Yes, this was one of the options I presented on fop-dev while I was working on this. It would make PDFRenderer more complicated, but you could kick the entire link resolution stuff out of the layout code (PV, ATH), so overall the code would become simpler. The problem is that this would break other (custom) renderers that expect the INTERNAL_LINK trait to contain the target PV key. > It could possibly also remove the necessity to extend the area tree XML > format with the "idfirsts" trait. Just a thought. I take it you don't like that attribute. Neither do I. It's ugly, and every time I look at it I get the feeling that it shouldn't be necessary. But it was the best I could come up with given the chosen approach (= resolve to page level in layout phase, determine exact location in renderer). There's a workaround possible though (in PDFRenderer): store the first occurrence of each ID on every PV in a map (with the ID as key, and a list of PV->location maps as value). Whenever an internal link trait is parsed, the ID plus the key of the first PV containing that ID are extracted. At that moment, if the location of that ID on that PV is already in the map, the resolution can be completed. If not, it'll have to wait (but this is also the case in the current patch). The renderer wouldn't even have to worry about which page comes first, because the link trait already contains that information. If you want me to implement this, I will.
(In reply to comment #3) > Hi Jeremias, > > > Anyway, in my own tests and your patch locally applied, none of the > > GoTo Actions get properly positioned since PageViewport.setFirstWithID() > > is never called (except from the AreaTreeParser which is only active in > > special environments. > > It's also called from AreaTreeHandler.associateIDWithPageViewport (near the top > of the patch file). Maybe this didn't make it into your source because there are > several conflicts with a more recent commit. They're all trivial though; easy to > resolve. You're right. Looks like I clicked too quickly through the "apply patch" dialog. <snip/> > > I'm wondering if it weren't simpler just to track "first IDs" inside > > PDFRenderer only, i.e. whenever an area with an ID is encountered its > > position is recorded. If later the same ID is recorded but from a page > > logically before the earlier page (out-of-order rendering), the position > > would be overwritten. > > Yes, this was one of the options I presented on fop-dev while I was working on > this. It would make PDFRenderer more complicated, but you could kick the entire > link resolution stuff out of the layout code (PV, ATH), so overall the code > would become simpler. The problem is that this would break other (custom) > renderers that expect the INTERNAL_LINK trait to contain the target PV key. Well, that's me not being able to catch up with everything lately. :-( I get the impression that it should be possible to do this without breaking backwards-compatibility too much. Maybe we should do an inventory of people who are maintaining external renderers so we have an idea who would be affected and how much. > > It could possibly also remove the necessity to extend the area tree XML > > format with the "idfirsts" trait. Just a thought. > > I take it you don't like that attribute. Neither do I. It's ugly, and every time > I look at it I get the feeling that it shouldn't be necessary. But it was the > best I could come up with given the chosen approach (= resolve to page level in > layout phase, determine exact location in renderer). There's a workaround > possible though (in PDFRenderer): store the first occurrence of each ID on every > PV in a map (with the ID as key, and a list of PV->location maps as value). > Whenever an internal link trait is parsed, the ID plus the key of the first PV > containing that ID are extracted. At that moment, if the location of that ID on > that PV is already in the map, the resolution can be completed. If not, it'll > have to wait (but this is also the case in the current patch). The renderer > wouldn't even have to worry about which page comes first, because the link trait > already contains that information. > > If you want me to implement this, I will. I'm in no position to "want" anything. :-) I can only make wishes and be happy that there are people around helping us out. But consider the removal of the idfirsts attribute a wish. *g* Since I'll be leaving for ApacheCon in a few hours I may not have enough time to finish applying the current patch. If you could look into the issue until the end of the week that'd be awesome. But don't worry if you don't make it. BTW, another wish if I may: Would you please submit an ICLA (and CCLA if necessary) [2]? Your patch is a borderline case where an ICLA on file becomes necessary. Thanks a lot! You can submit the document by fax or by sending a scanned version to BOTH secretary.at.apache.org AND legal-archive.at.apache.org. [2] http://www.apache.org/licenses/#clas
(In reply to comment #4) > I get the impression that it should be possible to do this without breaking > backwards-compatibility too much. Maybe we should do an inventory of people > who are maintaining external renderers so we have an idea who would be > affected and how much. Yes, if nobody picks up this trait, then we don't have to worry about it either. Within FOP, only PDFRenderer uses it. Notice that my patch does break custom renderers at this point, but they can be fixed with a one-line code change. The only way not to break them at all would be to pass PV key and ID in separate traits. >> It could possibly also remove the necessity to extend the area tree XML >> format with the "idfirsts" trait. Just a thought. > I'm in no position to "want" anything. :-) I can only make wishes and be happy > that there are people around helping us out. But consider the removal of the > idfirsts attribute a wish. *g* Well, I think I can do it this week. The more I look at that attribute, the more I hate it, so I guess I'm motivated :-) > Since I'll be leaving for ApacheCon in a few hours I may not have enough time > to finish applying the current patch. I wish you all a great time at the conference and in Amsterdam (my hometown). At least the weather is fine! > BTW, another wish if I may: Would you please submit an ICLA (and CCLA if > necessary) OK, I'll take care of it.
OK, I've removed the idfirsts attribute from the area tree XML. I found a simpler and better solution than tracking the ID positions on all pages in the PDFRenderer though. Since the AT XML is always parsed in document order, all that's needed is for the AreaTreeParser.Handler to call currentPageViewport.setFirstWithID() whenever a new PROD_ID is encountered. This way PageViewport.isFirstWithID() remains usable no matter whether the source is rendered straight from the XSL-FO or from an XML area tree. Otherwise the effort to determine the correct PV would have to be duplicated in every interested renderer. I'll upload a revised patch against the current source tree.
Created attachment 20131 [details] Revised patch, no more idfirsts attribute in area tree XML
Thanks a lot, Paul. I'm quite comfortable with the new patch, and so I applied it: http://svn.apache.org/viewvc?view=rev&rev=535866 BTW, your ICLA has already been recorded. Thanks for that, too.
> I'm quite comfortable with the new patch, and so I applied it: Thanks, I'm very glad that my patch is of use!
*** Bug 41532 has been marked as a duplicate of this bug. ***
Created attachment 20211 [details] small patch file Patch file to fix j2se1.4 dependency.
I discovered recently that FOP still aims to be JDK1.3 compliant so I updated my Eclipse IDE environment today and discovered this small J2SE1.4 dependency on the use of String.split(). This small patch should fix this.
Created attachment 20212 [details] small patch file checkstyled :-)
(In reply to comment #12) > I discovered recently that FOP still aims to be JDK1.3 compliant so I updated my > Eclipse IDE environment today and discovered this small J2SE1.4 dependency on > the use of String.split(). This small patch should fix this. Thanks for spotting this, Adrian. Looks like I'm not compiling against J2SE 1.3 often enough. Patch applied (this one anyway).
(In reply to comment #14) >> I discovered recently that FOP still aims to be JDK1.3 compliant Sorry, I wasn't aware of this. >> This small patch should fix this. It does, but the patched code also throws a StringIndexOutOfBoundsException if for some reason the attribute value is in brackets but doesn't contain a comma. I think it's best to add a comma check to the innermost if. This will also bring the code in line with the description (my original code wasn't, I'm afraid). So here's an even smaller patch.
Created attachment 20215 [details] Patch to check for comma together with brackets
(In reply to comment #16) > Created an attachment (id=20215) [edit] > Patch to check for comma together with brackets > Patch applied. Thanks, Paul.
batch transition pre-FOP1.0 resolved+fixed bugs to closed+fixed