Apache OpenOffice (AOO) Bugzilla – Issue 54195
Compare Document does not compare only differences
Last modified: 2017-05-20 11:03:51 UTC
If you compare doc1 with doc2, even the same content will be included and crossed out. The differences are the word in the front of the paragraph and the word at the end of the paragraph. Think how much more efficient it can be if only the two differences are spotted with the other content being exactly the same. I think the priority of this issue is between 3 and 2.
Created attachment 29303 [details] sample doc1
Created attachment 29304 [details] Sample doc2
I do not think that this is a bug. It seems that whenever OOo finds a difference in a paragraph, the complete paragraph starting with the change is marked as "changed". There might be more useful ways to inform users about changes, but I believe currently all works "as designed", and the task might be to draft an enhancement request for a more powerful comparison tool. A goal might look something like <http://de.wikipedia.org/w/index.php?title=Segelflugzeug&diff=8925686&oldid=8925668> Can someone contribute a document describing the current design of this tool?
Well, I have just checked with two .sxw files, using m135, and it behaves properly, highlighting only the change and not the whole paragraph. That would make this a regression and a nasty one. I will try your .odt test cases.
OK, the issue seems to be that where there are two or more changes in one paragraph, the programme marks everything between them as changed, even when it isn't. That's to say that when you change the first and last words of a sentence, all the words in between are marked as changed, even though they are not. This is clearly wrong. I don't know whether it is designed. But it is very confusing to use.
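The behaviour described in this comment can be reproduced with a small sketch. This is not Writer's actual code, which is not shown in this issue; `coarse_span` and `word_level` are hypothetical names used only for illustration. The first function strips the common leading and trailing words and treats everything in between as changed. The second uses Python's standard `difflib.SequenceMatcher` to report only the word runs that differ.

```python
import difflib


def coarse_span(old, new):
    """Mark everything between the first and last differing word as changed."""
    a, b = old.split(), new.split()
    shortest = min(len(a), len(b))
    # Length of the common prefix.
    start = 0
    while start < shortest and a[start] == b[start]:
        start += 1
    # Length of the common suffix, never overlapping the prefix.
    end = 0
    while (end < shortest - start
           and a[len(a) - 1 - end] == b[len(b) - 1 - end]):
        end += 1
    return a[start:len(a) - end], b[start:len(b) - end]


def word_level(old, new):
    """Return only the word runs that actually differ."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


old = "Quick brown foxes jump over the lazy dog"
new = "Fast brown foxes jump over the lazy cat"
print(coarse_span(old, new))  # both whole sentences are returned as changed
print(word_level(old, new))   # only Quick/Fast and dog/cat are reported
```

With the first and last words changed, `coarse_span` returns both sentences in full, which matches what the comment reports, while `word_level` returns two one-word replacements.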
MRU->AMA: this is the designed behaviour from when Redlining was implemented years ago. This problem should be kept in mind when the Redlining subcomponent is redesigned in the future.
*** Issue 56491 has been marked as a duplicate of this issue. ***
I'm not sure what the consensus is at this time with this issue, but whether this is a bug or request for an enhancement, the behavior is not as usable as it could be. Anyone, like myself, editing large documents and proofreading and verifying the changes cannot use this facility as it stands; it is just too time consuming. If the functionality is as designed, I request that it be looked at again, please. Document Compare is more useful and less time consuming when only the exact differences are marked. Reluctantly, I'll be purchasing MS Word 2003; I'm currently using their trial. Fortunately for me the task at hand is large and I cannot use the time consuming document compare of OOo writer. I also request that this is escalated to a P2.
Well... I'll just add that MS-Word 2000 compares 2 documents perfectly, so it can't be impossible, but I don't know exactly how they do it. My intuitive guess is that they compare from the start of the documents AND from the end of them simultaneously. Maybe it is somewhat like "bubble sorting" ... a kind of recursive routine?! I certainly agree with those who regard it as a bug, and I hope someone will correct it.
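How MS Word actually compares documents is not known from this thread. A standard technique for finding a minimal set of differences is the longest common subsequence (LCS) of words, sketched below. `lcs_diff` is a hypothetical name, and the quadratic table is only practical for short inputs such as a single paragraph.

```python
def lcs_diff(old_words, new_words):
    """Return (op, word) pairs, op being ' ' (kept), '-' (deleted) or '+' (inserted)."""
    n, m = len(old_words), len(new_words)
    # table[i][j] = LCS length of old_words[i:] and new_words[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old_words[i] == new_words[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start, emitting kept, deleted and inserted words.
    result, i, j = [], 0, 0
    while i < n and j < m:
        if old_words[i] == new_words[j]:
            result.append((" ", old_words[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            result.append(("-", old_words[i]))
            i += 1
        else:
            result.append(("+", new_words[j]))
            j += 1
    # Whatever is left on either side is a pure deletion or insertion.
    result.extend(("-", word) for word in old_words[i:])
    result.extend(("+", word) for word in new_words[j:])
    return result


for op, word in lcs_diff("a b c d".split(), "x b c y".split()):
    print(op, word)
```

Because the words in the middle are part of the common subsequence, only the first and last words show up as deleted and inserted.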
In fairness I should add that I just tried comparing 2 documents with few differences, but one significant change is that 2 paragraphs on the first page have changed order, and each of them had one word changed. That was not a problem for OOo 1.1.5, although both whole paragraphs were marked as both deleted and inserted, but MS-Word 2000 marked almost the whole document as both deleted and inserted!
I just posted on the main boards and a frequent poster asked me to re-post my comments here. I have been evaluating OOo Writer as a possible substitute for MSWord in my editorial workflows, and I thought it might be useful to share my evaluation summary, even though it’s largely critical and I had to go with MSWord. I’m hopeful that an eventual version of OOo will handle this adequately. Of vital importance to an editorial (or legal) workflow is the ability to redline a document. This stands in for the traditional proofreaders’ marks in many editorial workflows for many publishing houses, and is essential for collaborative work in any industry. In comparing OOo with MSWord, I took a 62 line (default settings) preface from a book I’m working on where the edits are frequent (averaging a word per sentence) and ran compare documents in MSWord to highlight changes “before” and “after.” The resulting document was a 67 line redline with fairly efficient, minimal effective highlighting. Using OOo’s “compare documents” function on the same “before” and “after” files, the resulting redline was 101 lines long, and usually nuked entire paragraphs, replacing them entirely with the "after" version where a word or two would have done the trick. Note that the length of the file increased 63% in redlining, where a minimal effective comparison needed only 8% more lines. Having run this evaluation, I see that the tools for accepting and rejecting changes are there, the changes themselves are simply poorly perceived by the application, and developers should consider this particular feature to be currently almost entirely unusable. It’s great that you’ve started working on it, but really this isn’t even close to a feature that you should put on a menu right now to say “we have that, too.” You simply don't yet. Here’s hoping it’s implemented soon! Good luck!
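The percentages quoted in this comment follow from the line counts it gives. A short check, using only those three numbers (`growth_percent` is a hypothetical helper name):

```python
before_lines = 62   # original preface
word_redline = 67   # MS Word redline length
ooo_redline = 101   # OOo Writer redline length


def growth_percent(original, redlined):
    """Percentage increase in line count caused by redlining."""
    return (redlined - original) / original * 100


print(round(growth_percent(before_lines, ooo_redline)))   # 63
print(round(growth_percent(before_lines, word_redline)))  # 8
```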
I also found this problem, which I do think should be considered a bug, since it defeats the whole purpose of being able to "compare" documents. What use would it be if the changes are not being recognized precisely by the function? I also want to comment on the way the compare document function interacts with the user. When you use the filtering option and choose to see only deletions, for example, the insertions still remain underlined and color-coded. If you only want to see what is not in doc1 (original), that is in doc2 (edited), then that would be quite confusing. What I think should be considered is allowing the filters to determine which would be identified in the document. So when only deletions are selected in the filtering, insertions should not be emphasized. Of course this is just something that I thought would make it better, and is also something that people would be able to adjust to of course - in case it was not taken as an improvement. The real issue is the former of course, which needs to be addressed already. It's 2007 now! I cannot really compare with how the MS function works, because I never did get the chance to use it when I was still in MS.
I too am having major problems with document compares with minimal edits: whole 7+ line paragraphs are marked out and then marked as "new" though only a phrase or two are changed. Also, many are commenting on how well Word does. If Word is good, WordPerfect (WP) is excellent in document compare. That is one of the three reasons that we chose WP over Word in our office. I would like to suggest that WP document compare be investigated as well for implementation ideas.
*** Issue 89666 has been marked as a duplicate of this issue. ***
Maybe I have found a relatively easy way to resolve this issue and maybe even add new features to OOo: There exist some free compare-documents utilities etc. which can be found on the following website: http://www.componentsoftware.com/ According to my tests, one of them, CSDiff, can compare two documents better than both OOo 2.4.1 and MS-Word 2000. It can compare txt-files directly, and it seems to be able to compare two Word-documents directly, by opening them through MS-Word, but I haven't tested that, because I don't have MS-Word. Maybe it could be integrated into OOo in the same way, by entering into an agreement with ComponentSoftware Inc. Maybe even their version control systems, CS-RCS Basic or CS-RCS Pro, could be integrated into OOo?
Mmh, CSDiff may be a cool tool, but for integration into OOo it has to be open source. And of course we need this for Linux, Solaris and MacOS as well.
There are open-source programs like meld[1] or KDiff3[2]. The problem is that they only work with text files, so one has to use odt2txt[3] or antiword[4] first. I doubt that they help for improving OOo itself, but what about hooking them up through an extension[5]? [1] http://meld.sourceforge.net/ [2] http://kdiff3.sourceforge.net/ [3] http://stosberg.net/odt2txt/ [4] http://www.winfield.demon.nl/ [5] http://extensions.services.openoffice.org/
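The "convert to text, then diff" approach suggested in this comment can be sketched in a few lines. Note the substitution: instead of calling odt2txt, this sketch reads `content.xml` directly from the .odt archive (an .odt file is a ZIP container) and uses Python's `difflib` in place of meld or KDiff3. The function names are hypothetical, and the extraction is deliberately naive: tabs, line breaks and repeated spaces stored as elements are ignored, and paragraphs nested inside other paragraphs (for example in text frames) would be counted twice.

```python
import difflib
import xml.etree.ElementTree as ET
import zipfile

# XML namespace of text elements in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(path):
    """Return the plain text of each paragraph and heading in an .odt file."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    wanted = (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h")
    return ["".join(element.itertext())
            for element in root.iter()
            if element.tag in wanted]


def diff_odt(old_path, new_path):
    """Return unified-diff lines between the paragraphs of two documents."""
    return list(difflib.unified_diff(
        odt_paragraphs(old_path), odt_paragraphs(new_path),
        fromfile=old_path, tofile=new_path, lineterm=""))
```

Calling `diff_odt("doc1.odt", "doc2.odt")` (file names assumed) yields a paragraph-level diff. As the comment notes, this helps with reviewing outside Writer but does nothing to improve Writer's own change tracking.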
Did someone try http://extensions.services.openoffice.org/project/DeltaXMLODTCompare
Greetings! Yes, I have tried the DeltaXMLODTCompare extension. Although the comparison is rather slow, the comparison quality is excellent. While this extension clearly shows how much better a job could be done and suggests a way forward (comparing directly at the XML level), it is not a solution to this issue, however, because its integration into Writer is not sufficient:
- The changes are shown in colour but, from what I can see, they can not be Accepted / Rejected, making a manual (!) revision necessary.
- The comparison has to be disk based, i.e., one needs two input files, and an output file is created, rather than this running transparently. As a result, one has to select *three* filenames in a dialog, and then open the third file.
For a solution to the present issue one would require:
1. Meaningful highlighting of differences.
2. The ability to review and accept / reject individual identified changes.
3. It should be possible to launch a comparison conveniently and view the results. Ideally, like for the current built-in "Compare" one should be able to say "Compare the current document to a reference file on disk" and get the resulting "diff" document opened by just selecting *one* filename (the name of the reference file on disk).
With several free good quality diff tools for plain text around, I find it astounding that this issue has been unresolved that long. I consider the lack of an efficient compare feature a serious short-coming of a modern word processor. If I could, I would have given this issue all my 5 votes ...
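Requirement 2 in this comment (accepting or rejecting individual changes) can be illustrated on plain text. This is only a sketch under simplifying assumptions: it works on whitespace-separated words, ignores formatting, and has nothing to do with Writer's internal change tracking. `Change`, `compare` and `resolve` are hypothetical names.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Change:
    tag: str    # "replace", "delete" or "insert"
    old: list   # words present only in the reference text
    new: list   # words present only in the current text


def compare(reference, current):
    """Split two texts into unchanged word runs and Change records."""
    a, b = reference.split(), current.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(a[i1:i2])
        else:
            pieces.append(Change(tag, a[i1:i2], b[j1:j2]))
    return pieces


def resolve(pieces, accepted):
    """Rebuild the text; accepted[k] decides the k-th Change (True = accept)."""
    words, k = [], 0
    for piece in pieces:
        if isinstance(piece, Change):
            words.extend(piece.new if accepted[k] else piece.old)
            k += 1
        else:
            words.extend(piece)
    return " ".join(words)


pieces = compare("Quick brown fox sleeps", "Fast brown fox jumps")
print(resolve(pieces, [True, False]))  # Fast brown fox sleeps
```

Each change is decided independently, so accepting the first and rejecting the second keeps "Fast" from the new text and "sleeps" from the reference.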
Compare documents is useless. This function should not be available in the menu until fixed. This feature is essential to anyone who works collaboratively on documents. This is the first serious flaw I've encountered in OOo. High priority for a straightforward fix. In my specific case, I'm working on a book chapter and got edits back. Compare finds the whole document changed rather than a word or two in each paragraph.
*** Issue 117611 has been marked as a duplicate of this issue. ***
Created attachment 80175 [details] Example of incorrect compare
I've seen this discussed as an old problem that has not been fixed. What it looks like to me is that whenever there is a comma (for example) in the section changed and anything else is changed, then it marks the entire paragraph (or at least the whole sentence) as changed. This is incorrect, and renders the feature useless.
As far as I can tell this issue still has not been addressed. I guess it's not important to OpenOffice. Too bad, because it's important to me.
(In reply to James from comment #23)
> As far as I can tell this issue still has not been addressed. I guess it's
> not important to OpenOffice. Too bad, because it's important to me.
The problem is not one of importance. It is about having open-source volunteers to work on this complex area (including the built-in tracking of changes) with the capacity, capability, and will to invest in it. It is likely that this should be resolved as a "Won't Fix" simply because no such effort is foreseeable at this time. The issue could be re-opened if circumstances change.
You have a point Orcmid. But you are making my point, just with a slightly different emphasis. I wish I had the skills to work on this problem but as it is I have to rely on others. Regardless, even with this problem I will still continue to use and appreciate OpenOffice.
Reset assignee to the default "issues@openoffice.apache.org".