Issue 54195 - Compare Document does not compare only differences
Summary: Compare Document does not compare only differences
Status: CONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: editing
Version: 3.3.0 or older (OOo)
Hardware: PC Windows XP
Importance: P3 Trivial with 28 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: oooqa
Duplicates: 56491 89666 117611
Depends on:
Blocks:
 
Reported: 2005-09-04 05:15 UTC by tsehungtin
Modified: 2017-05-20 11:03 UTC
CC List: 9 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
sample doc1 (6.36 KB, application/vnd.oasis.opendocument.text)
2005-09-04 05:16 UTC, tsehungtin
Sample doc2 (6.36 KB, application/vnd.oasis.opendocument.text)
2005-09-04 05:16 UTC, tsehungtin
Example of incorrect compare (14.22 KB, application/vnd.oasis.opendocument.text)
2013-01-26 18:13 UTC, mhender668

Description tsehungtin 2005-09-04 05:15:15 UTC
If you compare doc1 with doc2, even the 
identical content is marked as inserted 
and crossed out. 

The only differences are the word at the 
front of the paragraph and the word at the 
end of the paragraph. Think how much more 
efficient it would be if only those two 
differences were marked, with the rest of 
the content left alone as unchanged. 

I think the priority of this issue is 
between 3 and 2.
Comment 1 tsehungtin 2005-09-04 05:16:17 UTC
Created attachment 29303
sample doc1
Comment 2 tsehungtin 2005-09-04 05:16:48 UTC
Created attachment 29304
Sample doc2
Comment 3 Rainer Bielefeld 2005-09-04 08:51:30 UTC
I do not think that this is a bug. It seems that whenever OOo finds a difference
in a paragraph, the complete paragraph starting with the change is marked
as "changed". There might be more useful ways to inform users
about changes, but I believe currently all works "as designed", and the task
might be to draft an enhancement request for a more powerful comparison
tool.

A goal might look somehow like
<http://de.wikipedia.org/w/index.php?title=Segelflugzeug&diff=8925686&oldid=8925668>

Can someone contribute a document describing the current design of this tool?
Comment 4 ingenstans 2005-09-05 09:38:10 UTC
Well, I have just checked with two .sxw files, using m135, and it behaves 
properly, highlighting only the change and not the whole paragraph. That would 
make this a regression, and a nasty one. I will try your .odt test cases.
Comment 5 ingenstans 2005-09-05 09:47:37 UTC
OK, the issue seems to be that where there are two or more changes in one 
paragraph, the programme marks everything between them as changed, even when it 
isn't. That is to say, when you change the first and last words of a 
sentence, all the words in between are marked as changed, even though they are 
not. This is clearly wrong. I don't know whether it is by design, but it is very 
confusing to use. 
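
The symptom described here can be reproduced with a small sketch. This is not Writer's actual comparison code; it only assumes, for illustration, a routine that trims the common start and end of a paragraph and marks the rest, and contrasts it with a word-level diff from Python's standard difflib. The names coarse_span and word_level are made up for this example.

```python
from difflib import SequenceMatcher

def coarse_span(old_words, new_words):
    """Mark everything from the first differing word to the last differing word."""
    # Length of the common prefix.
    start = 0
    while (start < len(old_words) and start < len(new_words)
           and old_words[start] == new_words[start]):
        start += 1
    # Length of the common suffix, not overlapping the prefix.
    end = 0
    while (end < len(old_words) - start and end < len(new_words) - start
           and old_words[-1 - end] == new_words[-1 - end]):
        end += 1
    return (old_words[start:len(old_words) - end],
            new_words[start:len(new_words) - end])

def word_level(old_words, new_words):
    """Return only the changed runs as (tag, old_run, new_run) tuples."""
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return [(tag, old_words[i1:i2], new_words[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]

old = "Alpha the quick brown fox jumps over the lazy dog".split()
new = "Beta the quick brown fox jumps over the lazy cat".split()

# First and last word differ, so the coarse routine marks the whole paragraph.
deleted, inserted = coarse_span(old, new)

# The word-level diff reports just the two replaced words.
changes = word_level(old, new)
```

With the first and last words changed, coarse_span returns the entire paragraph as both deleted and inserted, while word_level reports only the two replaced words.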
Comment 6 michael.ruess 2005-09-30 13:24:06 UTC
MRU->AMA: this is the designed behaviour from when Redlining was implemented years
ago. This problem should be kept in mind when the Redlining subcomponent is
redesigned in the future.
Comment 7 michael.ruess 2005-10-26 14:43:50 UTC
*** Issue 56491 has been marked as a duplicate of this issue. ***
Comment 8 evdm 2005-10-26 20:04:29 UTC
I'm not sure what the consensus is at this time with this issue, but whether
this is a bug or request for an enhancement, the behavior is not as usable as it
could be.

Anyone who, like myself, edits large documents and then proofreads and verifies the
changes cannot use this facility as it stands; it is just too time-consuming.

If the functionality is as designed, I request that it be revisited, please.
Document Compare is more useful and less time-consuming when only the exact
differences are marked.

Reluctantly, I'll be purchasing MS Word 2003; I'm currently using their trial.
Fortunately for me the task at hand is large and I cannot use the time-consuming
document compare of OOo Writer.

I also request that this is escalated to a P2.
Comment 9 henrik_roseno 2006-10-21 00:21:48 UTC
Well... I'll just add that MS-Word 2000 compares 2 documents perfectly, so it 
can't be impossible, but I don't know exactly how they do it.
My intuitive guess is that they compare from the start of the documents AND 
from the end of them simultaneously.
Maybe it is somewhat like "bubble sorting" ... a kind of recursive routine?!

I certainly agree with those who regard it as a bug, and I hope someone will 
correct it.
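
How MS Word actually does this is not known from this thread. As an illustration of what a "recursive routine" for this could look like, here is a minimal sketch of one well-known approach: find the longest run of words the two texts share, then repeat on what is left on either side. The function names are hypothetical and the sketch favours clarity over speed.

```python
def longest_common_run(a, b):
    """Return (i, j, n) for the longest run with a[i:i+n] == b[j:j+n]."""
    best = (0, 0, 0)
    # lengths[j] = length of the common run ending at a[i-1] and b[j-1].
    lengths = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        new_lengths = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                n = lengths[j - 1] + 1
                new_lengths[j] = n
                if n > best[2]:
                    best = (i - n, j - n, n)
        lengths = new_lengths
    return best

def changed_runs(a, b):
    """Split around the longest common run, recursing on both sides.

    Returns a list of (old_run, new_run) pairs that differ.
    """
    if not a and not b:
        return []
    i, j, n = longest_common_run(a, b)
    if n == 0:
        # Nothing in common: the whole of a was replaced by the whole of b.
        return [(a, b)]
    return changed_runs(a[:i], b[:j]) + changed_runs(a[i + n:], b[j + n:])

before = "one two three four".split()
after = "uno two three cuatro".split()
pairs = changed_runs(before, after)
```

Here the shared run "two three" is found first, and the recursion on the left and right remainders yields the two single-word replacements.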
Comment 10 henrik_roseno 2006-11-12 10:10:48 UTC
In fairness I should add that I just tried comparing 2 documents with few 
differences, but one significant change is that 2 paragraphs on the first page 
have changed order, and each of them had one word changed. That was not a 
problem for OOo 1.1.5, although both whole paragraphs were marked as both 
deleted and inserted, but MS-Word 2000 marked almost the whole document as both 
deleted and inserted!
Comment 11 wants_it_to_work 2006-12-22 21:48:51 UTC
I just posted on the main boards and a frequent poster asked me to re-post my
comments here.

I have been evaluating OOo Writer as a possible substitute for MSWord in my
editorial workflows, and I thought it might be useful to share my evaluation
summary, even though it’s largely critical and I had to go with MSWord. I’m
hopeful that an eventual version of OOo will handle this adequately.

Of vital importance to an editorial (or legal) workflow is the ability to
redline a document. This stands in for the traditional proofreaders’ marks in
many editorial workflows at many publishing houses, and it is essential for
collaborative work in any industry.

In comparing OOo with MSWord, I took a 62 line (default settings) preface from a
book I’m working on where the edits are frequent (averaging a word per sentence)
and ran compare documents in MSWord to highlight changes “before” and “after.”
The resulting document was a 67 line redline with fairly efficient, minimal
effective highlighting.

Using OOo’s “compare documents” function on the same “before” and “after” files,
the resulting redline was 101 lines long, and usually nuked entire paragraphs,
replacing them entirely with the "after" version where a word or two would have
done the trick. Note that the length of the file increased 63% in redlining,
where a minimal effective comparison needed only 8% more lines.
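
The two percentages can be checked against the line counts quoted in this comment (62 lines before, 67 in the MSWord redline, 101 in the OOo redline). The helper name growth_percent is made up for this worked check.

```python
def growth_percent(before_lines, after_lines):
    """Growth of the redlined document relative to the original, in whole percent."""
    return round(100 * (after_lines - before_lines) / before_lines)

ooo_growth = growth_percent(62, 101)   # (101 - 62) / 62 = 39 / 62
word_growth = growth_percent(62, 67)   # (67 - 62) / 62 = 5 / 62
```

39/62 is about 0.629 and 5/62 is about 0.081, which round to the 63% and 8% given above.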

Having run this evaluation, I see that the tools for accepting and rejecting
changes are there, but the changes themselves are simply poorly perceived by the
application, and developers should consider this particular feature to be
currently almost entirely unusable. It’s great that you’ve started
working on it, but really this isn’t even close to a feature that you should put
on a menu right now to say “we have that, too.” You simply don't yet.

Here’s hoping it’s implemented soon! Good luck!
Comment 12 gvsa123 2007-07-27 17:51:11 UTC
I also found this problem, which I do think should be considered a bug, since 
it defeats the whole purpose of being able to "compare" documents. What use 
would it be if the changes are not being recognized precisely by the function?

I also want to comment on the way the compare document function interacts with 
the user. When you use the filtering option and choose to see only deletions, 
for example, the insertions still remain underlined and color-coded. If you 
only want to see what is not in doc1 (original), that is in doc2 (edited), then 
that would be quite confusing. What I think should be considered is allowing 
the filters to determine which changes are highlighted in the document, so that when 
only deletions are selected in the filter, insertions are not 
emphasized.

Of course this is just something that I thought would make it better, and is 
also something that people would be able to adjust to of course - in case it 
was not taken as an improvement. The real issue is the former of course, which 
needs to be addressed already. It's 2007 now!

I cannot really compare with how the MS function works, because I never did get 
the chance to use it when I was still in MS.
Comment 13 amcguire 2007-09-17 18:44:17 UTC
I too am having major problems with document compares with minimal edits: whole
7+ line paragraphs are marked out and then marked as "new" though only a phrase
or two are changed.

Also, many are commenting on how well Word does. If Word is good, WordPerfect
(WP) is excellent at document compare. That is one of the three reasons that we
chose WP over Word in our office.

I would like to suggest that WP document compare be investigated as well for
implementation ideas.
Comment 14 michael.ruess 2008-05-20 08:59:05 UTC
*** Issue 89666 has been marked as a duplicate of this issue. ***
Comment 15 henrik_roseno 2008-06-26 15:39:26 UTC
Maybe I have found a relatively easy way to resolve this issue and maybe even 
add new features to OOo: 
There exist some free document-comparison utilities, which can be found on 
the following website: 
http://www.componentsoftware.com/ 
According to my tests, one of them, CSDiff, can compare two documents better 
than both OOo 2.4.1 and MS-Word 2000. It can compare txt-files directly, and it 
seems to be able to compare two Word-documents directly, by opening them 
through MS-Word, but I haven't tested that, because I don't have MS-Word.
Maybe it could be integrated into OOo in the same way, by entering into an 
agreement with ComponentSoftware Inc. 
Maybe even their version control systems, CS-RCS Basic or CS-RCS Pro, could be 
integrated into OOo?
Comment 16 andreas.martens 2008-06-26 16:06:35 UTC
Mmh, CSDiff may be a cool tool, but for integration into OOo it has to be open
source.
And of course we need this for Linux, Solaris and MacOS as well.
Comment 17 hhielscher 2008-06-26 17:03:31 UTC
There are open-source programs like meld[1] or KDiff3[2]. The problem is that they
only work with text files, so one has to use odt2txt[3] or antiword[4] first.

I doubt that they would help improve OOo itself, but what about hooking
them up through an extension[5]?

[1] http://meld.sourceforge.net/
[2] http://kdiff3.sourceforge.net/
[3] http://stosberg.net/odt2txt/
[4] http://www.winfield.demon.nl/
[5] http://extensions.services.openoffice.org/
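
The convert-to-text-then-diff workflow can be sketched with the Python standard library alone, as a stand-in for odt2txt plus an external diff tool. It relies on an .odt file being a zip archive whose content.xml holds paragraphs as text:p elements. Everything else is a simplification for illustration: headings, tabs, repeated spaces and formatting are ignored, and make_sample_odt only builds a bare in-memory stand-in, not a complete ODF package.

```python
import difflib
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def odt_paragraphs(source):
    """Return the plain text of each paragraph in an .odt file (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # Only text:p elements are collected; headings (text:h) are skipped here.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]

def paragraph_diff(old_source, new_source):
    """Unified diff of two documents, treating each paragraph as one line."""
    return list(difflib.unified_diff(
        odt_paragraphs(old_source), odt_paragraphs(new_source),
        fromfile="old", tofile="new", lineterm=""))

def make_sample_odt(paragraphs):
    """Build a minimal in-memory stand-in for an .odt file (content.xml only)."""
    body = "".join(f"<text:p>{p}</text:p>" for p in paragraphs)
    xml = (f'<office:document-content xmlns:office="{OFFICE_NS}" '
           f'xmlns:text="{TEXT_NS}"><office:body><office:text>{body}'
           f'</office:text></office:body></office:document-content>')
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", xml)
    buffer.seek(0)
    return buffer

old_doc = make_sample_odt(["First paragraph.", "Second paragraph."])
new_doc = make_sample_odt(["First paragraph.", "Second paragraph, edited."])
diff_lines = paragraph_diff(old_doc, new_doc)
```

For real files, odt_paragraphs can be given a path instead of the in-memory sample. Like the external tools, this only reports differences as text; it does not produce tracked changes inside Writer.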
Comment 18 Mathias_Bauer 2009-04-20 11:15:26 UTC
Did someone try

http://extensions.services.openoffice.org/project/DeltaXMLODTCompare
Comment 19 kreil 2010-03-07 05:28:24 UTC
Greetings!

Yes, I have tried the DeltaXMLODTCompare extension.

Although the comparison is rather slow, the comparison quality is excellent.

While this extension clearly shows how much better a job could be done and
suggests a way forward (comparing directly at the XML level), it is not a
solution to this issue, however, because its integration into Writer is not
sufficient:
- The changes are shown in colour but, from what I can see, they can not be
Accepted / Rejected, making a manual (!) revision necessary.
- The comparison has to be disk based, i.e., one needs two input files, and an
output file is created, rather than this running transparently. As a result, one
has to select *three* filenames in a dialog, and then open the third file.

For a solution to the present issue one would require:
1. Meaningful highlighting of differences.
2. The ability to review and accept / reject individual identified changes.
3. It should be possible to launch a comparison conveniently and view the results.
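
Requirements 1 and 2 can be illustrated together with a small sketch: compute word-level changes, then let each change be accepted or rejected on its own. This is not how Writer's change tracking or the DeltaXML extension is implemented; it is only a toy model using Python's difflib, and the function names are made up.

```python
from difflib import SequenceMatcher

def tracked_changes(old_words, new_words):
    """Opcodes describing how old_words turns into new_words."""
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return matcher.get_opcodes()

def resolve(old_words, new_words, opcodes, accepted):
    """Build the final text: new wording for accepted changes, old wording otherwise.

    `accepted` is a set of change indices, counted over non-equal opcodes.
    """
    result = []
    change_index = 0
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            result.extend(old_words[i1:i2])
            continue
        if change_index in accepted:
            result.extend(new_words[j1:j2])
        else:
            result.extend(old_words[i1:i2])
        change_index += 1
    return result

reference = "Alpha the quick fox sleeps".split()
current = "Beta the quick fox runs".split()
ops = tracked_changes(reference, current)

# Accept only the first change, reject the second.
mixed = " ".join(resolve(reference, current, ops, accepted={0}))
```

Accepting only change 0 keeps the new first word and the old last word; accepting both reproduces the current document, and accepting none reproduces the reference.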

Ideally, like for the current built-in "Compare" one should be able to say
"Compare the current document to a reference file on disk" and get the resulting
"diff" document opened by just selecting *one* filename (the name of the
reference file on disk).

With several free good quality diff tools for plain text around, I find it
astounding that this issue has been unresolved that long. I consider the lack of
an efficient compare feature a serious short-coming of a modern word processor.
If I could, I would have given this issue all my 5 votes ...
Comment 20 fasteddyb 2010-05-19 22:41:26 UTC
Compare documents is useless. This function should not be available in the menu
until fixed. This feature is essential to anyone who works collaboratively
on documents. This is the first serious flaw I've encountered in OOo. High
priority for a straightforward fix.

In my specific case, I'm working on a book chapter and got edits back. Compare
finds the whole document changed rather than a word or two in each paragraph. 
Comment 21 michael.ruess 2011-03-31 10:45:16 UTC
*** Issue 117611 has been marked as a duplicate of this issue. ***
Comment 22 mhender668 2013-01-26 18:13:46 UTC
Created attachment 80175
Example of incorrect compare

I've seen this discussed as an old problem that has not been fixed. What it looks like to me is that whenever there is a comma (for example) in the section changed and anything else is changed, then it marks the entire paragraph (or at least the whole sentence) as changed. This is incorrect, and renders the feature useless.
Comment 23 James 2016-03-18 17:21:05 UTC
As far as I can tell this issue still has not been addressed. I guess it's not important to OpenOffice. Too bad, because it's important to me.
Comment 24 orcmid 2016-03-18 17:56:03 UTC
(In reply to James from comment #23)
> As far as I can tell this issue still has not been addressed. I guess it's
> not important to OpenOffice. Too bad, because it's important to me.

The problem is not one of importance.  It is about having open-source volunteers to work on this complex area (including the built-in tracking of changes) with the capacity, capability, and will to invest in it.  It is likely that this should be resolved as a "Won't Fix" simply because no such effort is foreseeable at this time.

The issue could be re-opened if circumstances change.
Comment 25 James 2016-03-18 18:39:08 UTC
You have a point Orcmid. But you are making my point, just with a slightly different emphasis.  I wish I had the skills to work on this problem but as it is I have to rely on others. Regardless, even with this problem I will still continue to use and appreciate OpenOffice.
Comment 26 Marcus 2017-05-20 11:03:51 UTC
Reset assignee to the default "issues@openoffice.apache.org".