Diff is designed mainly for source code, which is a line-by-line format. HTML, which may put a whole paragraph on one line, makes diff output less useful when only small changes are made.
I suggest that wdiff (GNU? Word Diff) be offered as an optional view for when it's not obvious what the changes were.
You could even get fancy and try to select the appropriate diff view depending on how much has changed.
Note that this only affects presentation; RCS will still store files using line-based diffs.
--
MathewBoorman - 24 Apr 2000
Interesting idea. Got an algorithm to do it?
diff works on a line (\n) basis, so maybe break paragraphs down into sentence units and diff on those. I have a feeling it might require more detailed knowledge of the content structure by the TWiki parser.
--
NicholasLee - 24 Apr 2000
Interesting idea. Nicholas' algorithm would be an easy compromise: do a regular diff after breaking up all paragraphs into single sentences ( [\.\?\:] delimiter ). That is what WinDiff does, with acceptable results (WinDiff is the GUI diff tool that comes with Micro$oft Visual Studio).
--
PeterThoeny - 24 Apr 2000
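The sentence-splitting idea above can be tried out quickly. This is only a rough sketch in Python, not TWiki code: the function names are made up for illustration, and splitting only where a delimiter is followed by whitespace is an added assumption (so that things like version numbers are not cut in half).

```python
import difflib
import re

# Split after '.', '?' or ':' when followed by whitespace. The delimiter set
# is the one suggested in the thread; requiring whitespace is an assumption.
_SENTENCE_END = re.compile(r"(?<=[.?:])\s+")


def split_sentences(paragraph):
    """Return the paragraph as a list of sentence-like units."""
    return [s for s in _SENTENCE_END.split(paragraph.strip()) if s]


def sentence_diff(old, new):
    """Run an ordinary diff in which every 'line' is one sentence."""
    return list(difflib.unified_diff(split_sentences(old),
                                     split_sentences(new),
                                     lineterm="", n=0))


if __name__ == "__main__":
    old = "Diff works on lines. HTML puts a paragraph on one line. That hides small changes."
    new = "Diff works on lines. HTML often puts a paragraph on one line. That hides small changes."
    for line in sentence_diff(old, new):
        print(line)
```

Only the sentence that changed shows up as a removed/added pair; the rest of the paragraph is treated as unchanged.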
From the gnu site:
http://www.gnu.org/software/wdiff/wdiff.html
=The program wdiff is a front end to diff for comparing files on a word per word basis. A word is anything between
whitespace. This is useful for comparing two texts in which a few words have been changed and for which paragraphs have
been refilled. It works by creating two temporary files, one word per line, and then executes diff on these files. It
collects the diff output and uses it to produce a nicer display of word differences between the original files.=
--
MathewBoorman - 24 Apr 2000
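For illustration, here is a small in-process approximation of what the quoted description says wdiff does: split both texts into whitespace-separated words, diff the word lists, and mark the differences. It is a sketch in Python using the standard difflib module rather than temporary files and the diff binary, so its output may not match the real wdiff in every case. The function name is hypothetical; the [- -] and {+ +} markers mirror wdiff's documented defaults.

```python
import difflib


def word_diff(old, new, start_del="[-", end_del="-]",
              start_ins="{+", end_ins="+}"):
    """Mark deleted and inserted words between two texts.

    Words are anything between whitespace. Original spacing and line
    breaks are not preserved: the result is joined with single spaces.
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append(start_del + " ".join(a[i1:i2]) + end_del)
        if tag in ("insert", "replace"):
            out.append(start_ins + " ".join(b[j1:j2]) + end_ins)
    return " ".join(out)


if __name__ == "__main__":
    print(word_diff("the quick brown fox", "the slow brown fox"))
```

A refilled paragraph (same words, different line breaks) comes out with no markers at all, which is the property that makes the word-based view attractive for wiki text.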
I once modified cvsweb to do exactly this. You can also use strike-through and the like to show additions and deletions. If I ever get some time, I'll try to add it to TWiki.
--
MattQuail - 20 Jul 2000
I had a go at hacking this in, but didn't get it finished. I suspect the best way to do this is to post-process the output of line-by-line diff, using wdiff to highlight the actual words changed.
See DiffsHardToRecognize for another discussion of this issue.
--
RichardDonkin - 01 Aug 2001
A long time ago I found this somewhere on the web...
It might relax the requirement for the wdiff binary. I don't know whether there will be a performance issue, though.
--
JornH@personNOSPAMPLEASENOSPAM.dk aka
TWikiGuest - 20 Sep 2001
The GNU wdiff supports:
--start-delete argument
    Has the same effect as -w.
-w argument
    Use argument as the "start delete" string. This
    string will be output prior to every sequence of
    deleted text, to mark where it starts. By default,
    no start delete string is used unless there is no
    other means of distinguishing where such text
    starts; in this case the default start delete
    string is [-.
--end-delete argument
    Has the same effect as -x.
-x argument
    Use argument as the "end delete" string. This
    string will be output after every sequence of
    deleted text, to mark where it ends. By default,
    no end delete string is used unless there is no
    other means of distinguishing where such text ends;
    in this case the default end delete string is -].
and
--start-insert argument
    Has the same effect as -y.
-y argument
    Use argument as the "start insert" string. This
    string will be output prior to any sequence of
    inserted text, to mark where it starts. By
    default, no start insert string is used unless
    there is no other means of distinguishing where
    such text starts; in this case the default start
    insert string is {+.
--end-insert argument
    Has the same effect as -z.
-z argument
    Use argument as the "end insert" string. This
    string will be output after any sequence of
    inserted text, to mark where it ends. By default,
    no end insert string is used unless there is no
    other means of distinguishing where such text ends;
    in this case the default end insert string is +}.
So couldn't you extract the file versions and simply call wdiff with -w <s> and -x </s> to get <s>deleted text</s>, and use some other font/color for inserted text? A change would be a deletion followed by an insertion. I tried it on the edited output of a man page and it worked fine.
--
JohnRouillard - 08 Dec 2001
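The wdiff call suggested above could be wrapped roughly like this. It is a hedged sketch in Python, not anything from the TWiki code base: the function names are hypothetical, the choice of <b> for inserted text is just a placeholder for "some other font/color", and the exit-status handling assumes wdiff follows diff's convention of returning 1 when the files differ.

```python
import shutil
import subprocess


def wdiff_command(old_path, new_path, ins_open="<b>", ins_close="</b>"):
    """Build the wdiff argument list for two extracted revisions.

    -w/-x wrap deleted text in <s>...</s> as suggested in the thread;
    the <b> markers passed to -y/-z are an assumption for illustration.
    """
    return ["wdiff",
            "-w", "<s>", "-x", "</s>",
            "-y", ins_open, "-z", ins_close,
            old_path, new_path]


def run_wdiff(old_path, new_path):
    """Return wdiff's marked-up output, or None if wdiff is not installed."""
    if shutil.which("wdiff") is None:
        return None
    result = subprocess.run(wdiff_command(old_path, new_path),
                            capture_output=True, text=True, check=False)
    # Assumption: like diff, wdiff exits with 1 when the files differ,
    # so only higher exit codes are treated as failures here.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

Note that the file contents pass through unescaped, so text containing HTML special characters would need escaping before the result is shown in a browser.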
Unless you are willing to make a fairly significant change to the TWiki code, it's simplest to run a line-based diff first, then run wdiff on each old/new pair of changed lines it reports (the only effect of wdiff is to change the highlighting, i.e. the output will still show some unchanged lines for context). Of course, this would be rather slow, so it might be better to do just a word-based diff instead, or to build a Perl function that does the word diff, which might be more efficient since it avoids running wdiff for every set of changes.
--
RichardDonkin - 12 Dec 2001
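To make the two-pass idea concrete, here is a rough sketch of it done in-process, so no external wdiff is run per change. It is written in Python with difflib purely for illustration (the suggestion above was a Perl function); the function names and the "~ " prefix for word-marked lines are made up, and only replaced blocks with equal line counts are paired up, which is a simplifying assumption.

```python
import difflib


def mark_words(old_line, new_line):
    """Word-level markup for one changed line, using [-..-] and {+..+}."""
    a, b = old_line.split(), new_line.split()
    out = []
    sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


def two_pass_diff(old_text, new_text):
    """Line diff first; word diff only inside replaced line pairs."""
    old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
    sm = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            out.extend("  " + line for line in old_lines[i1:i2])
        elif tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of lines on both sides: pair them up one to one.
            out.extend("~ " + mark_words(o, n)
                       for o, n in zip(old_lines[i1:i2], new_lines[j1:j2]))
        else:
            out.extend("- " + line for line in old_lines[i1:i2])
            out.extend("+ " + line for line in new_lines[j1:j2])
    return out
```

Unchanged lines are never word-diffed, so the extra cost is limited to the lines the first pass already flagged as changed.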
See also DiffsHardToRecognize
--
WolfgangSlany - 31 Dec 2003
I'm going to defer this until Dakar - unless someone wants to play
--
SvenDowideit - 09 May 2004