Diff is generally designed for source code, which is a line-by-line format. HTML, which may put a whole paragraph on one line, makes the diff output less useful when only small changes are made.

I suggest that wdiff (GNU Word Diff) be an optional view when it's not obvious what the changes were.

You could even get fancy and try to select the appropriate diff view depending on how much has changed.

Note this only affects presentation; RCS will still store files using line-based diffs.

-- MathewBoorman - 24 Apr 2000
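
One way the "select the appropriate view" idea could be sketched: compare the words of an old and new line and fall back to the plain line view when too little is shared for word highlighting to help. This is only an illustration in Python using the standard `difflib` module; the function name `choose_view` and the 0.5 threshold are assumptions, not anything TWiki provides.

```python
import difflib

def choose_view(old_line, new_line, threshold=0.5):
    """Pick 'word' when the lines share most of their words, else 'line'.

    The threshold is an arbitrary assumption; it would need tuning on real topics.
    """
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    # ratio() is 2 * matching words / total words in both lines
    return "word" if matcher.ratio() >= threshold else "line"

print(choose_view("the quick brown fox jumps", "the quick red fox jumps"))
print(choose_view("alpha beta gamma", "one two three"))
```

A small edit inside a long paragraph scores high and gets the word view; a paragraph that was rewritten wholesale scores low and stays with the line view.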

Interesting idea. Got an algorithm to do it?

diff works on a line (\n) basis, so maybe break paragraphs down into sentence units and diff on those. I have a feeling it might require more detailed knowledge of the content structure by the TWiki parser.

-- NicholasLee - 24 Apr 2000

Interesting idea. Nicholas' algorithm would be an easy compromise: do a regular diff after breaking up all paragraphs into single sentences ( [\.\?\:] delimiter ). That is what WinDiff does, with acceptable results (WinDiff is the GUI diff tool that comes with Micro$oft Visual Studio).

-- PeterThoeny - 24 Apr 2000
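
The sentence-splitting approach described above could look roughly like the following. It is a sketch in Python rather than TWiki's Perl, using the standard `re` and `difflib` modules; the function names are made up for illustration, and the splitter only breaks after `.`, `?` or `:` when whitespace follows, which is an added assumption so that things like URLs are not cut at the colon.

```python
import difflib
import re

# Break after '.', '?' or ':' (the delimiter set suggested above),
# but only when whitespace follows, so "http://..." stays in one piece.
SENTENCE_END = re.compile(r"(?<=[.?:])\s+")

def split_sentences(paragraph):
    """Turn one long paragraph line into a list of sentence 'lines'."""
    return [s for s in SENTENCE_END.split(paragraph.strip()) if s]

def sentence_diff(old_paragraph, new_paragraph):
    """Run an ordinary line-style diff over the sentence lists (no context lines)."""
    old = split_sentences(old_paragraph)
    new = split_sentences(new_paragraph)
    return list(difflib.unified_diff(old, new, lineterm="", n=0))

old = "Diff is line based. HTML puts a paragraph on one line. Small edits are hard to see."
new = "Diff is line based. HTML often puts a paragraph on one line. Small edits are hard to see."
for line in sentence_diff(old, new):
    print(line)
```

Only the changed sentence shows up as a removed/added pair, instead of the whole paragraph.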

From the GNU site: http://www.gnu.org/software/wdiff/wdiff.html

=The program wdiff is a front end to diff for comparing files on a word per word basis. A word is anything between whitespace. This is useful for comparing two texts in which a few words have been changed and for which paragraphs have been refilled. It works by creating two temporary files, one word per line, and then executes diff on these files. It collects the diff output and uses it to produce a nicer display of word differences between the original files.=

-- MathewBoorman - 24 Apr 2000
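
To illustrate the word-per-line idea quoted above, here is a minimal sketch in Python. It is not how wdiff itself is implemented: instead of writing temporary files and running diff, it compares the two word lists in memory with the standard `difflib` module, and it does not preserve the original whitespace or line breaks. The `[-...-]` and `{+...+}` markers follow wdiff's documented defaults; the function name `word_diff` is made up.

```python
import difflib

def word_diff(old_text, new_text):
    """Mark deleted words as [-...-] and inserted words as {+...+}."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)

print(word_diff("the quick brown fox", "the quick red fox"))
# prints: the quick [-brown-] {+red+} fox
```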

I once made a modification to cvsweb that did exactly this. You can also use strike-through and such to show additions/deletions. If I ever get some time, I'll try to add it to TWiki.

-- MattQuail - 20 Jul 2000

I had a go at hacking this in, but didn't get it finished. I suspect the best way to do this is to post-process the output of line-by-line diff, using wdiff to highlight the actual words changed.

See DiffsHardToRecognize for another discussion of this issue.

-- RichardDonkin - 01 Aug 2001

A long time ago I found this somewhere on the web...

It might ease the requirement for the wdiff binary. I don't know if there will be a performance issue, though.

  • HtmlDiff.pl: A pure Perl solution not requiring wdiff

-- JornH@personNOSPAMPLEASENOSPAM.dk aka TWikiGuest - 20 Sep 2001

The GNU wdiff supports:

       --start-delete argument
              Has the same effect as -w.

       -w argument
              Use argument as the "start  delete"  string.   This
              string  will  be  output prior to every sequence of
              deleted text, to mark where it starts.  By default,
              no  start  delete string is used unless there is no
              other  means  of  distinguishing  where  such  text
              starts;  in  this  case  the  default  start delete
              string is [-.


       --end-delete argument
              Has the same effect as -x.

       -x argument
              Use argument as  the  "end  delete"  string.   This
              string  will  be  output  after  every  sequence of
              deleted text, to mark where it ends.   By  default,
              no  end  delete  string  is used unless there is no
              other means of distinguishing where such text ends;
              in this case the default end delete string is -].

and

       --start-insert argument
              Has the same effect as -y.

       -y argument
              Use  argument  as  the "start insert" string.  This
              string will be output  prior  to  any  sequence  of
              inserted   text,  to  mark  where  it  starts.   By
              default, no start  insert  string  is  used  unless
              there  is  no  other  means of distinguishing where
              such text starts; in this case  the  default  start
              insert string is {+.


       --end-insert argument
              Has the same effect as -z.

       -z argument
              Use  argument  as  the  "end  insert" string.  This
              string  will  be  output  after  any  sequence   of
              inserted  text, to mark where it ends.  By default,
              no end insert string is used  unless  there  is  no
              other means of distinguishing where such text ends;
              in this case the default end insert string is +}.

So couldn't you extract the file versions and simply call wdiff with -w <s> and -x </s> to mark deleted text, and use some other font/color for inserted text? A change would be a deletion followed by an insertion. I tried it on the edited output of a man page and it worked fine.

-- JohnRouillard - 08 Dec 2001
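
A rough sketch of that invocation, in Python for illustration. The `-w`, `-x`, `-y` and `-z` options are the ones quoted from the man page above; using `<ins>`/`</ins>` for inserted text is an assumption (any font or colour markup would do), and both function names are made up. The call is skipped when no wdiff binary is on the PATH. `check=True` is deliberately not used, on the assumption that wdiff, like diff, exits non-zero when the files differ.

```python
import shutil
import subprocess

def wdiff_command(old_path, new_path):
    """Build the wdiff argument list: strike-through for deletions, <ins> for insertions."""
    return [
        "wdiff",
        "-w", "<s>", "-x", "</s>",      # start/end delete strings
        "-y", "<ins>", "-z", "</ins>",  # start/end insert strings (assumed markup)
        old_path, new_path,
    ]

def run_wdiff(old_path, new_path):
    """Return wdiff's marked-up output, or None if wdiff is not installed."""
    if shutil.which("wdiff") is None:
        return None
    result = subprocess.run(
        wdiff_command(old_path, new_path),
        capture_output=True,
        text=True,
    )
    return result.stdout

print(" ".join(wdiff_command("old.txt", "new.txt")))
```

The two paths would be the extracted file versions. The text between the markers is raw topic content, so it would still need HTML escaping before being shown in a page.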

Unless you are willing to make a fairly significant change to the TWiki code, it's simplest to run a line-based diff first, then run wdiff on each pair of changed line blocks it reports (the only effect of wdiff is to change the highlighting, i.e. it will still show some unchanged lines for context). Of course, this would be rather slow, so it might be better to do just a word-based diff instead, or to build a Perl function that does the word diff, which might be more efficient since it avoids running wdiff for every set of changes.

-- RichardDonkin - 12 Dec 2001
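
The two-stage idea (line diff first, word highlighting only inside changed lines) can be sketched in-process, which also avoids starting wdiff once per change. This is an illustration in Python with the standard `difflib` module, not the Perl function suggested above; the function names and the `~`, `-`, `+` line prefixes are made up for the example.

```python
import difflib

def mark_words(old_line, new_line):
    """Word-level highlighting for one changed line, using wdiff-style markers."""
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)

def two_stage_diff(old_text, new_text):
    """Line diff first; lines replaced one-for-one get word-level markers."""
    old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend("  " + line for line in old_lines[i1:i2])
        elif tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs = zip(old_lines[i1:i2], new_lines[j1:j2])
            out.extend("~ " + mark_words(o, n) for o, n in pairs)
        else:
            # pure insertions/deletions, or uneven replacements: plain line view
            out.extend("- " + line for line in old_lines[i1:i2])
            out.extend("+ " + line for line in new_lines[j1:j2])
    return out

old = "intro\nthe quick brown fox\noutro"
new = "intro\nthe quick red fox\noutro"
print("\n".join(two_stage_diff(old, new)))
```

Unchanged lines are still shown for context, and only the lines the line diff flags pay the cost of the word comparison.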

See also DiffsHardToRecognize

-- WolfgangSlany - 31 Dec 2003

I'm going to defer this until Dakar, unless someone wants to play.

-- SvenDowideit - 09 May 2004

Topic attachments
| *Attachment* | *History* | *Size* | *Date* | *Who* | *Comment* |
| HtmlDiff.pl | r1 | 5.6 K | 2001-09-20 - 22:28 | TWikiGuest | A pure perl solution not requiring wdiff |
Topic revision: r17 - 2006-04-29 - SamHasler
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.