Bug: Preview Does Not Catch Unquoted hrefs and Catches hrefs Outside Anchors
When previewing, all links should be disabled. However, if an HTML hyperlink does not wrap its target in quotes, the regular expression filter does not catch it. Conversely, an href that appears outside an anchor tag still gets caught. Forms and onclicks are not caught either.
That is, this would not get caught:
<a href=http://www.google.com>google</a>
and this would:
href="not_a_link"
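The two cases above can be reproduced outside of TWiki. This is a minimal sketch, assuming the original substitution from preview is applied to a plain string; the `$oops` value and the `old_filter` name are hypothetical stand-ins, not part of the preview script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the oops URL that preview substitutes.
my $oops = '/cgi-bin/oops/Sandbox/TestTopic';

# The original filter: it only looks for a double-quoted href, anywhere in the text.
sub old_filter {
    my ($text) = @_;
    $text =~ s{(href=".*?")}{href="$oops?template=oopspreview"}gi;
    return $text;
}

my $unquoted = '<a href=http://www.google.com>google</a>';
my $bare     = 'href="not_a_link"';

print old_filter($unquoted), "\n";   # left unchanged, so the link stays live
print old_filter($bare), "\n";       # rewritten, although it is not a link
```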
Changing the regular expression to one that's a little smarter seems to fix this (see patch below). I think I've anticipated nearly every href case, but I'm sure I've forgotten something. I'm nearly positive there's a CPAN resource that already knows how to do all this... but I'm too lazy to look.
Note that on 17 Aug 2003, I changed this patch to find onclicks and forms as well. These are not tested here, but I think it's easy to see that if the hrefs below cause problems, forms and onclicks will as well. The patch below corrects all problems mentioned here (to at least some degree).
Test case
Preview this topic and view these two cases (the same as the ones above):
<a href=http://www.google.com>google</a> (bug: not caught by preview)
href="not_a_link" (bug: caught by preview)
To fix this, try applying this patch to preview:
===================================================================
RCS file: preview,v
retrieving revision 1.1
retrieving revision 1.5
diff -r1.1 -r1.5
143c143,146
< $ptext =~ s@(href=".*?")@href="/cgi-bin/oops/Codev/PreviewDoesNoQuoteHrefs\?template=oopspreview"@goi;
---
> $ptext =~ s@(?<=<a\s)([^>]*)(href=(?:".*?"|[^"].*?(?=[\s>])))@$1href="/cgi-bin/oops/Codev/PreviewDoesNoQuoteHrefs\?template=oopspreview"@goi;
> $ptext =~ s@<form(?:|\s.*?)>@<form action="/cgi-bin/oops/Codev/PreviewDoesNoQuoteHrefs">\n<input type="hidden" name="template" value="oopspreview">@goi;
> $ptext =~ s@(?<=<)([^\s]+?[^>]*)(onclick=(?:"location.href='.*?'"|location.href='[^']*?'(?=[\s>])))@$1onclick="location.href='/cgi-bin/oops/Codev/PreviewDoesNoQuoteHrefs\?template=oopspreview'"@goi;
>
The regular expression is a bit more complicated. To illustrate, I'll diagram the href regex (the first regex):
- It makes sure that <a comes before an href with a separator between the two
- It makes sure that there is no > between the <a and the href
- It looks for an href=
- It then looks for a URL included in quotes or an unquoted URL with a space or a > after it
- It doesn't grab more than it needs
- And it preserves whatever was in front of and behind the href
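The steps above can be read off the regex more easily when it is written with the /x modifier. This is a sketch of the same pattern, only spread over several lines with comments; the `disable_hrefs` name and the `$oops` value are hypothetical and stand in for the surrounding preview code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the oops URL; the patch itself uses the TWiki script path.
my $oops = '/cgi-bin/oops/Sandbox/TestTopic';

sub disable_hrefs {
    my ($text) = @_;
    $text =~ s{
        (?<=<a\s)         # start only right after an opening anchor plus one whitespace character
        ([^>]*)           # capture 1: anything in front of the href, never crossing the end of the tag
        (                 # capture 2: the href attribute itself
          href=
          (?: ".*?"                 # a double-quoted URL, matched lazily
            | [^"].*?(?=[\s>])      # or an unquoted URL, stopping before whitespace or the end of the tag
          )
        )
    }{${1}href="$oops?template=oopspreview"}gix;
    return $text;
}

print disable_hrefs('<a href=http://www.google.com>google</a>'), "\n";
print disable_hrefs('href="not_a_link"'), "\n";
print disable_hrefs('<a class="x" href="http://example.com/" target="_top">t</a>'), "\n";
```

Because the text after the href is only checked with a lookahead and never consumed, attributes that follow the href (such as the target in the last case) stay in place.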
I know that URLs should have quotes around them, but a decent number of people will forget quotes, I'm sure.
The onclick regex is very similar, and the form regex is much simpler.
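For completeness, here is a small sketch that runs the form and onclick substitutions from the patch on sample strings. The sub names, the `$oops` value, and the sample HTML are hypothetical; the patterns are the ones from the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the oops URL.
my $oops = '/cgi-bin/oops/Sandbox/TestTopic';

# Replace any opening form tag with one that posts to the oops script.
sub disable_forms {
    my ($text) = @_;
    $text =~ s{<form(?:|\s.*?)>}{<form action="$oops">\n<input type="hidden" name="template" value="oopspreview">}gi;
    return $text;
}

# Redirect onclick handlers of the form location.href='...' inside any tag.
sub disable_onclicks {
    my ($text) = @_;
    $text =~ s{(?<=<)([^\s]+?[^>]*)(onclick=(?:"location.href='.*?'"|location.href='[^']*?'(?=[\s>])))}{${1}onclick="location.href='$oops?template=oopspreview'"}gi;
    return $text;
}

print disable_forms('<form action="/save" method="post">'), "\n";
print disable_onclicks(q{<input type="button" onclick="location.href='http://example.com/'">}), "\n";
```

Note that the onclick pattern only matches handlers that consist of a location.href assignment; other onclick code is left as it is.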
Environment
| TWiki version: | February 2003 |
| TWiki plugins: | none |
| Server OS: | Linux |
| Web server: | Apache |
| Perl version: | 5.6.0 |
| Client OS: | Windows 2000 |
| Web Browser: | MSIE6.0 |
-- TedPavlic - 18 Jul 2003
-- TedPavlic - 17 Aug 2003
Follow up
Fix record
diff -r1.45 preview
143,144c143,148
< # do not allow click on link before save:
< $ptext =~ s@(href=".*?")@href="%SCRIPTURLPATH%/oops%SCRIPTSUFFIX%/%WEB%/%TOPIC%\?template=oopspreview"@goi;
---
> # do not allow click on link before save: (mods by TedPavlic)
> my $oopsUrl = '%SCRIPTURLPATH%/oops%SCRIPTSUFFIX%/%WEB%/%TOPIC%';
> $ptext =~ s@(?<=<a\s)([^>]*)(href=(?:".*?"|[^"].*?(?=[\s>])))@$1href="$oopsUrl?template=oopspreview"@goi;
> $ptext =~ s@<form(?:|\s.*?)>@<form action="$oopsUrl">\n<input type="hidden" name="template" value="oopspreview">@goi;
> $ptext =~ s@(?<=<)([^\s]+?[^>]*)(onclick=(?:"location.href='.*?'"|location.href='[^']*?'(?=[\s>])))@$1onclick="location.href='$oopsUrl\?template=oopspreview'"@goi;
Fix is now in TWikiAlphaRelease (for CairoRelease) and TWiki.org. Thanks Ted
-- PeterThoeny - 29 Aug 2003
Category: TWikiPatches