Question
HTML 4.01 states:
2.1.2 Fragment identifiers
Some URIs refer to a location within a resource. This kind of URI ends with "#" followed by an anchor identifier (called the fragment identifier). For instance, here is a URI pointing to an anchor named section_2:
http://somesite.com/html/top.html#section_2
I believe that in TWiki fragment identifiers are required to be WikiWords. I believe this requirement is too strict and that htmlization of fragment identifiers is not complex enough to justify this requirement. HTML 4.01 only specifies that the fragment identifier (which must correspond to an anchor name) is cdata:
name = cdata [CS]
This attribute names the current anchor so that it may be the destination of another link. The value of this attribute must be a unique anchor name. The scope of this name is the current document. Note that this attribute shares the same name space as the id attribute.
The way to link to a fragment identifier in TWiki is:
[[#AnchorRegex][user-text]]
and the way to define the anchor it points to is:
#AnchorRegex user-text
where:
$upperAlpha = "[:upper:]";
$lowerAlpha = "[:lower:]";
$numeric = "[:digit:]";
$mixedAlpha = "[:alpha:]";
$mixedAlphaNum = "${mixedAlpha}${numeric}";
$anchorRegex = qr/\#[${mixedAlphaNum}_]+/;
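To see what this pattern accepts on its own, here is a minimal sketch that rebuilds it from the definitions above and tries a few candidate anchors. The helper name `is_current_anchor` and the sample strings are made up for illustration, and the sketch only exercises the regex in isolation; it does not reproduce how TWiki applies it while rendering a topic.

```perl
use strict;
use warnings;

# Character classes and pattern copied from the definitions above.
my $numeric       = "[:digit:]";
my $mixedAlpha    = "[:alpha:]";
my $mixedAlphaNum = "${mixedAlpha}${numeric}";
my $anchorRegex   = qr/\#[${mixedAlphaNum}_]+/;

# Hypothetical helper: returns 1 only when the whole string is one anchor.
sub is_current_anchor {
    my ($candidate) = @_;
    return $candidate =~ /\A$anchorRegex\z/ ? 1 : 0;
}

# Sample anchors chosen for illustration only.
for my $candidate ('#SomewhereElse', '#section_2', '#sec-2.1', '#a:b') {
    printf "%-16s %s\n", $candidate,
        is_current_anchor($candidate) ? 'matches' : 'no match';
}
```

Letters, digits and underscores pass, while the hyphen, period and colon that HTML 4 allows in names are rejected.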
However, please note that the XHTML 1.0 (Second Edition) spec says:
C.8. Fragment Identifiers
In XML, URI-references [RFC2396] that end with fragment identifiers of the form "#foo" do not refer to elements with an attribute name="foo"; rather, they refer to elements with an attribute defined to be of type ID, e.g., the id attribute in HTML 4. Many existing HTML clients don’t support the use of ID-type attributes in this way, so identical values may be supplied for both of these attributes to ensure maximum forward and backward compatibility (e.g., <a id="foo" name="foo">...</a>).
Further, since the set of legal values for attributes of type ID is much smaller than for those of type CDATA, the type of the name attribute has been changed to NMTOKEN. This attribute is constrained such that it can only have the same values as type ID, or as the Name production in XML 1.0 Section 2.3, production 5. Unfortunately, this constraint cannot be expressed in the XHTML 1.0 DTDs. Because of this change, care must be taken when converting existing HTML documents. The values of these attributes must be unique within the document, valid, and any references to these fragment identifiers (both internal and external) must be updated should the values be changed during conversion.
Note that the collection of legal values in XML 1.0 Section 2.3, production 5 is much larger than that permitted to be used in the ID and NAME types defined in HTML 4. When defining fragment identifiers to be backward-compatible, only strings matching the pattern [A-Za-z][A-Za-z0-9:_.-]* should be used. See Section 6.2 of [HTML4] for more information.
Finally, note that XHTML 1.0 has deprecated the name attribute of the a, applet, form, frame, iframe, img, and map elements, and it will be removed from XHTML in subsequent versions.
Thus in TWiki, I believe the following enhancement should be applied:
$anchorRegex = qr/\#[A-Za-z][A-Za-z0-9:_.-]*/;
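As a rough check of what such a change would allow, the sketch below tests a few anchors against the backward-compatible pattern quoted from the XHTML spec, prefixed with the leading "#". The variable `$compatAnchorRegex`, the helper `is_compat_anchor` and the sample strings are assumptions made for this example and are not taken from the TWiki source.

```perl
use strict;
use warnings;

# Pattern from the XHTML 1.0 compatibility guideline quoted above,
# prefixed with the '#' that introduces a fragment identifier.
my $compatAnchorRegex = qr/\#[A-Za-z][A-Za-z0-9:_.-]*/;

# Hypothetical helper: returns 1 only when the whole string is one anchor.
sub is_compat_anchor {
    my ($candidate) = @_;
    return $candidate =~ /\A$compatAnchorRegex\z/ ? 1 : 0;
}

# Sample anchors chosen for illustration only.
for my $candidate ('#testingthreefour', '#sec-2.1', '#a:b', '#2nd', '#_x') {
    printf "%-20s %s\n", $candidate,
        is_compat_anchor($candidate) ? 'matches' : 'no match';
}
```

Anchors containing hyphens, periods or colons are now accepted, while anchors that start with a digit or an underscore are rejected because the first character must be a letter.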
and perhaps, as a secondary consideration, the use of name should be changed to (or used in conjunction with) the id attribute (though I'm not sure how this might affect styles, etc).
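One way the id/name suggestion could look is sketched below: a helper that emits an anchor element carrying identical id and name values, as in the XHTML spec's own example. The function `render_anchor` is hypothetical and is not TWiki's rendering code; it also assumes the link text passed in is already HTML-safe.

```perl
use strict;
use warnings;

# Backward-compatible name pattern from the XHTML 1.0 guideline (no leading '#').
my $compatName = qr/[A-Za-z][A-Za-z0-9:_.-]*/;

# Hypothetical helper, not part of TWiki: builds an anchor element with
# identical id and name attributes, or returns nothing for an invalid name.
sub render_anchor {
    my ($name, $text) = @_;
    return unless defined $name && $name =~ /\A$compatName\z/;
    $text = '' unless defined $text;    # assumed to be HTML-safe already
    return qq{<a id="$name" name="$name">$text</a>};
}

print render_anchor('SomewhereElse', ''), "\n";
```

Because the accepted characters exclude quotes and angle brackets, the name can be placed in the attributes without further escaping.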
Environment
--
JonathanCline - 25 Aug 2003
Answer
I don't agree that an #anchor_reference must be a WikiWord. The URL http://twiki.org/cgi-bin/view/Support/FragmentIdentifierWikification#Answer would be a valid URL to the Answer heading; in fact, this is how the %TOC% also works. As for the XML+Regex part of this topic, I'm in deep water so I won't comment on that.
--
TorbenGB - 26 Aug 2003
Let me clarify. You are correct, anchors need not be wiki words. Many anchors are created within the heading scope automatically. The #Answer anchor above is a result of automatic anchoring within h2, via:
<h2><a name="Answer"> Answer </a></h2>
This enhancement request refers only to fragment identifiers. These are generated when the user types:
this is a sentence which refers to [[#SomewhereElse][somewhere else below]]
...
#SomewhereElse this is the anchor for somewhere else
I'll add a test vector below:
How to reproduce the problem
testing one two this one will work
testing three four this one will not although the anchor is valid
testing one two got here
#testingthreefour testing three four will never get here because twiki does not make an anchor from this line
--
JonathanCline - 26 Aug 2003
See also
NonRestrictiveWikiAnchors
--
JonathanCline - 26 Aug 2003
I have also thought this is too restrictive.
--
MartinCleaver - 27 Aug 2003