TWiki Formatted Search in Topics
Abstract
%SEARCH is extended to permit formatting and
displaying of each location in a topic which
meets the specified search criteria.
Overview
Normally, the results of a FormattedSearch summarize each topic meeting the search criteria. With the format="..." parameter there is considerable flexibility in how the information for each topic found can be presented. With the patch to TWiki described here, %SEARCH is extended to permit formatting and displaying of each location in a topic which meets the specified search criteria. This is done by adding a hitformat parameter to %SEARCH and generalizing the format string in an upward-compatible manner.
Syntax
To indicate that a FormattedSearchinTopics is desired, a hitformat="..." parameter is specified in the %SEARCH{...}%. Whenever a hit is found in a topic, the string specified by hitformat is used to format the hit, with the string "$hit" in the string being replaced by the indicated text in the hit. (See the discussion below on $pattern(...) for how the "indicated text" is specified.)
format="..." is used similarly to a normal FormattedSearch, except that the $pattern(...) variable is mandatory and is used to specify exactly what in a topic will constitute a hit. Unlike a normal FormattedSearch, the character following $pattern need not be a "("; it may be any character and constitutes the starting delimiter. The ending delimiter of the $pattern specification is the same character, except that '(' has the ending delimiter ')', '<' has '>', '{' has '}', and '[' has ']'. Any use of the starting or ending delimiter in the actual pattern must be preceded by '\'.
This generalization of delimiters is available because, as discussed below, every pattern string must indicate the hit by use of (...) within the string, and it gets untidy adding '\' everywhere. A wise choice of delimiters allows easier specification of a pattern string without excessive '\'s.
Examples of pattern strings would be
$pattern(abc\(def\)ghi)
$patternxabc(def)ghix
$pattern@abc(def)ghi@
$pattern<abc(def)ghi>
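The delimiter rule can be illustrated with a short sketch. The Python below is not taken from the patch (which changes the Perl module lib/TWiki/Search.pm); the function name split_format and the choice to drop the '\' in front of an escaped delimiter are assumptions made purely for illustration.

```python
# Closing delimiter for each bracket-style opener; any other opener closes itself.
PAIRS = {"(": ")", "<": ">", "{": "}", "[": "]"}


def split_format(fmt):
    """Split a format string into (prepattern, pattern, postpattern).

    Hypothetical helper: it only illustrates the delimiter rule described
    above, it is not the code used by the patch.
    """
    marker = "$pattern"
    start = fmt.index(marker)      # raises ValueError: $pattern is mandatory
    open_pos = start + len(marker)
    opener = fmt[open_pos]         # the character right after $pattern
    closer = PAIRS.get(opener, opener)

    pattern = []
    i = open_pos + 1
    while i < len(fmt):
        ch = fmt[i]
        if ch == "\\" and i + 1 < len(fmt):
            nxt = fmt[i + 1]
            if nxt in (opener, closer):
                pattern.append(nxt)       # escaped delimiter (assumed: '\' is dropped)
            else:
                pattern.append(ch + nxt)  # ordinary regex escape, kept unchanged
            i += 2
            continue
        if ch == closer:
            # Unescaped closing delimiter: the pattern ends here.
            return fmt[:start], "".join(pattern), fmt[i + 1:]
        pattern.append(ch)
        i += 1
    raise ValueError("unterminated $pattern specification")
```

Under these assumptions, all four example strings above produce the same pattern string, abc(def)ghi; only the amount of '\' escaping needed to write it differs.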
The pattern string is a Perl RegularExpression, in which each nested (...) (or \(...\) if '(' and ')' were chosen as delimiters) specifies what will be substituted for $hit in the hitformat="..." string if the entire pattern string matches some text in the topic. This permits only a portion of a hit to be selected. Note that the search within a topic is always case-insensitive.
To prevent display of the actual %SEARCH{...}% string in the search results, hits with the string "hitformat=" in them are ignored, and hits with "%SEARCH" in them will not cause a second %SEARCH to be performed.
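As a rough illustration of the matching rules above, the following Python sketch applies a pattern string to the text of one topic. It is not the patch's Perl code: the function name format_hits is hypothetical, Python's re module stands in for Perl regular expressions, and treating every (...) group of a match as a separate hit is an assumption.

```python
import re


def format_hits(pattern, hitformat, text):
    """Return the formatted hits found in one topic's text (illustrative only)."""
    formatted = []
    # The search within a topic is case-insensitive.
    for match in re.finditer(pattern, text, re.IGNORECASE):
        # Each (...) group in the pattern marks text to substitute for $hit.
        for group in match.groups():
            if group is None:
                continue  # this group did not take part in the match
            if "hitformat=" in group:
                continue  # ignore hits that show the %SEARCH{...}% call itself
            formatted.append(hitformat.replace("$hit", group))
    return formatted
```

For example, with the pattern ([^\n\r]*GPL[^\n\r]*) and hitformat " * $hit<br>", a topic line reading "See gpl.html." is reported as a hit even though "gpl" is in lower case, while a line containing the %SEARCH{...}% call itself is dropped because it contains "hitformat=".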
There are several processing steps in a FormattedSearchinTopics, and different information is available at each step for output as the results:
- The webs specified by the web="..." parameter are searched for topics matching the "text", search="text", and topic="..." parameters. It is these parameters (as modified by regex="..." parameters, etc.) which determine what topics will be inspected for hits. In essence, this is a first-level quick search to limit the topics which will be searched more carefully for hits.
- For these topics, the hitformat and format parameter strings are used to format the output as follows:
  - The hitformat string is divided into three parts: the prehit text before the "$hit" string, the "$hit" string itself, and the posthit text after the "$hit" string.
  - The format string is also divided into three parts: the prepattern string before the $pattern(...), the pattern string in the $pattern(...), and the postpattern string after the $pattern(...).
  - The pattern string has various hit strings indicated by (...).
- Taking liberties with new lines, the output of a successful FormattedSearchinTopics will be presented as follows:
Optional header string as specified by the header="..." parameter
prepattern string for topic 1
  prehit string
    hit #1 in topic 1
  posthit string
  prehit string
    hit #2 in topic 1
  posthit string
  ...
postpattern string for topic 1
prepattern string for topic 2
  prehit string
    hit #1 in topic 2
  posthit string
  ...
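The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patch itself: the function render and its parameters are hypothetical, topics is assumed to be a mapping from topic name to topic text that has already passed the first-level search, only the $topic variable is expanded, and topics without any hit are assumed to produce no output.

```python
import re


def render(header, prepattern, pattern, postpattern, hitformat, topics):
    """Assemble search output in the order shown above (illustrative only)."""
    # Split hitformat into the text before and after "$hit".
    prehit, _, posthit = hitformat.partition("$hit")

    parts = [header] if header else []
    for name, text in topics.items():
        # Collect every (...) group of every case-insensitive match as a hit.
        hits = [
            group
            for match in re.finditer(pattern, text, re.IGNORECASE)
            for group in match.groups()
            if group is not None
        ]
        if not hits:
            continue  # assumption: a topic without hits is not listed
        parts.append(prepattern.replace("$topic", name))
        for hit in hits:
            parts.append(prehit + hit + posthit)
        parts.append(postpattern.replace("$topic", name))
    # "Taking liberties with new lines": one line per piece.
    return "\n".join(parts)
```

Calling render with a header, a prepattern such as "Topic: $topic", and two topics that each contain hits yields the header once, followed for each topic by its prepattern line, one prehit/hit/posthit line per hit, and its postpattern line.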
Example
If we want to display the paragraphs wherever "GPL" is mentioned, a search like:
%SEARCH{ "GPL" hitformat=" * $hit<br>" scope="text" regex="on" nosearch="on" nototal="on" header="*Web: $web*" format="<br>Topic: [[$topic]]<br>$pattern(\([^\n\r]*GPL[^\n\r]*\))<br>"}%
would generate output in a virgin Feb2003 release of TWiki like:
Web: TWiki
Topic: FormattedSearchinTopics
* If we want to display the paragraphs wherever "GPL" is mentioned, a search like:
Topic: GnuGeneralPublicLicense
* TWiki has a GPL (GNU General Public License). What is GPL?
* TWiki is distributed under the GNU General Public License, see TWikiDownload. GPL is one of the free software licenses that protects the copyright holder, and at the same time allows users to redistribute the software under the terms of the license. Extract:
* * See the GNU General Public License for more details, published at http://www.gnu.ai.mit.edu/copyleft/gpl.html
* Please note that TWiki is not distributed under the LGPL (Lesser General Public Licence), which implies TWiki can only be used with software that is licensed under conditions compliant with the GPL. Embedding in proprietary software requires an alternative license. Contact the author for details.
Topic: TWikiFuncModule
* http://www.gnu.org/copyleft/gpl.html
Topic: TWikiSite
* * TWiki is developed as Free Software under the GNU/GPL
Topic: WebHome
* * TWiki is developed as Free Software under the GNU/GPL
Related Discussions
-- HarryFelder - 07 Mar 2003
Harry, this is a well defined spec and implementation. It is also powerful and flexible because you can specify a pre-pattern for each topic.
Nevertheless, the MultipleSearchesInSameTopic is easier to understand and use. This is what is now in the core code. The topic="" parameter is pending, see SearchTopicNameAndTopicText.
-- PeterThoeny - 29 Sep 2003
Harry, I tried to install your patch but it failed.
The patch wants to change this file:
lib/TWiki/Search.pm~ Sat Jan 4 20:36:46 2003
But the installation file is actually:
Jan 5 2003 lib/TWiki/Search.pm
Any chance of fixing this? I don't want to wait for the next version of TWiki to appear and I find your spec. easy to understand...
Peter, why not put both implementations into the core?
-- SimonHardyFrancis - 15 Jul 2004
Category: TWikiPatches