Question
Where can I find more information to help me understand the syntax of searches in TWiki? I was studying
FormattedSearch today to begin to understand how to develop some of my own searches and realized there were some basic concepts about the syntax of a search that are not covered there. I am referring to elements such as
"scope",
"regex",
"nosearch", etc. Perhaps this is basic syntax related to Unix searches or something, but a quick Google search turned up nothing obviously related. I also dug around in the TWiki documentation but found no further explanation about searches.
Just a little direction on where I might look for more information would be much appreciated. If I find some useful stuff, perhaps I could add some more basic instructions on searches to the TWiki documentation.
Thanks!
- TWiki version: Dec 01
- Web server: Apache
- Server OS:
- Web browser:
- Client OS:
--
LynnwoodBrown - 18 May 2002
Answer
Take a look at the (first?) table in
Codev.InlineSearch -- it provides some hint about each of the parameters you asked about. If that's not enough, post a more specific question.
--
RandyKramer - 19 May 2002
Thanks Randy - That's what I was looking for. Looking back at
FormattedSearch, you would have thought that I would have followed the link to
TWikiVariables which has much the same information. I guess I just wasn't thinking of the search function as a variable but of course it is.
My next step is to study
RegularExpression. The individual elements of a regular expression are simple enough, but as they get grouped into more complex expressions, I quickly get lost. I'd love to find a tutorial not so much on how to
build a regular expression, as how to
deconstruct one.
Thanks again!
--
LynnwoodBrown - 19 May 2002
I'd like to help you with regular expressions, and someday I probably will, but right now the information I have is:
- not easily accessible
- not well organized
I have some stuff on my home TWiki. I plan to eventually move it to WikiLearn, and may make some steps in that direction in the next week or so.
Also, on
Codev.TWikiInLeo, I started annotating some parts of TWiki, including some of the regular expressions (one generated 10 long lines of comments). The thing is, you'd have to download Leo and my file, and then search for the regular expression I annotated, and it may not be one you are interested in.
A few months ago (when I annotated part of TWiki), I was "into" regular expressions. At this point I remember only a few things. There are some tricks to watch for -- the normal syntax of a Perl substitution "command" is something like s/<regex>/<replacement>/. However, the delimiter ("/") can be almost any other character (maybe it can be any other character), which seems very confusing. (And, I actually found at least one substitution command that did use a different character as the delimiter -- I forget whether it was in TWiki (doubtful) or somewhere else.)
The way I expect to "learn" regular expressions is by "deconstructing" them (I hope we're on the same wavelength here) -- that is taking a regular expression and analyzing it to figure out what it does. My intent would then be to document each such regular expression that I "deconstruct" either on WikiLearn or in something like Leo docs for a particular program.
If you deconstruct any interesting regular expressions and want to document them in a similar manner on WikiLearn, your contribution will be appreciated.
Come to think of it, I think I'll go look for that "10 liner" and put it on WikiLearn -- after I get it there I'll come back here and tell you the page name.
I'm back -- it's on
RegexRegularExpressionsExplained (it may get moved sometime in the future, perhaps to a page like
RedTWiki20010901TWikiPmExtractNameValuePair, to reflect the subroutine that the regular expression came from, in the 20010901 release of TWiki).
I don't know whether the page will appear properly in view -- you might need to view it in edit or raw view to see it correctly.
But now, digressing, why do you want to learn about regular expressions, and specifically in TWiki? Do you want to learn the TWiki code in order to help develop or customize it? If so, and depending on your level of experience in Perl and so forth (HTML, CGI, etc.), maybe you'd be interested in helping with an effort I started to document the TWiki code for Perl and TWiki newbies?
You can view what I started at
TWikiInLeo. It is very rough, incorrect, and, unfortunately, based on an old release of TWiki. (I started my efforts on the 20010315 release (beta) of TWiki, then restarted on the 20010901 release, which is not the current release.)
If you are interested in learning TWiki, and are something of a newbie in Perl or TWiki (or both), let's talk some more. Leo might not be the best choice for what I started, so we can consider something different. (But Leo isn't bad, and is continuing to be developed.)
--
RandyKramer - 19 May 2002
Randy - Thanks for the additional info! You got exactly what I meant by "deconstructing" regular expressions. My primary intent is simply to be able to use searches in my TWiki pages. Even just copying and modifying a search out of some topic here or on TWiki.org can be a big challenge (trying to figure out what part to change, etc.). This is why I've made a call to create something like a
TWikiToolChest that includes a whole bunch of pre-defined and annotated searches that a user can copy and modify.
Actually I had already noted with interest the stuff you were doing with
TWikiInLeo. I'm not yet ready to delve into the code of TWiki, but if I keep using it like I have been these past couple of months, who knows? And when I do, what you're working on would be just the thing!
- Just so as not to mislead you, I'm not actively working on that at this time -- I hope to someday get back into it, but I also have hopes that others who are interested might build on it before I get back to it. --rhk
I will look further at
RegexRegularExpressionsExplained and will contribute a deconstruction example if I do one. Actually, the kinds of regular expressions I'm interested in are probably pretty simple. However, even then, they can be difficult to understand if one does not have any background with regular expressions.
Just to take one example, here's a search example from
FormattedSearch:
%SEARCH{ "[T]opicClassification.*?value=\"[P]ublicFAQ\"" scope="text" regex="on" nosearch="on" nototal="on" format="| [[$topic]] | $formfield(OperatingSystem) | $formfield(OsVersion) |" }%
I assume that the section
"[T]opicClassification.*?value=\"[P]ublicFAQ\"" is a regular expression. It's the kind of search that I could see using a lot because it's looking for the meta data associated with a form. I would like to break this down to better understand what it's supposed to do. For example, why is the "T" in TopicClassification enclosed in brackets? From my bit of reading about regular expressions, I understand that brackets mean "match any one of the letters inside", yet here there is only one letter.
Well, I'll figure it out eventually and thanks again for the help!
--
LynnwoodBrown - 20 May 2002
Re the last question - I think the use of [T]opicClassification is to match exactly
TopicClassification but not have the regular expression turned into a
WikiWord HTML link by TWiki when rendering the page.
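For anyone who wants to see what each piece of that search string does, here is a minimal sketch. It uses Python's re module instead of Perl purely for illustration (the constructs shown behave the same way in both), and the sample line of form data is an assumption about how the field might look in a topic's raw text, not something copied from a real topic.

```python
import re

# Hypothetical line of form meta data as it might appear in a topic's raw text.
sample = '%META:FIELD{name="TopicClassification" title="TopicClassification" value="PublicFAQ"}%'

# The pattern from the FormattedSearch example, piece by piece:
#   [T]opicClassification   a one-letter character class, then literal text
#   .*?                     any characters, as few as possible (non-greedy)
#   value=\"                literal text; \" is just an escaped double quote
#   [P]ublicFAQ\"           the same one-letter class, then the closing quote
pattern = re.compile(r'[T]opicClassification.*?value=\"[P]ublicFAQ\"')

match = pattern.search(sample)
print(match.group(0))

# A topic classified differently does not match.
other = sample.replace("PublicFAQ", "PrivateFAQ")
print(pattern.search(other))
```

The first print shows the stretch of text the pattern covers, starting at the field name and ending at the closing quote of the value; the second prints None.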
You can play with regular expressions using 'perl -e', e.g.
perl -p -e 's/foo/bar/' testfile will run that substitute command on all lines in testfile and print every line. You can also try
perl -n -e '/foo/ && print' testfile, which is a
grep equivalent: only matching lines are printed (i.e. the print of the current line happens only if the first part succeeds).
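If Perl is not at hand, the same two experiments can be approximated elsewhere. A rough sketch in Python, assuming a made-up list of lines in place of testfile:

```python
import re

# Stand-in for the contents of testfile (hypothetical data).
lines = ["foo one", "two", "a foo and a foo"]

# Like: perl -p -e 's/foo/bar/'
# Replace the first "foo" on each line and keep every line.
substituted = [re.sub(r"foo", "bar", line, count=1) for line in lines]

# Like: perl -n -e '/foo/ && print'
# Keep only the lines where the pattern matches.
matching = [line for line in lines if re.search(r"foo", line)]

print(substituted)
print(matching)
```

Note that without a trailing g the substitution changes only the first match on each line, which is why count=1 is used here.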
It's best to build up from simple regular expressions, as in the various regex tutorials on the net - the ones in TWiki are quite complex, so are not the best place to start. I think writing and using regexes is a better way to learn them, rather than just reading and deconstructing them - the latter can be done as a second phase perhaps.
- Interesting that you find writing and using regexes is a better way to learn them -- I find just the opposite. I should point out that I don't actually code that often -- more often I need (or want) to read and understand code. I'm sure you can learn either way -- you might learn faster by writing them, but since I have less need to write them and more need to understand them, I spend more time doing that. (This applies not just to regexes, but other programming related things as well.) My learning comes from interpreting them, and thus I think that is the efficient way to learn. (I'm trying to say that the efficient way of learning depends on the person, his current knowledge, his current "goals" (as in trying to understand existing code vs. needing to write a new program), and probably other things that don't come to mind right now.) --rhk
--
RichardDonkin - 20 May 2002
Thanks to all for more clarification. Reading through all this brings me back to my original intent which is simply to be able to create (or modify) some formatted searches which is essential for even basic TWiki applications. The truth be told, I don't really want to invest the time to become proficient in regular expressions. I never even heard of them until using TWiki! No doubt they are powerful, wonderful things. One more powerful, wonderful thing I have scant little time to learn.
All this makes me wish, once again, that I had a
few well annotated, basic TWiki formatted search strings that I could modify to fit my needs. Something along the lines of: "Here's a search string that searches meta data for "topic classification." Change "TopicClass" to whatever classification you're looking for and you will get a bulleted list of topics and summaries." A few more nuts and bolts for my
TWikiToolChest.
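Along those lines, here is one rough sketch of what a fill-in-the-blanks search could look like. It is written in Python only as a way to show which parts of the FormattedSearch example change from one search to the next; the function names bracket_first and build_form_search are made up for this illustration and are not part of TWiki.

```python
def bracket_first(word):
    """Wrap the first letter in brackets, e.g. PublicFAQ -> [P]ublicFAQ."""
    return "[" + word[0] + "]" + word[1:]


def build_form_search(field, value, fmt):
    """Return a %SEARCH{...}% string modeled on the FormattedSearch example.

    field: name of the form field to look for
    value: the value that field must have
    fmt:   the format= string describing one row of output
    """
    # The backslashes escape the inner double quotes, as in the original example.
    pattern = bracket_first(field) + '.*?value=\\"' + bracket_first(value) + '\\"'
    return (
        '%SEARCH{ "' + pattern + '" scope="text" regex="on" '
        'nosearch="on" nototal="on" format="' + fmt + '" }%'
    )


search = build_form_search(
    "TopicClassification",
    "PublicFAQ",
    "| [[$topic]] | $formfield(OperatingSystem) | $formfield(OsVersion) |",
)
print(search)
```

With these arguments the output is the same search as the FormattedSearch example; changing the first two arguments swaps in a different field and value. The sketch does not handle a format string that itself contains double quotes.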
--
LynnwoodBrown - 21 May 2002
The actual reason for the
"[T]opicClassification.*?value=\"[P]ublicFAQ\"" regex is to exclude the topic containing the regex search from being included in the search result. If we simply wrote
"TopicClassification.*?value=\"PublicFAQ\"" we would get a hit for the topic containing the search, which is undesirable.
- It took me a minute (or longer) to grasp the point in the previous paragraph. The following is an attempt to make it easier to understand -- if you already understand the point, skip the next paragraph: The regular expression "[T]opic Classification" will match the same strings that the regular expression "Topic Classification" will match. However, the search phrase (regular expression) "[T]opic Classification", will not be found by either regular expression -- thus the page containing the search phrase "[T]opic Classification", will not be returned as one of the search results. If you wanted to find the page with the search phrase "[T]opic Classification", you'd have to write a regular expression that did something like "escaped" the "[" and "]", perhaps something like this: "\[T\]opic Classification". Aside: The previous explanation is one of the types of things I'd like to provide on WikiLearn -- and preferably by moving the previous explanation to another page and leaving a note on this page something like: If you don't immediately understand the previous paragraph, try Wikilearn.TopicClassificationRegularExpressionExplained. --(rhk) RandyKramer - 21 May 2002
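The same point can be checked mechanically. A small sketch, again in Python rather than Perl for illustration only; both sample strings are made up, and the field name is written without the space:

```python
import re

with_brackets = re.compile(r"[T]opicClassification")
without_brackets = re.compile(r"TopicClassification")
escaped = re.compile(r"\[T\]opicClassification")

# Hypothetical text of a topic that carries the form data.
form_topic = 'name="TopicClassification" value="PublicFAQ"'
# Hypothetical text of the topic that contains the search itself.
search_topic = r'%SEARCH{ "[T]opicClassification.*?value=\"[P]ublicFAQ\"" regex="on" }%'

# Both patterns find the topic with the form data.
found_in_form = [bool(p.search(form_topic)) for p in (with_brackets, without_brackets)]

# Neither finds the topic holding the search, because there "T" is followed by "]".
found_in_search = [bool(p.search(search_topic)) for p in (with_brackets, without_brackets)]

# Only the pattern with escaped brackets finds the search topic.
found_escaped = bool(escaped.search(search_topic))

print(found_in_form, found_in_search, found_escaped)
```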
--
PeterThoeny - 21 May 2002
Hmmm... I'm wondering why the comments I posted here this morning were edited out? Too off-topic? Editorial prerogative? Mistake? I thought maybe I had forgotten to save them, but I see under Diffs that they were deleted.
--
LynnwoodBrown - 21 May 2002
Your edit got replaced by a followup edit. This happens if someone else breaks your edit lock. It also happens if the first person saves changes with a "release edit lock", then goes back in the browser to make more changes, and between these two events someone else edits the topic. Anyway, I restored your edit.
On another note, there is no real convention to add comments into another edit. I prefer signed bullets, possibly in italic. Reformatted above edits.
--
PeterThoeny - 22 May 2002
I sincerely apologize. Sometimes I start editing an older copy of a page (for example, left on my browser from the night before) and I just used the back button on my browser to go back to the edit page from the night before, then edited. I try not to do that, but since going back to the edit page that way is a common thing for me to do, I sometimes do it long after the edit lock has expired.
I hope this will serve as a reminder to me to be more careful.
--
RandyKramer - 22 May 2002
No big deal. After I posted my last comment, I realized that I was taking it a bit personally anyway - which was pretty silly. If I felt at all strongly about my comment, I could always re-insert it myself. Getting used to wiki-culture, I suppose... In any case, I learned something about TWiki foibles - which was useful. Thanks, Peter, for clarifying and straightening out the edits. And thanks again to everyone for helping me better understand searches!
--
LynnwoodBrown - 22 May 2002