There is a need for better meta data handling by Plugins as discussed in
PluginApiForTopicHandling and other places. Also for TWiki internal use we should
SimplifyInternalMetaDataHandling.
Here is proposed sample code showing how easily Plugins could manipulate topic text with meta data:
# read topic text, then separate text and meta data
my $text = readTopicText( $web, $topic );
my $meta = metaExtractFromText( $text );
# manipulate topic text
$text =~ s/old/new/g;
# manipulate meta data
my $value = metaGet( $meta, "FIELD", "TopicClassification" );
$value = "NewValue" if( $condition );
metaSet( $meta, "FIELD", "TopicClassification", $value );
# merge text and meta data, then save topic text
$text = metaMergeWithText( $text, $meta );
TWiki::Func::saveTopicText( $web, $topic, $text );
The API needs to be refined to also handle meta data with multiple items of the same type, like META:FILEATTACHMENT.
This interface is lightweight, easy to use, and can be implemented with little effort.
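To make the proposal concrete, here is a minimal sketch of how the four helpers might behave. It is an illustration only, not the TWiki implementation: it assumes meta data is stored as lines of the form %META:TYPE{key="value" ...}%, that FIELD items are identified by a "name" attribute, and that each type maps to a list of items (which would also cover repeated types such as FILEATTACHMENT).

```perl
use strict;
use warnings;

# Hypothetical sketch; the function names come from the proposal above,
# the data layout (type => list of attribute hashes) is an assumption.

# Collect all %META:TYPE{...}% lines from the text into a hash reference.
sub metaExtractFromText {
    my ($text) = @_;
    my %meta;
    while ( $text =~ /^%META:([A-Z]+)\{(.*?)\}%$/mg ) {
        my ( $type, $args ) = ( $1, $2 );
        my %attrs = $args =~ /(\w+)="([^"]*)"/g;
        push @{ $meta{$type} }, \%attrs;
    }
    return \%meta;
}

# Return the value of the item of the given type with the given name.
sub metaGet {
    my ( $meta, $type, $name ) = @_;
    for my $item ( @{ $meta->{$type} || [] } ) {
        return $item->{value}
          if defined $item->{name} && $item->{name} eq $name;
    }
    return;
}

# Update the named item, or add it if it does not exist yet.
sub metaSet {
    my ( $meta, $type, $name, $value ) = @_;
    for my $item ( @{ $meta->{$type} || [] } ) {
        if ( defined $item->{name} && $item->{name} eq $name ) {
            $item->{value} = $value;
            return;
        }
    }
    push @{ $meta->{$type} }, { name => $name, value => $value };
    return;
}

# Drop any old meta lines from the text and append the current meta data.
sub metaMergeWithText {
    my ( $text, $meta ) = @_;
    $text =~ s/^%META:[A-Z]+\{.*?\}%\n?//mg;
    $text .= "\n" unless $text =~ /\n\z/;
    for my $type ( sort keys %$meta ) {
        for my $item ( @{ $meta->{$type} } ) {
            my $args = join ' ',
              map { $_ . '="' . $item->{$_} . '"' } sort keys %$item;
            $text .= '%META:' . $type . '{' . $args . "}%\n";
        }
    }
    return $text;
}
```

With helpers along these lines, the sample above would run unchanged apart from the TWiki::Func read and save calls, which are left out of the sketch.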
--
PeterThoeny - 28 Mar 2004
Given that we are sticking with a functional interface instead of using OO, how about keeping meta and text separate?
my $text = TWiki::Func::readTopicText( $web, $topic );
TWiki::Func::saveTopicText( $web, $topic, $text );
my $value = TWiki::Func::readTopicMeta( $web, $topic, $key );
TWiki::Func::writeTopicMeta($web, $topic, $key, $value);
--
MartinCleaver - 28 Mar 2004
I'd rather have this than be forced to know that META is interleaved with text.
Peter, in PluginApiForTopicHandling you wrote "expose metadata to plugins. Shudder :eek:". I don't understand this remark. Exposing metadata to plugins is exactly what you have been doing, by forcing the plugin author to know it is stored interleaved in the topic text. Or am I missing something here?
--
CrawfordCurrie - 28 Mar 2004
And if you want to be quick and efficient:
my $value = TWiki::Func::readTopicMeta( $web, $topic, $key );
TWiki::Func::writeTopicMeta($web, $topic, $key, $value);
my %values = TWiki::Func::readTopicMetaHash( $web, $topic);
TWiki::Func::writeTopicMetaHash($web, $topic, %values);
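As a rough illustration of what sits behind these four calls, the sketch below implements them against an in-memory hash. The function names follow the calls above (without the TWiki::Func prefix); the %STORE variable and the "$web.$topic" key format are assumptions made for the example, and a real version would talk to the store instead.

```perl
use strict;
use warnings;

# Hypothetical in-memory backend standing in for the real store.
# Keys are "$web.$topic", values are hashes of meta key => value.
my %STORE;

# Read a single meta value for a topic.
sub readTopicMeta {
    my ( $web, $topic, $key ) = @_;
    return $STORE{"$web.$topic"}{$key};
}

# Write a single meta value for a topic.
sub writeTopicMeta {
    my ( $web, $topic, $key, $value ) = @_;
    $STORE{"$web.$topic"}{$key} = $value;
    return;
}

# Read all meta values for a topic as a flat key/value list.
sub readTopicMetaHash {
    my ( $web, $topic ) = @_;
    return %{ $STORE{"$web.$topic"} || {} };
}

# Replace all meta values for a topic in one call.
sub writeTopicMetaHash {
    my ( $web, $topic, %values ) = @_;
    $STORE{"$web.$topic"} = {%values};
    return;
}
```

The caller never sees where or how the values are kept, which is the point of separating meta from text in the interface.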
--
MartinCleaver - 28 Mar 2004
My turn.
If we continue to embed the meta data into the text in the programmatic interface (both internally and for plugins), we force the Store code to either
- always be one flat file per topic (with embedded meta data only) OR
- combine the separated data (from some non-linear data store)
both of which are significantly more limiting than a much more generic interface as shown by Martin.
What if we consider the topic text to be metadata named TEXT? Then we don't need readTopicText at all.
--
SvenDowideit - 28 Mar 2004
Sven's last comment is perhaps going a little bit too far at this stage. However, in all other practical respects he is right.
--
CC
I agree with Sven. I see no reason why we can't simplify the API of the TWiki::Meta object and get behind that as the "plugin API for metadata handling." Bring back readTopic and add an orthogonal saveTopic and we'd be set.
--
WalterMundt - 28 Mar 2004 - 10:41
Martin is right. The LOGICAL topic contains four categories:
- The Textual body that is displayed
- The Metadata information
- The Access Control information
- Any topic-specific settings
Currently, TWiki has all four parts stored in the same file. Apart from leading to some logical inconsistencies (for example, having a META statement in a verbatim block), the code that extracts metadata is not a true parser. See TWiki:Codev/AccidentalMetaDeclInTopicTextIsNasty and other places.
I've seen a Wiki based on YAML where a variation on Sven's idea was implemented. It was a single file, yes, so RCS applied easily. But YAML - see http://www.yaml.org as well as late-model Ruby and Perl - is a serialization tool.
Imagine a Perl struct (or an object if you play closure games) that had fields for the metadata, the settings, the access permission and the topic data. The metadata might look something like:
    'METADATA' => {
        'MOVED' => {
            'to'   => 'TWiki.ManagingTopics',
            'from' => 'TWiki.RenameTopic',
            'date' => '999329908',
            'by'   => 'MikeMannix'
        },
        'ATTACHMENT' => {
            'comment' => 'Just a sample',
            'version' => '',
            'date'    => '964294620',
            'user'    => 'thoeny',
            'name'    => 'Sample.txt',
            'path'    => 'C:\\DATA\\Sample.txt',
            'attr'    => '',
            'size'    => '30'
        },
        'PARENT' => [
            'WebHome',
            'Main.WebHome'
        ],
    },
This is purely illustrative.
The point is that the YAML code can slurp this up and return any node or branch. So there is a getTopic() call that returns the topicObject.
The point is that with serialization it doesn't matter how that object is represented internally. YAML is just a convention that means the serialization is text and human readable. We could use Data::Dumper instead.
So this isn't so much an API for handling metadata as it is that the topic is an object.
I've proposed elsewhere the idea of moving metadata into a DBHash, moving access control into another DBHash. We can make those four components I mentioned above separate files, but the getTopicobject() would hide the details.
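A small sketch may help show what "the topic is an object" could mean in practice. It uses the core Data::Dumper module rather than YAML, and the field names in the usage example (TEXT, METADATA, ACCESS, SETTINGS) and the helper names serializeTopic and unserializeTopic are invented for illustration; nothing here is existing TWiki code.

```perl
use strict;
use warnings;
use Data::Dumper;

# Turn a topic object (a plain hash reference here) into text.
sub serializeTopic {
    my ($topicObject) = @_;
    local $Data::Dumper::Indent   = 1;    # compact, still human readable
    local $Data::Dumper::Sortkeys = 1;    # stable output, friendlier to RCS diffs
    return Dumper($topicObject);
}

# Rebuild the topic object from its serialized text.
# Uses string eval, so it is only acceptable for trusted input.
sub unserializeTopic {
    my ($string) = @_;
    my $VAR1;        # Dumper output assigns to $VAR1
    eval $string;
    die "cannot restore topic: $@" if $@;
    return $VAR1;
}
```

A caller would then work only with the structure and never with the file layout:

```perl
my $topicObject = {
    TEXT     => "Body of the topic\n",
    METADATA => { PARENT => [ 'WebHome', 'Main.WebHome' ] },
    ACCESS   => { ALLOWTOPICCHANGE => ['Main.SomeGroup'] },
    SETTINGS => { SKIN => 'tiger' },
};
my $copy = unserializeTopic( serializeTopic($topicObject) );
print $copy->{METADATA}{PARENT}[0], "\n";
```

Whether the four parts end up in one file, four files, or a DBHash is then a decision local to the two functions.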
This is important in the long term. With this kind of interface we are saying that the storage is hidden completely. At the moment the file code is a bit littered. But we can extend the model so that each web could be stored differently. This has implications for corporate designs and implementations. %MAINWEB% can be at corporate HQ and accessed over the network via FTP, for example. Here's a YAML description of web ...
webs:
  # Only set here things that don't need to be set in the per web preferences
  #
  Main:
    title: "Sys"            # rename main to sys
    searchable: true        # this is the default, can be overridden in Web Preferences
    skin: tiger             # default skin for this web, can be overridden in Web Preferences
    autolink: true          # can be overridden in Web Preferences
    storage: Compatible     # Short form
  Wiki:
    title: "Config"         # more user friendly
    storage:
      Topic: Filesys text
      Revision: Filesys RCS # Could be CVS
      Meta: Filesys YAML
      Access: Filesys hash  # no rev history
      Prefs: Filesys hash   # no rev history
  Know:
    # title defaults if not specified
    storage:
      # corporate central
      Topic: Http URI(.../topic) parameters
      Meta: Http URI(.../meta) parameters
      Access: Http URI(.../access) parameters
      Prefs: Http URI(.../prefs) parameters
  Sandbox:
    storage:
      Topic: MySQL databasename parameters
      Meta: MySQL databasename parameters
      Access: Filesys hash  # no rev history
      Prefs: Filesys hash   # no rev history
  Junk:
    storage:
      Topic: Filesys history=null  # throw away revisions
      Meta: Filesys history=null   # throw away attachment revisions
      Access: Null                 # returns "OK" gracefully
      Prefs: Filesys hash          # no rev history
Why am I showing YAML rather than putting this information in a Topic and managing it that way?
That's not the point. YAML is about serialization of internal data structures. The YAML is human readable, so I'm using it here to represent what the data is and how it is organized. How it is actually represented ... heck, who knows. I could equally well use YAML to describe a template ... So it's like Zen and the finger pointing at the moon. Don't obsess about the finger, it's the act of pointing that counts.
It's a case of "this data structure describes the web and how to access it and what the characteristics and capabilities of the storage are".
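One way to read the per-web storage description is as input to a dispatch table. The sketch below is purely hypothetical: the %WEBS hash mirrors a small part of the YAML above, while the %BACKEND handlers and the fetchPart function are invented names whose handlers only return a description of what a real backend would do.

```perl
use strict;
use warnings;

# Per-web storage description, mirroring part of the YAML above.
my %WEBS = (
    Wiki    => { storage => { Topic => 'Filesys text', Meta => 'Filesys YAML' } },
    Sandbox => { storage => { Topic => 'MySQL databasename parameters' } },
);

# Hypothetical backends, keyed by the first word of the storage spec.
# Each one only describes the action a real backend would perform.
my %BACKEND = (
    Filesys => sub {
        my ( $web, $topic, @args ) = @_;
        return "read $web/$topic from disk (@args)";
    },
    MySQL => sub {
        my ( $web, $topic, @args ) = @_;
        return "select $web.$topic from database (@args)";
    },
);

# Fetch one part (Topic, Meta, ...) of a topic via the web's configured backend.
sub fetchPart {
    my ( $web, $topic, $part ) = @_;
    my $spec = $WEBS{$web}{storage}{$part}
      or die "no $part storage configured for web $web";
    my ( $kind, @args ) = split ' ', $spec;
    my $handler = $BACKEND{$kind}
      or die "unknown storage kind $kind";
    return $handler->( $web, $topic, @args );
}
```

Code above this layer asks for a part of a topic and never learns whether the web lives on disk, in a database, or at corporate HQ.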
Walter is right; Sven is right. The YAML is just showing a representation: a representation of the internal form in a human readable format. (Perhaps think of it as a debugging tool, eh?) The API hides not only the structure of the internal form but how it is "serialized" by saveTopic() and "un-serialized" by readTopic(). Now the
my $text = readTopicText( $web, $topic );
my $meta = metaExtractFromText( $text );
becomes a non-issue, a wrong way of approaching things. The metadata and the text are in the topicObject - but you don't need to know how they are represented or stored. Hand the topicObject around, by reference of course.
And the Perl YAML code makes all of this easy.
Brian Ingerson's Perl implementation is available at yaml.freepan.org. It includes a command line tool you can experiment with. I have. Sadly, work pressure means I can't continue with this line of experimentation.
sigh
--
AntonAylward - 30 Sep 2004
Dodging the cruft, the specific proposal here is to export the TWiki::Meta class definition to plugins. This involves specific steps:
- Documentation of the Meta class
- Un-deprecation of the readTopic method in the plugins API
- Adding a saveTopic method to the plugins API to the following spec:
saveTopic( $web, $topic, $meta, $text, $options )
  - $web - web for the topic
  - $topic - topic name
  - $meta - reference to TWiki::Meta object
  - $text - text of the topic (without embedded meta-data!!!)
  - $options - hash of save options
$options may include:
| dontlog | don't log this change in twiki log |
| comment | comment for save |
| minor | True if this is a minor change, and is not to be notified |
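For illustration, here is a stand-in with the same argument list as the spec above. It is not the code that went into the plugins API: it only checks its arguments and returns what it was given, the $meta argument is a plain hash reference standing in for a TWiki::Meta object, and the checks themselves (rejecting unknown options and embedded %META: lines) are assumptions about what a careful implementation might do.

```perl
use strict;
use warnings;

# Option names taken from the table above.
my %KNOWN_OPTION = map { $_ => 1 } qw(dontlog comment minor);

# Stand-in for the proposed saveTopic; a real one would write to the store.
sub saveTopic {
    my ( $web, $topic, $meta, $text, $options ) = @_;
    $options ||= {};
    for my $name ( keys %$options ) {
        die "unknown save option '$name'" unless $KNOWN_OPTION{$name};
    }
    # The spec says $text comes without embedded meta-data.
    die "text must not contain embedded meta-data" if $text =~ /^%META:/m;
    return {
        web     => $web,
        topic   => $topic,
        meta    => $meta,
        text    => $text,
        options => {%$options},
    };
}
```

A plugin would then call it along these lines:

```perl
my $saved = saveTopic(
    'Main', 'WebHome',
    { FIELD => [ { name => 'TopicClassification', value => 'NewValue' } ] },
    "Some new text\n",
    { minor => 1, comment => 'reclassified' },
);
```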
--
CrawfordCurrie - 04 Mar 2005
Done in r3738
--
CrawfordCurrie - 05 Mar 2005
Questions:
- Don't we constrain ourselves for a more flexible interface in the future by exposing TWiki::Meta?
- Plugins are supposed to use only the TWiki::Func API. Now we expect them to break this rule by using TWiki::Meta?
--
PeterThoeny - 05 Mar 2005
Perhaps if you shared what your vision is for a more flexible API I could comment. From where we stand today, the current interface:
- severely restricts the flexibility of the core implementation (by requiring META to be embedded in text),
- adds unnecessary processing overhead for plugin authors (parsing META),
- forces code duplication (a plugin author seriously manipulating meta-data has to duplicate TWiki::Meta - see DBCachePlugin),
- is very inefficient, as meta has to be re-inserted before calling handlers and removed again in the core.
Many plugins are simple enough not to have to access core functionality, but just about everything that tries to manipulate meta-data is seriously hampered. I'm not proposing breaking the rules. I'm proposing changing the rules, and adding TWiki::Meta to the API available to plugin authors, something that should have been done from day 1.
--
CrawfordCurrie - 06 Mar 2005
I agree with Crawford. More often than not I have had to use "illegal" calls from Core to achieve the desired plugin functionality. If you examine the PluginsConformanceReport topic, you will find that many other plugins do as well.
--
ThomasWeigert - 06 Mar 2005