There is a need for better meta data handling by Plugins as discussed in PluginApiForTopicHandling and other places. Also for TWiki internal use we should SimplifyInternalMetaDataHandling.

Here is proposed sample code showing how easily Plugins could manipulate topic text with meta data:

    # read topic text, then separate text and meta data
    my $text = readTopicText( $web, $topic );
    my $meta = metaExtractFromText( $text );

    # manipulate topic text
    $text =~ s/old/new/g;

    # manipulate meta data
    my $value = metaGet( $meta, "FIELD", "TopicClassification" );
    $value = "NewValue" if( $condition );
    metaSet( $meta, "FIELD", "TopicClassification", $value );

    # merge text and meta data, then save topic text
    $text = metaMergeWithText( $text, $meta );
    TWiki::Func::saveTopicText( $web, $topic, $text );

The API needs to be refined to also handle meta data with multiple items of the same type, like META:FILEATTACHMENT.

This interface is lightweight, easy to use, and can be implemented with little effort :-)

-- PeterThoeny - 28 Mar 2004
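
As an illustration of the proposal above, here is a minimal sketch of how the four helpers might behave. The function names come from the sample code; the bodies, the internal `{ TYPE => [ items ] }` layout, and the assumption that meta data sits in the topic text as `%META:TYPE{key="value" ...}%` lines are guesses made for this example, not an existing implementation. Keeping a list of items per type is one way to cope with multiple items of the same type such as META:FILEATTACHMENT.

```perl
use strict;
use warnings;

# Hypothetical helpers: names follow the proposal, bodies are assumptions.
# Meta data is held as { TYPE => [ { key => value, ... }, ... ] }.

# Remove meta lines from the caller's $text (in place) and return the meta.
# The argument must be a variable, not a literal, because it is modified.
sub metaExtractFromText {
    my ($text) = @_;
    my ( %meta, @body );
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^%META:([A-Z]+)\{(.*)\}%$/ ) {
            my ( $type, $args ) = ( $1, $2 );
            my %item;
            while ( $args =~ /(\w+)="([^"]*)"/g ) {
                $item{$1} = $2;
            }
            push @{ $meta{$type} }, \%item;
        }
        else {
            push @body, $line;
        }
    }
    $_[0] = join "\n", @body;
    return \%meta;
}

# Return the value of the item of the given type whose name matches.
sub metaGet {
    my ( $meta, $type, $name ) = @_;
    for my $item ( @{ $meta->{$type} || [] } ) {
        return $item->{value} if ( $item->{name} // '' ) eq $name;
    }
    return;
}

# Update the matching item, or append a new one if none matches.
sub metaSet {
    my ( $meta, $type, $name, $value ) = @_;
    for my $item ( @{ $meta->{$type} || [] } ) {
        if ( ( $item->{name} // '' ) eq $name ) {
            $item->{value} = $value;
            return;
        }
    }
    push @{ $meta->{$type} }, { name => $name, value => $value };
    return;
}

# Append the meta lines (sorted for stable output) to the plain text.
sub metaMergeWithText {
    my ( $text, $meta ) = @_;
    my @lines;
    for my $type ( sort keys %$meta ) {
        for my $item ( @{ $meta->{$type} } ) {
            my $args = join ' ',
              map { sprintf '%s="%s"', $_, $item->{$_} } sort keys %$item;
            push @lines, sprintf '%%META:%s{%s}%%', $type, $args;
        }
    }
    return join( "\n", $text, @lines ) . "\n";
}

my $text = join "\n",
  'Some old text',
  '%META:FIELD{name="TopicClassification" value="OldValue"}%',
  '%META:FILEATTACHMENT{name="a.txt" size="30"}%',
  '%META:FILEATTACHMENT{name="b.txt" size="12"}%';

my $meta = metaExtractFromText($text);
$text =~ s/old/new/g;
metaSet( $meta, 'FIELD', 'TopicClassification', 'NewValue' );
$text = metaMergeWithText( $text, $meta );
print $text;
```

The sketch ignores edge cases such as meta-like lines inside verbatim blocks and quotes inside values.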

Given that we are sticking with a functional interface instead of using OO, how about keeping meta and text separate?

    my $text = TWiki::Func::readTopicText( $web, $topic );
    TWiki::Func::saveTopicText( $web, $topic, $text );

    my $value = TWiki::Func::readTopicMeta( $web, $topic, $key );
    TWiki::Func::writeTopicMeta($web, $topic, $key, $value);

-- MartinCleaver - 28 Mar 2004

I'd rather this than be forced to know that META is interleaved with text.

Peter, in PluginApiForTopicHandling you wrote "expose metadata to plugins. Shudder :eek:". I don't understand this remark. Exposing metadata to plugins is exactly what you have been doing, by forcing the plugin author to know it is stored interleaved in the topic text. Or am I missing something here?

-- CrawfordCurrie - 28 Mar 2004

And if you want to be quick and efficient:

    my $value = TWiki::Func::readTopicMeta( $web, $topic, $key );
    TWiki::Func::writeTopicMeta($web, $topic, $key, $value);

    my %values = TWiki::Func::readTopicMetaHash( $web, $topic);
    TWiki::Func::writeTopicMetaHash($web, $topic, %values);

-- MartinCleaver - 28 Mar 2004
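
To make the separation concrete, the calls above could be sketched roughly as follows. Everything here is hypothetical: the functions are plain subs rather than real TWiki::Func entries, and `%store` is an in-memory stand-in for whatever the Store code would actually use. The point is only that the caller never sees how text and meta data are laid out.

```perl
use strict;
use warnings;

# Hypothetical in-memory store; keys are "Web.Topic".
my %store;

sub readTopicText {
    my ( $web, $topic ) = @_;
    return $store{"$web.$topic"}{text};
}

sub saveTopicText {
    my ( $web, $topic, $text ) = @_;
    $store{"$web.$topic"}{text} = $text;
    return;
}

sub readTopicMeta {
    my ( $web, $topic, $key ) = @_;
    return $store{"$web.$topic"}{meta}{$key};
}

sub writeTopicMeta {
    my ( $web, $topic, $key, $value ) = @_;
    $store{"$web.$topic"}{meta}{$key} = $value;
    return;
}

# Read or replace all meta values of a topic in one call.
sub readTopicMetaHash {
    my ( $web, $topic ) = @_;
    return %{ $store{"$web.$topic"}{meta} || {} };
}

sub writeTopicMetaHash {
    my ( $web, $topic, %values ) = @_;
    $store{"$web.$topic"}{meta} = {%values};
    return;
}

saveTopicText( 'Sandbox', 'TestTopic', "Plain text, no META lines\n" );
writeTopicMeta( 'Sandbox', 'TestTopic', 'TopicClassification', 'NewValue' );

my %values = readTopicMetaHash( 'Sandbox', 'TestTopic' );
$values{Parent} = 'WebHome';
writeTopicMetaHash( 'Sandbox', 'TestTopic', %values );

print readTopicMeta( 'Sandbox', 'TestTopic', 'Parent' ), "\n";
```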

My turn :-)

If we continue to embed the meta data into the text in the programmatic interface (both internally and for plugins), we force the Store code to either

  1. always be one flat file per topic (with embedded meta data only), OR
  2. combine the separated data (from some non-linear data store).

Both of these are significantly more limiting than the much more generic interface shown by Martin.

What if we consider the topic text to be metadata named TEXT? Then we don't need readTopicText at all.

-- SvenDowideit - 28 Mar 2004

Sven's last comment is perhaps going a little bit too far at this stage. However in all other practical respects he is right.

-- CC

I agree with Sven. I see no reason why we can't simplify the API of the TWiki::Meta object and get behind that as the "plugin API for metadata handling." Bring back readTopic and add an orthogonal saveTopic and we'd be set.

-- WalterMundt - 28 Mar 2004 - 10:41

Martin is right. The LOGICAL topic contains the four categories:

  • The Textual body that is displayed
  • The Metadata information
  • The Access Control information
  • Any topic-specific settings

Currently, TWiki has all four parts stored in the same file. This leads to some logical inconsistencies, for example having a META statement in a verbatim block. In addition, the code that extracts metadata is not a true parser. See TWiki:Codev/AccidentalMetaDeclInTopicTextIsNasty and other places.

I've seen a Wiki based on YAML where a variation on Sven's idea was implemented. It was a single file, yes, so RCS applied easily. But YAML - see http://www.yaml.org as well as late-model Ruby and Perl - is a serialization tool.

Imagine a Perl struct (or an object if you play closure games) that had fields for the metadata, the settings, the access permission and the topic data. The metadata might look something like:

    'METADATA' => {
        'MOVED' => {
          'to' => 'TWiki.ManagingTopics',
          'from' => 'TWiki.RenameTopic',
          'date' => '999329908',
          'by' => 'MikeMannix'
        },
        'ATTACHMENT' => {
          'comment' => 'Just a sample',
          'version' => '',
          'date' => '964294620',
          'user' => 'thoeny',
          'name' => 'Sample.txt',
          'path' => 'C:\\DATA\\Sample.txt',
          'attr' => '',
          'size' => '30'
        },
        'PARENT' => [
          'WebHome',
          'Main.WebHome'
        ],
    },

This is purely illustrative.

The point is that the YAML code can slurp this up and return any node or branch. So there is a getTopic() call that returns the topicObject.

The point is that with serialization it doesn't matter how that object is represented internally. YAML is just a convention that means the serialization is text and human readable. We could use Data::Dumper instead.

So this isn't so much an API for handling metadata as it is that the topic is an object.

I've proposed elsewhere the idea of moving metadata into a DBHash, and moving access control into another DBHash. We can make those four components I mentioned above separate files, but the getTopicObject() call would hide the details.

This is important in the long term. With this kind of interface we are saying that the storage is hidden completely. At the moment the file code is a bit littered. But we can extend the model so that each web could be stored differently. This has implications for corporate designs and implementations. %MAINWEB% can be at corporate HQ and accessed over the network via FTP, for example. Here's a YAML description of web ...

webs:
    # Only set here things that don't need to be set in the per web preferences
    # 
    Main:
      title: "Sys"         # rename main to sys
      searchable: true     # this is the default   can be overridden in Web Preferences
      skin: tiger          # default skin for this web can be overridden in Web Preferences
      autolink: true       # can be overridden in Web Preferences
      storage: Compatible  # Short form

    Wiki:
      title: "Config" # more user friendly
      storage: 
         Topic:    Filesys text
         Revision: Filesys RCS    # Could be CVS
         Meta:     Filesys YAML
         Access:   Filesys hash   # no rev history
         Prefs:    Filesys hash   # no rev history

    Know: 
      # title defaults if not specified
      storage:
         # corporate central
         Topic:   Http URI(.../topic) parameters
         Meta:    Http URI(.../meta) parameters
         Access:  Http URI(.../access) parameters
         Prefs:   Http URI(.../prefs) parameters

    Sandbox:
      storage:
         Topic:   MySQL databasename parameters
         Meta:    MySQL databasename parameters
         Access:  Filesys hash   # no rev history
         Prefs:   Filesys hash   # no rev history

    Junk:
      storage:
         Topic:   Filesys history=null   # throw away revisions
         Meta:    Filesys history=null   # throw away attachment revisions 
         Access:  Null                   # returns "OK" gracefully
         Prefs:   Filesys hash   # no rev history

Why am I showing YAML rather than putting this information in a Topic and managing it that way?

That's not the point. YAML is about serialization of internal data structures. The YAML is human readable, so I'm using it here to represent what the data is and how it is organized. How it is actually represented ... heck, who knows. I could equally well use YAML to describe a template ... So it's like Zen and the finger pointing at the moon. Don't obsess about the finger; it's the act of pointing that counts.

It's a case of "this data structure describes the web and how to access it and what the characteristics and capabilities of the storage are".

Walter is right; Sven is right. The YAML is just showing a representation. A representation of the internal form in a human readable format. (Perhaps think of it as a debugging tool, eh?) The API hides not only the structure of the internal form but how it is "serialized" by saveTopic() and "un-serialized" by readTopic(). Now the

    my $text = readTopicText( $web, $topic );
    my $meta = metaExtractFromText( $text );

becomes a non-issue, a wrong way of approaching things. The metadata and the text are in the topicObject, but you don't need to know how they are represented or stored. Hand the topicObject around, by reference of course.

And the perl YAML code makes all of this easy.

Brian Ingerson's Perl implementation is available at yaml.freepan.org. It includes a command line tool you can experiment with. I have. Sadly, work pressure means I can't continue with this line of experimentation. *sigh*

-- AntonAylward - 30 Sep 2004
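
The serialization idea above can be tried with nothing but core Perl. The following is a minimal sketch that uses Data::Dumper rather than YAML; the layout of the topic object (the TEXT, METADATA, ACCESS and SETTINGS fields) and the helper names serializeTopic and unserializeTopic are invented for this example. It only shows that code handed the object never needs to know the serialized form.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical topic object; field names are illustrative only.
my $topicObject = {
    TEXT     => "Body text shown to the reader\n",
    METADATA => {
        PARENT => ['WebHome'],
        MOVED  => {
            from => 'TWiki.RenameTopic',
            to   => 'TWiki.ManagingTopics',
        },
    },
    ACCESS   => { ALLOWTOPICCHANGE => ['Main.TWikiAdminGroup'] },
    SETTINGS => { SKIN => 'tiger' },
};

# Turn the object into human readable text with a stable key order.
sub serializeTopic {
    my ($obj) = @_;
    local $Data::Dumper::Terse    = 1;
    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Sortkeys = 1;
    return Dumper($obj);
}

# Rebuild the object from its text form.
# String eval runs arbitrary code, so this is only for trusted data.
sub unserializeTopic {
    my ($string) = @_;
    my $obj = eval "my \$tmp = $string; \$tmp";
    die "cannot restore topic: $@" unless defined $obj;
    return $obj;
}

my $saved    = serializeTopic($topicObject);
my $restored = unserializeTopic($saved);

print $saved;
print $restored->{METADATA}{MOVED}{to}, "\n";
```

A YAML module could replace Data::Dumper here without callers noticing, which is the point being made about hiding the storage format.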

Dodging the cruft, the specific proposal here is to export the TWiki::Meta class definition to plugins. This involves specific steps:

  1. Documentation of the Meta class
  2. Un-deprecation of the readTopic method in the plugins API
  3. Adding a saveTopic method to the plugins API to the following spec:

saveTopic( $web, $topic, $meta, $text, $options )

  • $web - web for the topic
  • $topic - topic name
  • $meta - reference to TWiki::Meta object
  • $text - text of the topic (without embedded meta-data!)
  • $options - hash of save options

$options may include:

  • dontlog - don't log this change in twiki log
  • comment - comment for save
  • minor - true if this is a minor change, and is not to be notified

-- CrawfordCurrie - 04 Mar 2005
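
For readers who want to see the shape of the call, here is a rough sketch of the saveTopic spec above together with a matching readTopic. It is not the code committed to TWiki: `%store` and `@log` are hypothetical in-memory stand-ins for the store and the twiki log, `$meta` is a plain hash reference rather than a real TWiki::Meta object, and the log entry format is made up.

```perl
use strict;
use warnings;

my %store;    # hypothetical stand-in for the real store
my @log;      # hypothetical stand-in for the twiki log

# Save meta and text separately; $options is an optional hash reference.
sub saveTopic {
    my ( $web, $topic, $meta, $text, $options ) = @_;
    $options ||= {};
    $store{"$web.$topic"} = { meta => $meta, text => $text };
    unless ( $options->{dontlog} ) {
        my $entry = "save $web.$topic";
        $entry .= ' minor' if $options->{minor};
        $entry .= " ($options->{comment})" if defined $options->{comment};
        push @log, $entry;
    }
    return;
}

# Return ( $meta, $text ), or an empty list if the topic is unknown.
sub readTopic {
    my ( $web, $topic ) = @_;
    my $rec = $store{"$web.$topic"} or return;
    return ( $rec->{meta}, $rec->{text} );
}

saveTopic(
    'Sandbox', 'TestTopic',
    { FIELD => [ { name => 'TopicClassification', value => 'NewValue' } ] },
    "Plain text without embedded meta-data\n",
    { comment => 'fix typo', minor => 1 },
);
saveTopic( 'Sandbox', 'QuietTopic', {}, "Not logged\n", { dontlog => 1 } );

my ( $meta, $text ) = readTopic( 'Sandbox', 'TestTopic' );
print $text;
print "$_\n" for @log;
```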

Done in r3738

-- CrawfordCurrie - 05 Mar 2005

Questions:

  • Don't we constrain ourselves for a more flexible interface in the future by exposing TWiki::Meta?
  • Plugins are supposed to use only the TWiki::Func API. Now we expect them to break this rule by using TWiki::Meta?

-- PeterThoeny - 05 Mar 2005

Perhaps if you shared what your vision is for a more flexible API I could comment. From where we stand today, the current interface:

  1. severely restricts the flexibility of the core implementation (by requiring META to be embedded in text),
  2. adds unnecessary processing overhead for plugin authors (parsing META),
  3. forces code duplication (a plugin author seriously manipulating meta-data has to duplicate TWiki::Meta - see DBCachePlugin),
  4. is very inefficient, as meta has to be re-inserted before calling handlers and removed again in the core.

Many plugins are simple enough not to have to access core functionality, but just about everything that tries to manipulate meta-data is seriously hampered. I'm not proposing breaking the rules. I'm proposing changing the rules, and adding TWiki::Meta to the API available to plugin authors, something that should have been done from day 1.

-- CrawfordCurrie - 06 Mar 2005

I agree with Crawford. More often than not I have had to use "illegal" calls from Core to achieve the desired plugin functionality. If you examine the PluginsConformanceReport topic, you will find that many other plugins do as well.

-- ThomasWeigert - 06 Mar 2005

Topic revision: r18 - 2006-07-22 - WillNorris