It would be nice to automatically convert documents in FileAttachments from their native formats to HTML. This could be done at the time of uploading the file; or in a lazy way - at the time of accessing the HTML equivalent for the first time. The HTML format has the added advantage that search could include file attachments as well.
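
The lazy variant could be sketched roughly as follows, in Python for illustration. The names `html_equivalent`, `cache_dir` and `convert` are hypothetical and not part of TWiki; `convert` stands in for whatever external conversion tool is eventually chosen.

```python
from pathlib import Path
from typing import Callable


def html_equivalent(attachment: Path, cache_dir: Path,
                    convert: Callable[[Path, Path], None]) -> Path:
    """Return the cached HTML for `attachment`, converting on first access.

    `convert(src, dst)` is an assumed callback that writes the HTML
    rendering of `src` to `dst`; it is a placeholder for a real converter.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    html_path = cache_dir / (attachment.name + ".html")
    # Convert only if no HTML exists yet or the attachment has changed since.
    if (not html_path.exists()
            or html_path.stat().st_mtime < attachment.stat().st_mtime):
        convert(attachment, html_path)
    return html_path
```

Because the cached file is ordinary HTML on disk, a search script could index it alongside the topics. Converting at upload time instead would simply mean calling the same function right after the attachment is saved.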

I had previously considered automatic conversion of Word, PowerPoint and Excel files to HTML, so that attachments could be retrieved either in the original format or in a somewhat degraded HTML format. At that time I could not find any free software that was suitable or had acceptable conversion quality. Ideally the conversion utility would be written in Perl, for portability. Simple C without too many system or library dependencies would be acceptable too.

-- PeterThoeny - 09 Apr 2000

Check out wvHtml at http://www.wvware.com. It's standard with most Linux distros now and does quite a good job; you can even convert inline graphics to gif/jpg if you must...

-- CrisBailiff - 13 Feb 2001

Thanks for the pointer. By coincidence I got the same pointer from Matt Sergeant, director and CTO of http://axkit.org/ . Another library to investigate is the file conversion part of Sun's OpenOffice suite (formerly StarOffice).

For now we should first tackle AttachmentsUnderRevisionControl.

-- PeterThoeny - 18 Feb 2001

Cool. Hmm. Looks like we'd look for a wvTwiki ;^)

-- MartinCleaver - 26 Mar 2001

wvTWiki: Yes please! :-) I've spent the last four hours "porting" old system docs, many of which are already in HTML, to TWikiShorthand. I have barely scratched the surface and am beginning to wonder if it's really worth it.

Why convert docs that are already in HTML to wiki markup, you ask? Because raw HTML is hard to edit in the little edit form, especially if it's polluted with many, many gratuitous font tags. This means that even though the old docs can be viewed painlessly in TWiki, they are not likely to get updated, which defeats the main reason for bringing them in in the first place.

-- MattWilkie - 25 Oct 2001

Quick list of conversion programs:

  • xlHtml - for Excel documents
  • wvware - a library which allows access to Microsoft Word files. It can load and parse Word 2000, 97, 95 and 6 file formats
  • catdoc & xls2csv - convert Word to plain text and Excel to comma separated ascii (csv)
  • xpdf - view/convert pdf to text
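
A minimal sketch of how an upload script might choose among these tools by file extension. The mapping and the function name `pick_converter` are illustrative assumptions; `xlhtml` and `pdftotext` are assumed here to be the command names shipped by xlHtml and xpdf, and no command-line options are implied.

```python
from pathlib import Path

# Extension -> (command name, kind of output it produces).
# The tools come from the list above; the mapping itself is an assumption.
CONVERTERS = {
    ".doc": ("wvHtml", "html"),
    ".xls": ("xlhtml", "html"),
    ".pdf": ("pdftotext", "text"),
}


def pick_converter(filename: str):
    """Return (command, output kind) for a filename, or None if unknown."""
    return CONVERTERS.get(Path(filename).suffix.lower())
```

Attachments with no entry in the table would simply be served in their original format, as they are today.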

As part of changes I made to the TocPlugin, and some scripts to support it, I am using htmldoc to automatically convert sets of TWiki pages into large PDF files. Importing Word documents is time consuming, although I've had fairly good success using "copy" and "paste" to move masses of text into the TWiki edit boxes. Tables are slow (inserting all the vertical bar characters), but the results are good.
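
The vertical-bar chore could probably be scripted. The sketch below assumes the pasted table text arrives with one row per line and tab characters between cells, which is often but not always what a copy from a Word or Excel table yields; `tsv_to_twiki_table` is a hypothetical helper name.

```python
def tsv_to_twiki_table(text: str) -> str:
    """Turn tab-separated rows into TWiki table rows (| a | b |)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between rows
        cells = [cell.strip() for cell in line.split("\t")]
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join(rows)
```

For example, `tsv_to_twiki_table("Name\tQty\nBolt\t4")` gives the two rows `| Name | Qty |` and `| Bolt | 4 |`, ready to paste into the edit box.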

An automatic way to import either the HTML or the RTF into TWiki form would be good, but I'm not waiting for it. There is so much redundant and obscuring markup added by Word that I think one would almost need to take a compiler approach to the problem: build a parse tree, optimize for redundancy, then optimize for "code quality".
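
As a rough starting point, here is a much simpler single-pass tag filter rather than the parse-tree approach described above. It uses Python's standard `html.parser` module, drops tags it does not recognise (such as `font`) while keeping their text, and maps a handful of tags to TWiki shorthand. The class and function names are hypothetical, and the tag coverage is deliberately tiny.

```python
import re
from html.parser import HTMLParser

# Tags mapped to the TWiki shorthand that wraps or prefixes their text.
INLINE = {"b": "*", "strong": "*", "i": "_", "em": "_"}
HEADINGS = {"h1": "---+ ", "h2": "---++ ", "h3": "---+++ "}


class HtmlToTWiki(HTMLParser):
    """Tiny HTML -> TWiki shorthand filter (sketch, not a full converter)."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag in HEADINGS:
            self.out.append("\n" + HEADINGS[tag])
        elif tag == "li":
            self.out.append("\n   * ")
        elif tag == "p":
            self.out.append("\n\n")
        # Any other tag (font, span, div, ...) is dropped; its text is kept.

    def handle_endtag(self, tag):
        if tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag in HEADINGS:
            self.out.append("\n")

    def handle_data(self, data):
        # HTML treats runs of whitespace as a single space.
        self.out.append(re.sub(r"\s+", " ", data))


def html_to_twiki(html: str) -> str:
    parser = HtmlToTWiki()
    parser.feed(html)
    parser.close()
    text = "".join(parser.out)
    # Trim trailing spaces and squeeze runs of blank lines.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    return re.sub(r"\n{3,}", "\n\n", text).strip("\n")
```

A real importer would still need tables, links, nested lists and the redundancy pass; this only shows that the gratuitous font tags are the easy part.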

-- CarlMikkelsen - 21 May 2003

Isn't there a way to just refer to external documents? I have a project that generates documentation in HTML using Python's help() function. I don't want to duplicate the logic involved in help() to locate all of the relevant entities in order to extract their docstrings to produce Twiki output. If I could create a Twiki topic that automatically included the body of an external HTML document, that would work well.

Alternatively, I'd have to convert the HTML to Twiki format which is, apparently, possible, but how do I get that output into my Twiki? I'd be happy if there was a means to check out, replace, then check in a Twiki topic via some API. That would allow me to use a script to produce the HTML, convert it to Twiki format, then update the appropriate Twiki topic with the new HTML.

What have I missed? Surely there's functionality for doing one of those two things, right?

-- RobStewart - 13 Jun 2003

Hi Rob, welcome to twiki.org :-)

IncludeTopicsAndWebPages might work for you. There is also an interesting-sounding plugin, SlashFilenamePlugin, due out any day now.

-- MattWilkie - 13 Jun 2003

If a clickable link is enough, InterWiki is what I used. If not, you can define custom TWikiVariables that open and close an IFRAME.

-- PeterMasiar - 14 Jun 2003

If the attachments are MSWord documents then you can convert them to wiki-text using wv and the stylesheet I attached to MsOfficeIntegration.

-- TobyCabot - 28 Jul 2003

Topic revision: r12 - 2008-08-25 - TWikiJanitor
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.