
How to Include Existing LaTeX Docs in TWiki?

This topic is open for discussion on the best way to convert existing LaTeX source files to the TWiki markup language, or at least to view them as a fallback. This request has been brought up in a number of topics.

-- Contributors: ScottHoge


The challenge

AFAIK, to date there is no general .tex file parser available in Perl. In my experience, TeX and LaTeX are so feature-rich that parsing .tex files is difficult. Line-by-line parsers, such as ltoh, work reasonably well, but fail on complex environments. For example, one would like to convert

Latex Source | TWiki Markup
Ax = b | Ax = b
which requires more complexity than a line-by-line parser can provide.
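
To make the limitation concrete, here is a small sketch. It is written in Python only for brevity (the tools discussed in this topic are Perl), and the function names and sample text are made up for illustration. It shows that a pattern applied one line at a time never sees an equation environment that spans several lines, while the same pattern applied to the whole text does.

```python
import re

# Matches a complete equation environment, but only if the whole
# environment is present in the string being searched.
EQUATION = re.compile(r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)


def find_equations_by_line(source):
    """Line-by-line scan: every line is matched in isolation."""
    found = []
    for line in source.splitlines():
        found.extend(m.group(1).strip() for m in EQUATION.finditer(line))
    return found


def find_equations_whole_text(source):
    """Whole-document scan: a match may span several lines."""
    return [m.group(1).strip() for m in EQUATION.finditer(source)]


# Hypothetical sample: the environment is spread over three lines.
SAMPLE = "Consider\n\\begin{equation}\nAx = b\n\\end{equation}\nfor some $x$.\n"

if __name__ == "__main__":
    print(find_equations_by_line(SAMPLE))
    print(find_equations_whole_text(SAMPLE))
```

A real converter would still need to deal with nesting, labels, and macros, which a single regular expression does not cover.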

(One sign of how hard LaTeX parsing is: LaTeX support in OpenOffice.org is shallow at best. It was non-existent when I last looked a year ago.)

A few years back when I looked at this problem, I reached the conclusion that a parser based on Parse::RecDescent was the best way to go. This method generates a hash for each level in the document, enabling one to keep META data associated with a particular environment (e.g. label and equation) with the raw constructs. Is that still the case? Or can we leverage an intermediate program to convert to HTML (list below) that can then be fed into kupu, for example? Or are there other options?
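
As a rough picture of the "one hash per level" idea, here is a sketch in Python (used only for brevity). It is not Parse::RecDescent and does not use a grammar: it is a hand-written, stack-based pass, and the dict keys `env`, `label`, and `children` are names invented for this example. It only shows how a label can stay attached to the environment it occurs in.

```python
import re

# Only three constructs are recognised; everything else is plain text.
TOKEN = re.compile(r"\\begin\{(\w+\*?)\}|\\end\{(\w+\*?)\}|\\label\{([^}]*)\}")


def parse(source):
    """Return a nested dict tree with one dict per environment level."""
    root = {"env": "document", "label": None, "children": []}
    stack = [root]
    pos = 0
    for m in TOKEN.finditer(source):
        text = source[pos:m.start()].strip()
        if text:
            stack[-1]["children"].append(text)
        pos = m.end()
        begin, end, label = m.groups()
        if begin:
            node = {"env": begin, "label": None, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif end:
            if len(stack) == 1 or stack[-1]["env"] != end:
                raise ValueError(f"unbalanced \\end{{{end}}}")
            stack.pop()
        else:
            # The label is stored as metadata of the enclosing environment.
            stack[-1]["label"] = label
    tail = source[pos:].strip()
    if tail:
        stack[-1]["children"].append(tail)
    if len(stack) != 1:
        raise ValueError("unclosed environment: " + stack[-1]["env"])
    return root
```

With a tree like this, a later pass could look up `label` on an equation node when resolving cross-references, instead of re-scanning the raw text.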

A list of latex to HTML parsers I'm familiar with

  • ltoh
    • perl, line-by-line translator, easy to extend
  • tth
    • C, ???
  • latex2html
    • perl, but internals are not easily accessible/extendable
  • hevea
    • Objective CAML, most complete
  • tex2page
    • Scheme. This looks promising: under active development; currently uses tex->dvi->ps->png for images, but could easily be customized to use dvipng;

A more complete list is available here: http://www.tug.org/interest.html#web

Latex to RTF?

I've had reasonably good success with using latex2rtf to generate docs MS-Word can read. Can this be of any help?

An Alternative?

Would a more accessible approach be to save raw latex files as a TWiki topic, using TWiki to edit them, but render using a latex to HTML backend? One would lose some flexibility, e.g. TWiki:Plugins.SectionalEditPlugin, but gain in ease of inclusion.

Example: Declare a new TWiki macro, RENDERLATEX (or some such name). Enter

\section{ ... }

\subsection{ ... }

\bibliography{ ... }

and get back the corresponding HTML.
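
A minimal sketch of what such a back end might do for the sectioning commands, written in Python for brevity. The function name `render_latex` and the mapping of \section to `<h2>` and \subsection to `<h3>` are assumptions for illustration; a real RENDERLATEX would hand the text to one of the converters listed above. Everything that is not a recognised heading is HTML-escaped and wrapped in a paragraph, so commands such as \bibliography are left unrendered here.

```python
import html
import re

# Assumed mapping from sectioning command to HTML heading level.
HEADING_LEVEL = {"section": 2, "subsection": 3, "subsubsection": 4}
HEADING = re.compile(r"\\(section|subsection|subsubsection)\*?\{([^{}]*)\}")


def render_latex(source):
    """Convert blank-line separated blocks into headings or paragraphs."""
    out = []
    for block in re.split(r"\n\s*\n", source.strip()):
        block = block.strip()
        if not block:
            continue
        m = HEADING.fullmatch(block)
        if m:
            level = HEADING_LEVEL[m.group(1)]
            out.append(f"<h{level}>{html.escape(m.group(2))}</h{level}>")
        else:
            # Unhandled LaTeX is escaped, never passed through as raw HTML.
            out.append(f"<p>{html.escape(block)}</p>")
    return "\n".join(out)
```

Escaping all user text before it reaches the page is one small part of keeping a macro like this safe on a public wiki.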

OT: Hadn't seen this before today: http://www.cs.queensu.ca/drl/ffes/ Wow!

-- ScottHoge - 27 Feb 2006

Maybe MathML can be somehow exploited. I remember that the LaTeXToMathMLPlugin uses something called itex2MML.

I second the %RENDERLATEX% environment proposal if there exists an easy way to %INCLUDE% LaTeX code attached to a topic, although I fear that there will be some security issues.

I guess the best way to combine TWiki and LaTeX is to use TWiki for collaboration and LaTeX for laying out the collected content for publication. This could be supported by a special latex skin that generates plain LaTeX files, which can then be post-processed by your favorite LaTeX installation to create nice and compact PDFs. Oops, we can already do this. ;-)

-- FranzJosefSilli - 27 Feb 2006

Maybe %RENDERLATEX% is the way to go. If done well, one could use it as a raw import to at least view (and maybe edit) existing latex documents, and then the conversion to full TWiki markup can occur as needed.

The use of MathML or rendered math images would probably depend on the HTML back-end renderer that ultimately gets chosen. Although I thought it was fantastic when it first appeared, these days I tend to dislike latex2html (difficult to install and extend). I've heard good reviews of hevea, but the dependence on Objective CAML is a deterrent. I plan on giving tex2page a look to see if it is compatible with TWiki. Are there any others out there that deserve a look?

-- ScottHoge - 28 Feb 2006

Hey guys... I'm totally new here, but I have been editing the TWiki:Plugins.LatexModePlugin to allow viewing files written entirely in latex, where all one would have to do is copy and paste the entire latex document into %BEGINALLTEX% ... %ENDALLTEX% tags... Right now it works on a private wiki (for my purposes anyway)... So anyway I was wondering if anyone was interested in just taking a look at the code.

Right now it is basically a straight line-by-line parser that calls handleLatex(..) ...a lot. It basically looks at \begin{...} ... \end{...} blocks (it handles nesting quite well) and renders each block separately. Of course if the block is too big then there's a problem... so what it does now is handle cases. It converts all itemize environments into bulleted lists using the wiki's syntax (3*k spaces + * + space, for nesting depth k). It also handles the proof environment separately.
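
The itemize conversion described above could look roughly like the following. This is a sketch in Python, not the attached Perl code, and the function name `itemize_to_twiki` is made up. It assumes every piece of text inside an itemize environment is item text, so trailing text after a nested list comes out as its own bullet.

```python
import re

TOKEN = re.compile(r"\\begin\{itemize\}|\\end\{itemize\}|\\item\b")


def itemize_to_twiki(source):
    """Turn (possibly nested) itemize environments into TWiki bullets."""
    lines, depth, pos = [], 0, 0

    def emit(text):
        text = " ".join(text.split())  # collapse runs of whitespace
        if not text:
            return
        if depth:
            # TWiki bullet: three spaces per nesting level, then "* ".
            lines.append(" " * (3 * depth) + "* " + text)
        else:
            lines.append(text)  # text outside any list passes through

    for m in TOKEN.finditer(source):
        # Text is emitted at the depth it occurred in, before depth changes.
        emit(source[pos:m.start()])
        pos = m.end()
        token = m.group(0)
        if token.startswith("\\begin"):
            depth += 1
        elif token.startswith("\\end"):
            depth -= 1
            if depth < 0:
                raise ValueError("\\end{itemize} without \\begin{itemize}")
    emit(source[pos:])
    if depth != 0:
        raise ValueError("unclosed itemize environment")
    return "\n".join(lines)
```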

Anyway... that's all for now.

-- EvanChou - 07 Mar 2006

Yes, please feed back your changes. Let someone from the community (Scott?) check them for possible security risks.

-- FranzJosefSilli - 08 Mar 2006

The file is attached. I kind of forgot to document it... I will get around to doing that this weekend perhaps.

-- EvanChou - 10 Mar 2006

Thanks, Evan.

I took a look at your changes, and they all look very reasonable. Thanks for sharing! Any objection to including at least some of the changes in a later release?

-- ScottHoge - 11 Mar 2006

No objections at all. I'm glad that my changes might have some use outside of our own wiki.

-- EvanChou - 12 Mar 2006

Update: I tested the attached plugins last night, and found a few errors.

  1. The attached code does not include the security updates from v 2.4, so should not be used on publicly accessible sites unless the changes from v 2.4 are merged in.
  2. The parsing is a bit aggressive, and so I found that simple constructs fail to render. E.g.
The data acquired from each coil, $W_l$, can be described by
\begin{equation}  \label{eq:one}  s_l(G_y^g,t) = \int\!\!\!\int
\rho(x,y)\,W_l(x,y)\, e^{\jmath \gamma (G_x x t + G_y^g y \tau)} dx \ dy  \end{equation}

In summary, the attached code may work for some folks, but I'm still inclined to pursue a Parse::RecDescent implementation for more robust parsing.

-- ScottHoge - 08 May 2006

After struggling to get Parse::RecDescent to do exactly what I wanted, I decided to build a simple LaTeX parser instead. I plan to follow the strategy outlined in Evan's code. That is: parse what can be easily parsed, convert the parsed pieces to TML, and pass all unhandled chunks off to the image generation engine.

For starters, I plan to build a general (and very minimal) parser focused on three types:

  • environment blocks: \begin{} ... \end{}
  • simple commands: \command{}
  • simple macros: \macro

Handlers for specific commands of each type can then be added in as needed.
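
The plan above can be sketched as follows, in Python for brevity rather than Perl. All names here (`CHUNK`, `latex_to_tml`, the handler tables, `as_image`) are invented for this example. The `%BEGINLATEX% ... %ENDLATEX%` wrapper is only a stand-in, assumed here, for "pass the chunk to the image generation engine". Environments nested inside an environment of the same name are not handled.

```python
import re

# The three chunk types: environment blocks, simple commands, simple macros.
CHUNK = re.compile(
    r"\\begin\{(?P<env>[A-Za-z]+\*?)\}(?P<body>.*?)\\end\{(?P=env)\}"
    r"|\\(?P<cmd>[A-Za-z]+)\{(?P<arg>[^{}]*)\}"
    r"|\\(?P<macro>[A-Za-z]+)",
    re.DOTALL,
)


def as_image(raw):
    """Fallback (assumed wrapper): leave the raw LaTeX for the image engine."""
    return f"%BEGINLATEX%{raw}%ENDLATEX%"


# Handler tables; entries are added as specific conversions get written.
ENV_HANDLERS = {}
CMD_HANDLERS = {
    "section": lambda arg: f"---+ {arg}",
    "subsection": lambda arg: f"---++ {arg}",
    "textbf": lambda arg: f"*{arg}*",
}
MACRO_HANDLERS = {"LaTeX": lambda: "LaTeX"}


def latex_to_tml(source):
    """Convert what has a handler to TML; wrap every other chunk as LaTeX."""
    out, pos = [], 0
    for m in CHUNK.finditer(source):
        out.append(source[pos:m.start()])  # plain text passes through
        pos = m.end()
        if m.group("env"):
            handler = ENV_HANDLERS.get(m.group("env"))
            out.append(handler(m.group("body")) if handler else as_image(m.group(0)))
        elif m.group("cmd"):
            handler = CMD_HANDLERS.get(m.group("cmd"))
            out.append(handler(m.group("arg")) if handler else as_image(m.group(0)))
        else:
            handler = MACRO_HANDLERS.get(m.group("macro"))
            out.append(handler() if handler else as_image(m.group(0)))
    out.append(source[pos:])
    return "".join(out)
```

Keeping the handlers in plain lookup tables means a contributed conversion, for example for the theorem environment, would be a one-line addition to `ENV_HANDLERS`.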

So, feel free to attach example raw latex samples that I can use during development testing. Code handlers would be welcome as well. For example, one section of Evan's code focuses on converting the theorem environment to TML. I hope to post an example matching the equation environment example given above.

-- ScottHoge - 11 May 2006

Tonight I uploaded the latex parsing (and partial conversion) module that I've been toying with the past few months. It works reasonably well on a number of topics, but is likely to find more use as a .tex to TML module than a true show-latex-docs-in-twiki module. Time will tell.

To try it out, download the latest release from LatexModePlugin, and add in the Parse.pm file from SVN: http://svn.twiki.org/svn/twiki/trunk/LatexModePlugin/lib/TWiki/Plugins/LatexModePlugin

and then take a look at the included pod documentation. (e.g. perldoc Parse.pm, or use PerlDocPlugin)

This code is very alpha. It works well enough that I'm happy to let it loose in the wild, but it is not ready yet to include as part of a standard release.


-- ScottHoge - 30 Sep 2006

My parse implementation has been working OK now for a few months, but I've always felt it could be more robust. Central to this was a desire to rewrite the parsing logic to more closely resemble HTML::Parse.

It appears that at least two other folks had similar ideas. Tonight I came across LaTeX::TOM, which is exactly what I was looking for. As time permits, the Parse.pm module will start using this CPAN module, and eventually be included in the official release.

-- ScottHoge - 29 Jan 2007

Topic attachments
Attachment | History | Size | Date | Who | Comment
LatexModePlugin.pm (Perl source code) | r1 | 46.5 K | 2006-03-10 11:57 | EvanChou | Hacked up file, a derivative of LatexModePlugin
Topic revision: r19 - 2008-02-17 - SvenDowideit
Copyright © 1999-2018 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.