This page is intended to list the varieties of text markup available on Linux (and maybe other systems). For each one it should provide a sample or description (or references), its common filename extensions or identifying characteristics, the tools used to create it, the tools available to convert it to other formats, the conversions possible, and their limitations. (And maybe other things.)
For now, I am collecting tidbits of information. This page will require refactoring.
Here is a list of some of the formats / extensions that come to mind immediately:
- TeX
- LaTeX
- LyX
- KLyX
- .dvi
- hypertext
- .ps
- .pdf
- .txt
- .doc (MSWord, sorry!)
- The six varieties of .txt producible by MSWord
- sgml
- html
- xml
- xhtml
- .rtf
- info
- Texinfo
- man
- groff
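Several of the binary or semi-binary formats above can be recognized by their first few bytes rather than by extension. The following is a minimal sketch of that idea, not a replacement for the file(1) command; the function name sniff_format and the labels it returns are made up for illustration, and the text-format checks (XML, HTML, LaTeX) are rough heuristics only.

```python
def sniff_format(data: bytes) -> str:
    """Guess a document format from the leading bytes of a file.

    Hypothetical helper for illustration; the returned labels are arbitrary.
    """
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"%!"):
        # PostScript (and EPS) files conventionally start with "%!".
        return "ps"
    if data.startswith(b"{\\rtf"):
        return "rtf"
    # A DVI file opens with the 'pre' opcode (247) followed by format id 2.
    if data[:2] == bytes([247, 2]):
        return "dvi"
    # Binary Word .doc files are OLE2 compound documents (other
    # applications use the same container, hence "possibly").
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "ole2 (possibly .doc)"
    # The remaining checks are heuristics for plain-text markup.
    head = data[:1024].lstrip().lower()
    if head.startswith(b"<?xml"):
        return "xml"
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    if b"\\documentclass" in data[:4096] or b"\\documentstyle" in data[:4096]:
        return "latex"
    return "unknown"
```

To try it on a real file, read the first few kilobytes in binary mode and pass them in, for example sniff_format(open(path, "rb").read(4096)), where path is whatever file you want to check.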
Some limitations
- .dvi (and .pdf) -- can't cut and paste from the viewer; paging is odd (the spacebar pages in the Linux viewer; does it work in Acrobat?)
Some converters
- a2ps -- ascii to postscript
- catdoc -- Word (.doc) to plain text
- delatex, detex -- LaTeX to ASCII
- dvips -- .dvi to postscript (.ps)
- dvi2fax -- .dvi to fax
- dvilj -- .dvi to Hewlett-Packard LaserJet format
- dvilj2p -- .dvi to Hewlett-Packard LaserJet 2p format
- dvilj4 -- .dvi to Hewlett-Packard LaserJet 4 format
- eps2eps -- why? Because some applications output poor eps -- sometimes this helps.
- hevea -- LaTeX to HTML
- html2text -- HTML to plain text
- htmldoc -- HTML to PDF or PS
- latex2html -- LaTeX to HTML
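- pdf2ps -- PDF to PostScript (Ghostscript)
- pdftohtml -- PDF to HTML
- pdftops -- PDF to PostScript (xpdf)
- pdftotext -- PDF to plain text
- pdf2dsc -- PDF to a PostScript DSC page-list file (Ghostscript)
- pstricks -- not a converter; a TeX/LaTeX macro package for drawing PostScript graphics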
- ps2pdf -- postscript to pdf
- ps2ascii -- postscript to ascii
- ps2png -- postscript to png??
- ps2ps -- why? Because some applications output poor ps -- sometimes this helps.
- word2x -- Word to LaTeX
- col -b -- strips backspaces and doubled characters out of things like ascii man files
- To do: run apropos filter to look for more converters
- man2txt -- man page files (troff -man macro markup) to plain text; see http://www.vitanuova.com/inferno/man/1/man.html. See also this script (suggested by Brandon):
#!/bin/csh
# Save the formatted man page for $1 as plain text in $1.txt
man $1 | 2txt > $1.txt
- If none of the above works well enough, or none can be found, maybe man2html would be useful: run man2html and then strip out all the HTML tags. (One of my big desires is to have paragraphs recognized as paragraphs, that is, blocks of text, rather than as lines of text with line feeds between them.)
- man2html -- UNIX nroff(1) manual pages to HTML
- dos2unix, unix2dos -- convert line endings from DOS (CR LF) to Unix (LF) and vice versa
- bibtex2html
- wv -- Word to HTML (and others?)
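Two of the ideas in the list above, stripping backspace overstrikes the way col -b does and treating paragraphs as blocks of text rather than separate lines, can be sketched in a few lines. This is a simplified illustration and not a reimplementation of col: it only handles the usual nroff patterns (a character, a backspace, then the character that replaces it), and the function names strip_overstrikes and reflow_paragraphs are invented here.

```python
import re


def strip_overstrikes(text: str) -> str:
    """Remove nroff-style overstrikes such as 'N\\bN' (bold) or '_\\bN' (underline).

    Simplification: each backspace discards the previous character on the
    current line, so the character typed last wins.
    """
    out = []
    for ch in text:
        if ch == "\b":
            # Never back up across a line break.
            if out and out[-1] != "\n":
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


def reflow_paragraphs(text: str) -> str:
    """Join the lines of each paragraph into a single line.

    Paragraphs are assumed to be separated by one or more blank lines;
    the result separates them with exactly one blank line.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = [" ".join(block.split()) for block in blocks]
    return "\n\n".join(p for p in paragraphs if p)
```

Applied in that order to the output of man, the first function gives plain text and the second turns each block into one long line that a word processor or browser can wrap on its own. Headings and indented examples get merged into the surrounding paragraph, so real man pages would need more care than this.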
Other notes
TeX produces .dvi (device-independent) files, which are binary and not human-readable. First a marked-up plain-text file is written and run through TeX to create the .dvi file. The .dvi file is then run through another program (a driver such as dvips, or a viewer) to produce the finished text for printing or display.
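The chain described above (marked-up source to .dvi, then .dvi to something printable) can be written down as a list of commands. The sketch below only builds the command lines and runs them on request; it assumes latex, dvips and ps2pdf are installed and on the PATH, and the function names tex_pipeline and run_pipeline are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def tex_pipeline(source: str, engine: str = "latex") -> List[List[str]]:
    """Return the commands that take a .tex file to .dvi, .ps and .pdf.

    Output files are named after the source file and are expected in the
    current directory, which is where TeX writes them by default.
    """
    stem = Path(source).stem
    dvi, ps, pdf = stem + ".dvi", stem + ".ps", stem + ".pdf"
    return [
        [engine, source],          # marked-up text -> .dvi
        ["dvips", "-o", ps, dvi],  # .dvi -> PostScript
        ["ps2pdf", ps, pdf],       # PostScript -> PDF
    ]


def run_pipeline(source: str) -> None:
    """Run each step in order, stopping at the first failure."""
    for command in tex_pipeline(source):
        subprocess.run(command, check=True)
```

Calling tex_pipeline("paper.tex") just returns the three command lines, which is handy for checking them before anything is executed; run_pipeline("paper.tex") actually runs them. Passing engine="tex" would be the choice for a plain TeX source.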
Other Links
See other entries, like TeX, LaTeX, LyX, etc.
See Also (or Combine With)
Contributors
- RandyKramer - 08 Feb 2002
- Brendan -- 19 Feb 2002