SearchPDFPlugin

SearchPDFPlugin allows the contents of attached PDF files to be included in searches. Other plugins (SearchEngineSwishEAddOn and SearchEnginePluceneAddOn) cover this functionality, but they depend on CPAN modules or programs that do not run on a Windows server.

How does it work?

This plugin requires an external program to extract text from PDF files. The extracted text is stored in a META tag within the topic. The process has three main components:

SearchPDFPlugin: Handles the events raised when attachments are added to or removed from topics. It first checks whether the attachment is a PDF. If the attachment is being removed, it removes any META data associated with that attachment. If the attachment is being added, it writes an entry to the SearchPDF.txt file in the work area.

SearchPDF.txt: Tracks when new attachments have been added and need to be indexed. 

  • If this file contains the word ALL on a single line then all topics are checked for PDF attachments.

indexPDF.pl: Processes the SearchPDF.txt file in the work area. It calls the text extraction program to generate the META data and saves that data in the appropriate topic.
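
The flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's Perl code. All function names are hypothetical. The one-entry-per-line layout of SearchPDF.txt and the detection of PDFs by file extension are assumptions; the topic only states that a line containing just ALL triggers a scan of all topics. The pdftotext call passes `-` as the output file so the text goes to standard output, and `-enc UTF-8` to select the output encoding.

```python
import subprocess


def read_work_file(lines):
    """Return (index_all, pending) for the lines of a SearchPDF.txt-style file.

    Assumption: one pending entry per line. The real entry format is not
    documented in this topic.
    """
    entries = [line.strip() for line in lines if line.strip()]
    # A line consisting only of the word ALL requests a scan of all topics.
    index_all = any(entry == "ALL" for entry in entries)
    pending = [entry for entry in entries if entry != "ALL"]
    return index_all, pending


def is_pdf(attachment_name):
    """Assumption: an attachment counts as a PDF if its name ends in .pdf."""
    return attachment_name.lower().endswith(".pdf")


def build_extract_command(pdftotext_path, pdf_path):
    """Build the pdftotext command line; '-' sends the text to stdout."""
    return [pdftotext_path, "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdftotext_path, pdf_path):
    """Run pdftotext and return the extracted text (raises on failure)."""
    result = subprocess.run(
        build_extract_command(pdftotext_path, pdf_path),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

In this sketch, an indexing script would call `read_work_file` first, scan every topic if `index_all` is true, and otherwise run `extract_text` only for the pending entries.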

Plugin Installation Instructions

  • Download the ZIP file from the Plugin web (see below)
  • Unzip SearchPDFPlugin.zip in your root ($TWIKI_ROOT) directory. Content:

| File: | Description: |
| data/TWiki/SearchPDFPlugin.txt | This page. |
| lib/TWiki/Plugins/SearchPDFPlugin.pm | The plugin code. |
| pub/_work_areas/SearchPDFPlugin/SearchPDF.txt | Work file that stores recently attached PDFs that need to be indexed. It initially contains 'ALL', so the first time the script runs, all topics with PDFs are indexed. |
| tools/indexPDF.pl | Script that reads the SearchPDF.txt file and adds META data to topics. |

  • Create a new user TWikiSearchPDF that is a member of the TWikiAdminGroup (or edit the preferences below to select an account of your choice).
  • Download and install the XPDF program for extracting text from PDF files (http://www.foolabs.com/xpdf/download.html)
  • Edit the TWikiPreferences for your site and add the following:
   * Search PDF plugin needs a user account in order to modify topics
      * Set SEARCHPDFUSER = TWikiSearchPDF
      * Set SEARCHPDFUSERWEB = Main  
  • Add a line to LocalSite.cfg that specifies the location and name of the XPDF program:
    • $TWiki::cfg{Plugins}{SearchPDFPlugin}{XPDFLocation} = 'c:/Wiki/xpdf-3.02-win32/pdftotext.exe';
  • Visit configure in your TWiki installation, and enable the plugin in the {Plugins} section.
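
Before enabling the plugin, it may help to confirm that the path given for XPDFLocation really points at the extraction program. The check below is a minimal Python sketch and is not part of the plugin; `check_extractor` is a hypothetical helper, and the path passed to it should be whatever value was set in LocalSite.cfg.

```python
import os


def check_extractor(path):
    """Report whether the configured text extractor looks usable."""
    if not os.path.isfile(path):
        return f"not found: {path}"
    # On Windows this check passes for any existing file.
    if not os.access(path, os.X_OK):
        return f"not executable: {path}"
    return "ok"
```

For the example path above, this would be called as `check_extractor('c:/Wiki/xpdf-3.02-win32/pdftotext.exe')`.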

Plugin Info

  • Set SHORTDESCRIPTION = Search attached PDF documents.

Plugin Author: TWiki:Main.AndyBeardsall
Copyright: © 2007, TWiki:Main.AndyBeardsall
License: GPL (GNU General Public License)
Plugin Version: 5 Sept 2007 (V1.002)
Change History:
  • 08 Aug 2007: Initial version
  • 14 Aug 2007: Moved xpdf location variable to LocalSite.cfg, moved Plugin Prefs to TWikiPreferences in Main
  • 5 Sept 2007: Added xpdf executable name to the location entry in LocalSite.cfg and removed hard coded name from indexPDF.pl
TWiki Dependency: $TWiki::Plugins::VERSION 1.1
CPAN Dependencies: none
Other Dependencies: xpdf
Perl Version: 5.005
Benchmarks: GoodStyle nn%, FormattedSearch nn%, SearchPDFPlugin nn%
Plugin Home: http://TWiki.org/cgi-bin/view/Plugins/SearchPDFPlugin
Feedback: http://TWiki.org/cgi-bin/view/Plugins/SearchPDFPluginDev
Appraisal: http://TWiki.org/cgi-bin/view/Plugins/SearchPDFPluginAppraisal

Related Topics: TWikiPlugins, DeveloperDocumentationCategory, AdminDocumentationCategory, TWikiPreferences

Topic attachments
| Attachment | Size | Date | Who | Comment |
| SearchPDFPlugin.zip (r3) | 9.4 K | 2007-09-05 - 20:52 | AndyBeardsall | Plugin files - v1.002 |
Topic revision: r5 - 2007-09-05 - AndyBeardsall
 
Copyright © 1999-2015 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.