Question

Since we installed Google Mini we've seen a manyfold increase in the visit counts on WebStatistics.

The entries look like this:

| 01 Sep 2006 - 10:57 | TWikiGuest | rdiff | TWiki.TWikiTip016 | 1 1 gsa | 10.100.10.17 |

Is there a way to exclude entries marked gsa (in the extra info) from the statistics count?

Environment

TWiki version: TWikiRelease04x00x04
TWiki plugins: DefaultPlugin, EmptyPlugin, InterwikiPlugin
Server OS:  
Web server:  
Perl version:  
Client OS:  
Web Browser:  
Categories: Statistics

-- ArthurClemens - 01 Sep 2006

Answer


The user agent is identified in the extra info (5th cell). You can exclude gsa (Google Search Appliance) from the statistics.

Add the following to sub _collectLogData of lib/TWiki/UI/Statistics.pm (this is untested):

        my $userObj = $session->{users}->findUser($logFileUserName);
        
        my( $opName, $webTopic, $notes, $ip ) = @fields;

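        # ignore gsa (Google Search Appliance) spider; \b is used instead of
        # literal spaces so "gsa" still matches at the very end of the notes field
        next if( $notes && $notes =~ /\bgsa\b/ );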

        # ignore minor changes - not statistically helpful
        next if( $notes && $notes =~ /(minor|dontNotify)/ );

        # ignore searches for now - idea: make a "top search phrase list" 
        next if( $opName && $opName =~ /(search)/ );
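
To check the gsa filter outside of TWiki before patching Statistics.pm, a standalone sketch along these lines may help. It is not the TWiki code: is_gsa_hit is a hypothetical helper, the pipe-splitting is an assumption about how the log line is broken into fields, and the second log line is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TWiki): true if the extra-info (notes)
# field of a log line mentions the gsa user agent.
sub is_gsa_hit {
    my ($line) = @_;

    # Assumption: fields are pipe-separated and the whitespace around
    # each pipe is discarded by the split.
    my @fields = split /\s*\|\s*/, $line;

    # The line starts with a pipe, so the first field is empty; drop it.
    shift @fields while @fields && $fields[0] eq '';

    my ( $date, $user, $opName, $webTopic, $notes, $ip ) = @fields;
    return ( defined $notes && $notes =~ /\bgsa\b/ ) ? 1 : 0;
}

my @log = (
    # Entry from the question
    '| 01 Sep 2006 - 10:57 | TWikiGuest | rdiff | TWiki.TWikiTip016 | 1 1 gsa | 10.100.10.17 |',
    # Made-up entry from an ordinary browser, for comparison
    '| 01 Sep 2006 - 11:02 | TWikiGuest | view | TWiki.WebHome | Mozilla | 10.100.10.20 |',
);

# Keep only the entries that did not come from the gsa spider.
my @kept = grep { !is_gsa_hit($_) } @log;
print scalar(@kept), " of ", scalar(@log), " entries kept\n";
```

With a split like this the notes field arrives as "1 1 gsa" with no trailing space, which is why the sketch matches on word boundaries rather than on spaces around gsa.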

Also, for better performance and search results, use robots.txt to instruct the spider to exclude all scripts except view and viewfile.

-- PeterThoeny - 01 Sep 2006

Topic revision: r2 - 2006-09-01 - PeterThoeny
 