Question
Since we installed Google Mini we've seen a manyfold increase in the visit counts on WebStatistics.
The entries look like this:
| 01 Sep 2006 - 10:57 | TWikiGuest | rdiff | TWiki.TWikiTip016 | 1 1 gsa | 10.100.10.17 |
Is there a way to not include gsa (in the extra info) in the count script?
Environment
--
ArthurClemens - 01 Sep 2006
Answer
The user agent is identified in the extra info (5th cell). You can exclude gsa (Google Search Appliance) from the statistics.
Add the following to sub _collectLogData in lib/UI/Statistics.pm (this is untested):
my $userObj = $session->{users}->findUser($logFileUserName);
my( $opName, $webTopic, $notes, $ip ) = @fields;
# ignore gsa spider (word-boundary match, so it works whether or not
# the notes field keeps its surrounding spaces)
next if( $notes && $notes =~ /\bgsa\b/ );
# ignore minor changes - not statistically helpful
next if( $notes && $notes =~ /(minor|dontNotify)/ );
# ignore searches for now - idea: make a "top search phrase list"
next if( $opName && $opName =~ /(search)/ );
Also, for better performance and search results, use robots.txt to instruct the spider to exclude all scripts except view and viewfile.
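A robots.txt along these lines might do it. This is only a sketch: the /twiki/bin/ path is an assumption about where the scripts are installed, and the script list is illustrative, so check it against the contents of your own bin directory. Rules are prefix matches, so no entry may be a prefix of view or viewfile.

```
# Sketch only: /twiki/bin/ is an assumed install path, adjust to your site.
# Each Disallow is a prefix match, e.g. rdiff also covers rdiffauth.
User-agent: *
Disallow: /twiki/bin/attach
Disallow: /twiki/bin/changes
Disallow: /twiki/bin/edit
Disallow: /twiki/bin/manage
Disallow: /twiki/bin/oops
Disallow: /twiki/bin/preview
Disallow: /twiki/bin/rdiff
Disallow: /twiki/bin/register
Disallow: /twiki/bin/rename
Disallow: /twiki/bin/save
Disallow: /twiki/bin/search
Disallow: /twiki/bin/statistics
Disallow: /twiki/bin/upload
# view and viewfile are deliberately not listed, so they stay crawlable
```

Listing each script explicitly avoids relying on the non-standard Allow directive, which not every crawler honors.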
--
PeterThoeny - 01 Sep 2006