Title :Logging 404 errors in your custom statistics using Apache
and a PHP script.
Author :Ioannis Cherouvim
Email :morales[at]hack[dot]gr
Date :2005-05-12
Story :Are you using your own statistics generator for your PHP
website traffic? Yes, the kind of statistics that needs an
include statement at the top of each PHP file you want
to log, in order to capture the user's visit. I do as
well; it's cool, nice and makes you a better programmer.
Problem :The problem is that these stats are triggered only when
the user requests a PHP file that actually exists on
your server. If he types in yoursite.com/some-garbage-here,
the hit will not be logged in your custom stats. It will only
appear in the regular Apache logs, so you will see it only
if you also use statistics that are based on parsing the
server logs.
Solution:With a little help from the Apache web server we can capture
404 hits as well. All we need to do is instruct Apache to
hand every not-found request to a new script that we are
going to write to handle this case. The script will be
called error.php and will reside somewhere inside our
website. It will construct an informative error message
containing the request method, status, URL and query string,
and log that message in our custom stats.
Then it will immediately redirect the user to our real
website. So a) we track 404 errors, which can reveal
internal broken links or possible hack attempts, and
b) we never let a user see a 404 error message, because we
simply redirect him to a working URL of our
site, thus never losing a 'customer'.
Step a) :Put ErrorDocument lines for errors 403, 404 and 500 in a
UNIX-format text file called .htaccess in the root of your
website. They instruct the server to hand the request to
error.php whenever one of these errors occurs.
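The .htaccess file for this step could look like the sketch below. It assumes that error.php lives in the /core/ directory used in Step b) and that your server allows ErrorDocument overrides in .htaccess files (AllowOverride FileInfo or All); adjust the path to your own layout.

```apache
# .htaccess in the website root.
# Send 403, 404 and 500 errors to our own handler script.
# /core/error.php is the location assumed in this example.
ErrorDocument 403 /core/error.php
ErrorDocument 404 /core/error.php
ErrorDocument 500 /core/error.php
```

Because the target is a local path, Apache serves error.php through an internal redirect, so the script still has access to details of the original request.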
Step b) :Create an error.php script in the appropriate directory
on your server (in this example, the /core/ directory).
Build the desired error string and pass it to your
statistics code so that the error gets logged. Then redirect
to the correct site. This example will log something like:
ERROR[GET]>404>/ind3ex.php?file=get
This shows that there was a 404 error in the user's attempt
to call an 'ind3ex.php' file, which does not exist on our
server, using the 'GET' method and a 'file=get' query string.
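A minimal sketch of such an error.php follows. It is an illustration under stated assumptions, not the original script: the function names build_error_string() and log_visit() are hypothetical, log_visit() stands in for whatever your own statistics include provides, and the redirect target '/' is an assumed home page. It reads the REDIRECT_STATUS, REDIRECT_URL and REDIRECT_QUERY_STRING variables that Apache normally sets for ErrorDocument handlers, and it uses the ?? operator, so it needs PHP 7 or later.

```php
<?php
// error.php: builds a string such as ERROR[GET]>404>/ind3ex.php?file=get,
// logs it, then redirects the visitor to the real site.

// Build the error string from a $_SERVER-style array.
// Passing the array in makes the function easy to try on its own.
function build_error_string(array $server): string
{
    $method = $server['REQUEST_METHOD'] ?? 'UNKNOWN';
    // Set by Apache when it hands the request to an ErrorDocument.
    $status = $server['REDIRECT_STATUS'] ?? '000';
    $url    = $server['REDIRECT_URL'] ?? '';
    $query  = $server['REDIRECT_QUERY_STRING'] ?? '';

    // Append the query string only when the original request had one.
    if ($query !== '') {
        $url .= '?' . $query;
    }

    return 'ERROR[' . $method . ']>' . $status . '>' . $url;
}

// Skipped on the command line, so the function above can be tried
// there without sending any headers.
if (PHP_SAPI !== 'cli') {
    $message = build_error_string($_SERVER);

    // log_visit() is a hypothetical name for your own stats logger.
    // If it is not loaded, fall back to PHP's error log.
    if (function_exists('log_visit')) {
        log_visit($message);
    } else {
        error_log($message);
    }

    // Assumed target: the site's home page.
    header('Location: /', true, 302);
    exit;
}
```

Include your statistics file at the top of the script in the same way you do on your regular pages, so that the logging function is available before it is called.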