Say we have a line like this in one of our Apache logs:
1.2.3.4 - - [22/Jun/2012:04:05:55 +0000] "GET /this-is-something.html HTTP/1.1" 200 47452 "-" "rogerbot/1.0 (http://www.seomoz.org, rogerbot-crawler@seomoz.org)" 10.20.30.40
Let’s break it down parameter by parameter. The log in question uses a custom log format defined as:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %h" mylog
CustomLog /mnt/pharmdaily/log/pharmdaily-access.log mylog env=!dontlog
There are 10 “parameters” in this log format. The first line below is a legend; the actual fields are labelled (a) through (j).
(#) parameter, sample-value
(a) %{X-Forwarded-For}i, 50.19.77.29
(b) %l, -
(c) %u, -
(d) %t, [22/Jun/2012:04:05:55 +0000]
(e) "%r", "GET /dentaloral/regular-teeth-cleanings-could-cut-heart-attack-risk-study.html HTTP/1.1"
(f) %>s, 200
(g) %b, 47452
(h) "%{Referer}i", "-"
(i) "%{User-Agent}i", "rogerbot/1.0 (http://www.seomoz.org, rogerbot-crawler@seomoz.org)"
(j) %h, 10.170.195.204
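The field-by-field split above can also be done programmatically. The following is a minimal sketch in Python, not a complete parser: the pattern name, the group names and the `parse_line` helper are made up for illustration, and the regular expression assumes the exact ten-field format shown here (it does not try to handle quotes escaped inside the request or User-Agent).

```python
import re

# One named group per field (a)-(j). Group names are illustrative only.
LOG_PATTERN = re.compile(
    r'(?P<forwarded_for>.+?) '    # (a) %{X-Forwarded-For}i, may be a comma-separated list
    r'(?P<logname>\S+) '          # (b) %l
    r'(?P<user>\S+) '             # (c) %u
    r'\[(?P<time>[^\]]+)\] '      # (d) %t
    r'"(?P<request>.*?)" '        # (e) "%r"
    r'(?P<status>\d{3}) '         # (f) %>s
    r'(?P<size>\d+|-) '           # (g) %b
    r'"(?P<referer>.*?)" '        # (h) "%{Referer}i"
    r'"(?P<user_agent>.*?)" '     # (i) "%{User-Agent}i"
    r'(?P<remote_host>\S+)$'      # (j) %h
)


def parse_line(line):
    """Return a dict of the ten fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


sample = (
    '1.2.3.4 - - [22/Jun/2012:04:05:55 +0000] '
    '"GET /this-is-something.html HTTP/1.1" 200 47452 "-" '
    '"rogerbot/1.0 (http://www.seomoz.org, rogerbot-crawler@seomoz.org)" '
    '10.20.30.40'
)

fields = parse_line(sample)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Every value comes back as a string; converting the timestamp, status and size to richer types is a separate step.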
Tying this all together with the Apache documentation for mod_log_config, we get the following descriptions for these parameters:
(a) The contents of the X-Forwarded-For header line in the request sent to the server.
(b) Remote logname (from identd, if supplied). This will be a dash unless IdentityCheck is set to On.
(c) Remote user (from auth; may be bogus if the return status (%s) is 401).
(d) Time the request was received (standard English format).
(e) First line of the request.
(f) Status. For requests that were internally redirected, %>s is the status of the *last* request; plain %s would give the status of the *original* request.
(g) Size of the response in bytes, excluding HTTP headers, in CLF format, i.e. a '-' rather than a 0 when no bytes are sent.
(h) The contents of the Referer header line in the request sent to the server.
(i) The contents of the User-Agent header line in the request sent to the server.
(j) Remote host (logged as an IP address unless HostnameLookups is enabled).
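A few of these descriptions translate directly into conversion rules once the raw strings have been extracted. The sketch below is one possible way to do that in Python; the helper names are hypothetical, and the timestamp parsing assumes an English (C) locale so that `Jun` is recognised as a month abbreviation.

```python
from datetime import datetime


def parse_time(value):
    """Convert a %t value (without brackets) to a timezone-aware datetime."""
    # Apache's default %t layout: day/month-abbreviation/year:hour:min:sec offset
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")


def parse_size(value):
    """Convert a %b value to an int; CLF logs '-' when no bytes were sent."""
    return 0 if value == "-" else int(value)


def split_request(value):
    """Split a %r value into (method, path, protocol), or None if malformed."""
    parts = value.split(" ")
    return tuple(parts) if len(parts) == 3 else None


def split_forwarded_for(value):
    """X-Forwarded-For may carry several comma-separated addresses."""
    return [] if value == "-" else [ip.strip() for ip in value.split(",")]


when = parse_time("22/Jun/2012:04:05:55 +0000")
print(when.isoformat())
print(parse_size("47452"), parse_size("-"))
print(split_request("GET /this-is-something.html HTTP/1.1"))
print(split_forwarded_for("1.2.3.4"))
```

Treating `-` as zero bytes is a convenience choice here; code that needs to distinguish "nothing sent" from a missing value could return `None` instead.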