Common Log Format
From Just Solve the File Format Problem
Revision as of 13:15, 17 December 2012 by Dan Tobias (Talk | contribs)
The Common Log Format is a standardized log format used by a number of web servers to keep track of accesses to websites. It is the format used by default in Apache.
In Apache, the format is defined by the following format string, given to the LogFormat directive in the httpd.conf file:
"%h %l %u %t \"%r\" %>s %b"
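This consists of the following space-separated fields (the timestamp and request fields contain spaces themselves, which is why they are delimited by brackets and quotes):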
- Hostname or IP address of the client accessing the site. If a proxy server sits between the end user and the server, the proxy's address may be logged here instead of the actual client's address.
- RFC 1413 identity of the client; the Apache documentation describes this as unreliable, and it is usually blank (represented by a hyphen (-) in the file).
- Username of the authenticated user accessing the document; this will be a hyphen (-) for public web sites that have no user access controls.
- Timestamp string surrounded by square brackets, e.g. [12/Dec/2012:12:12:12 -0500]
- HTTP request surrounded by double quotes, e.g., "GET /stuff.html HTTP/1.1"
- HTTP status code, e.g. 200 for a successful request or 404 for not found.
- Number of bytes transferred in the requested object, not counting the response headers; a hyphen (-) is logged when no bytes were sent.
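The field list above can be turned into a small parser. The following is a minimal sketch in Python, not a reference implementation: the names CLF_PATTERN and parse_clf_line are made up for illustration, the sample line uses an invented address and byte count, and the timestamp parsing assumes English month abbreviations (the default C locale).

```python
import re
from datetime import datetime

# One named group per field, in the order listed above.
# The timestamp is delimited by [ ] and the request by " ".
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Assumes English month names, e.g. 12/Dec/2012:12:12:12 -0500
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A hyphen in the size field is treated as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Hypothetical sample line built from the examples in the field list.
sample = '192.0.2.1 - - [12/Dec/2012:12:12:12 -0500] "GET /stuff.html HTTP/1.1" 200 2326'
entry = parse_clf_line(sample)
print(entry["host"], entry["status"], entry["size"], entry["request"])
```

Matching the whole line with a single expression, rather than splitting on spaces, keeps the bracketed timestamp and the quoted request intact even though both contain spaces.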
Note that this format does not include the user agent string or referrer; you need to use the Combined Log Format to include these fields.
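To tell the two formats apart in an existing log, one option is to check for the extra trailing fields. The sketch below assumes that the Combined Log Format appends the referrer and the user agent as two double-quoted fields after the byte count, as in Apache's stock configuration; the function name classify_line and the sample lines are hypothetical.

```python
import re

# A double-quoted field; backslash-escaped characters are allowed inside.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# The seven Common Log Format fields.
COMMON = r'\S+ \S+ \S+ \[[^\]]+\] ' + QUOTED + r' \d{3} (?:\d+|-)'
# Assumption: Combined adds two quoted fields (referrer, user agent).
COMBINED = COMMON + ' ' + QUOTED + ' ' + QUOTED

def classify_line(line):
    """Return "combined", "common", or None for an unrecognized line."""
    line = line.strip()
    if re.fullmatch(COMBINED, line):
        return "combined"
    if re.fullmatch(COMMON, line):
        return "common"
    return None

common_line = '192.0.2.1 - - [12/Dec/2012:12:12:12 -0500] "GET /stuff.html HTTP/1.1" 200 2326'
combined_line = common_line + ' "http://example.com/start.html" "Mozilla/5.0"'
print(classify_line(common_line), classify_line(combined_line))
```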