Increase Your Web Traffic

Guide to Web Server Logs and Errors

When you want to track Web site stats, server log files are essential resources. Because the location of server log files is defined when a Web server is installed, my server log files and your server log files are probably stored in completely different directories. The best way to find the server log files is to ask your service provider or the webmaster where they are stored.

Access Log

The access log is the key to discovering who is visiting your Web site and why. Every time someone requests a file from your Web site, an entry is made in the access log, making the access log a running history of every successful and unsuccessful attempt to retrieve information from your Web site. Because each entry has its own line, entries in the access log can be extracted easily when you want to compile stats for your Web site. Compiling stats from access log entries is covered in the Saturday Morning section of the book under "Understanding Visits and Page Views." Entries in the access log look like this:

ds.ucla.edu - - [02/Mar/1997:13:38:31 -0800] "GET /dice.gif HTTP/1.0" 200 1714

Common Log Format

The most prevalent format for server access logs is the Common Log Format. As summarized in Table 1, access log entries in the Common Log Format have seven fields.

Table 1: Fields in the Common Log Format

Field | Name | Description
1 | Host Field | Identifies the host computer requesting a file from your Web server. The value in this field is either the fully qualified domain name of the remote host or its IP address.
2 | Identification Field | Identifies users by their username per RFC 931. This field is rarely used, so you will generally see a hyphen (-) in it.
3 | User Authentication Field | Used with protected areas at your Web site. Unless you have a password-protected area at your Web site, you will usually see a hyphen (-) in this field.
4 | Time Stamp Field | Tells you exactly when someone accessed a file on the server. Because the format of the time field is very specific, the time field can be extracted to perform many different calculations. The format for the time stamp field is: DD/MMM/YYYY:HH:MM:SS OFFSET.
5 | HTTP Request Field | Helps you determine three things: the method the remote client used to request the information, the file the remote client requested, and the HTTP version the client used to retrieve the file.
6 | Status Code Field | Tells you the status of the data transfer. From this field, you can determine whether files were transferred correctly, weren't found, were loaded from cache, and more.
7 | Transfer Volume Field | Indicates the number of bytes transferred to the client as a result of the request. If a status code other than a success code is used in field six, this field will contain a hyphen (-) or a zero to indicate that no data was transferred.
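As a rough illustration of how the seven fields can be pulled out of an entry, here is a minimal Python sketch. The pattern, the group names, and the parse_clf helper are assumptions made for this example and are not part of any server software; the sample entry is the one shown earlier. The month abbreviation is parsed with %b, which assumes an English locale.

```python
import re
from datetime import datetime

# Illustrative pattern for one Common Log Format entry (seven fields).
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

def parse_clf(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Time stamp format: DD/MMM/YYYY:HH:MM:SS OFFSET
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A hyphen in the transfer volume field means no data was transferred.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

entry = parse_clf(
    'ds.ucla.edu - - [02/Mar/1997:13:38:31 -0800] "GET /dice.gif HTTP/1.0" 200 1714'
)
# The request field holds the method, the requested file, and the HTTP version.
method, path, version = entry["request"].split()
print(entry["host"], entry["status"], entry["size"])  # ds.ucla.edu 200 1714
print(method, path, version)                          # GET /dice.gif HTTP/1.0
```

Returning None for lines that do not match lets a caller skip damaged entries instead of stopping partway through a log.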

Extended Log Format

A log format growing in popularity is the Extended Log Format. Access log entries in the Extended Log Format have nine fields. Because the first seven fields are the same as those shown in Table 1, Table 2 summarizes only the two additional fields.

Table 2: Fields in the Extended Log Format

Field | Name | Description
8 | Referrer Field | Specifies the URL of the page the client was at before visiting the page referenced in field 5.
9 | Agent Field | Specifies the name and version of the browser that requested the file on your server.
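The two extra fields can be handled by extending a pattern for the first seven. The Python sketch below assumes the referrer and agent are each written in double quotes at the end of the line, as many servers do; your server's layout may differ. The sample entry is hypothetical, assembled from values that appear elsewhere in this guide, and the names EXTENDED_PATTERN and parse_extended are made up for this example.

```python
import re

# Illustrative pattern: the seven Common Log Format fields followed by a
# quoted referrer and a quoted agent (assumed layout, check your own logs).
EXTENDED_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_extended(line):
    """Return a dict of the nine fields as strings, or None on a mismatch."""
    match = EXTENDED_PATTERN.match(line.strip())
    return match.groupdict() if match else None

# Hypothetical entry built for illustration only.
sample = (
    'ds.ucla.edu - - [02/Mar/1997:13:38:31 -0800] "GET /ci/w2.htm HTTP/1.0" '
    '200 1714 "http://www.netdaily.com/" '
    '"Mozilla/3.0 (compatible; MSIE 4.0; Macintosh)"'
)
fields = parse_extended(sample)
print(fields["referrer"])
print(fields["agent"])
```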

Referrer Log

The referrer log tells you exactly where a client was before arriving at your Web site. Entries in the referrer log look like this:

http://www.netdaily.com/ -> /ci/w2.htm

In the combined or extended log format, this referrer information is added to the access log. Table 3 summarizes the fields in the referrer log.

Table 3: Fields in the Referrer Log

Field | Name | Description
1 | Referrer Field | Indicates the referrer, which is the page the user came from.
2 | Separator Field | Separates the first and last fields (->).
3 | Local URL Field | Indicates the relative URL of the page on your Web site that the client requested.
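Because the separator is fixed, a referrer log entry can be split with a single string operation. This is a small Python sketch; the function name parse_referrer_entry is invented for the example.

```python
def parse_referrer_entry(line):
    """Split 'referrer -> local URL' into a (referrer, local_url) tuple.

    Returns None when the separator is missing.
    """
    referrer, separator, local_url = line.strip().partition(" -> ")
    if not separator:
        return None
    return referrer, local_url

print(parse_referrer_entry("http://www.netdaily.com/ -> /ci/w2.htm"))
# ('http://www.netdaily.com/', '/ci/w2.htm')
```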

Agent Log

The agent log tells you the name and version of the browser that requested a file on your server. The values placed in the agent log are taken from the User-Agent field that browsers supply in the HTTP header accompanying a file request. Although entries in the agent log don't follow a strict format, a general format for entries is the following:

browser name/version (supplemental information)

Entries in an actual agent log look like this:

aolbrowser/1.1 InterCon-Web-Library 1.2 (Macintosh, 68K)

Lynx/2-4-2 libwww/unknown

Mozilla/3.0 (compatible; MSIE 4.0; Macintosh)
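Since agent entries only loosely follow the general format, any parser has to be forgiving. The Python sketch below takes the text before the first slash as the browser name, the text up to the next space as the version, and treats everything after that as supplemental information. The pattern and the parse_agent name are assumptions for this example, and entries that do not fit are returned as None.

```python
import re

# Illustrative pattern: name/version followed by optional supplemental text.
AGENT_PATTERN = re.compile(r'^(?P<name>[^/\s]+)/(?P<version>\S+)\s*(?P<extra>.*)$')

def parse_agent(line):
    """Return (name, version, extra) or None if the entry does not fit."""
    match = AGENT_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("version"), match.group("extra")

for agent in (
    "aolbrowser/1.1 InterCon-Web-Library 1.2 (Macintosh, 68K)",
    "Lynx/2-4-2 libwww/unknown",
    "Mozilla/3.0 (compatible; MSIE 4.0; Macintosh)",
):
    print(parse_agent(agent))
```

Note that the name reported by the browser is not always the browser itself: the third entry announces Mozilla/3.0 while the supplemental information names MSIE 4.0.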

Status Code Classes

Status codes are defined in the HTTP specification and are universal to all Web servers. All status codes are three-digit numbers. Because the first digit of the status code indicates the class of the code, you can often tell at a glance what has happened. Table 4 shows the general classes for status codes. Table 5 lists status codes defined in the HTTP/1.1 specification.

Table 4: General Classes of Status Codes

Code | Class Description
1XX | Continue/Protocol Change
2XX | Success
3XX | Redirection
4XX | Client error/failure
5XX | Server error
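Because the class depends only on the first digit, it can be looked up with integer division. A minimal Python sketch using the class descriptions from Table 4; the names STATUS_CLASSES and status_class are made up for the example.

```python
# Class descriptions keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Continue/Protocol Change",
    2: "Success",
    3: "Redirection",
    4: "Client error/failure",
    5: "Server error",
}

def status_class(code):
    """Map a three-digit status code to its general class."""
    return STATUS_CLASSES.get(int(code) // 100, "Unknown")

print(status_class(200))    # Success
print(status_class("404"))  # Client error/failure
```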

Table 5: HTTP/1.1 Status Codes

Code | Description
100 | Continue
101 | Switching Protocols
200 | File transfer OK
201 | Created
202 | Accepted
203 | Non-Authoritative Information
204 | No content
205 | Reset Content
206 | Partial Content
300 | Multiple Choices
301 | Moved Permanently
302 | File moved temporarily
303 | See Other
304 | File not modified
305 | Use Proxy
400 | Invalid request
401 | Client not authorized to access file
402 | Payment Required
403 | Client forbidden from accessing file or directory
404 | File not found
405 | Method Not Allowed
406 | Not Acceptable
407 | Proxy Authentication Required
408 | Request Time-out
409 | Conflict
410 | Gone
411 | Length Required
412 | Precondition Failed
413 | Request Entity Too Large
414 | Request-URI Too Large
415 | Unsupported Media Type
500 | Internal server error
501 | Not Implemented
502 | Bad gateway
503 | Service unavailable
504 | Gateway Time-out
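One simple use of the status code field is to count how often each code appears in an access log. The Python sketch below assumes entries in the Common Log Format, where the status code is the second-to-last field on the line; it would need adjusting for the Extended Log Format. The first sample entry is the one shown earlier in this guide; the other two are hypothetical entries added for illustration, and tally_status_codes is a made-up helper name.

```python
from collections import Counter

def tally_status_codes(lines):
    """Count status codes in Common Log Format entries."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # In the Common Log Format the status code is the second-to-last field.
        if len(fields) >= 2 and fields[-2].isdigit():
            counts[int(fields[-2])] += 1
    return counts

sample_lines = [
    'ds.ucla.edu - - [02/Mar/1997:13:38:31 -0800] "GET /dice.gif HTTP/1.0" 200 1714',
    # Hypothetical entries for illustration only.
    'ds.ucla.edu - - [02/Mar/1997:13:38:35 -0800] "GET /missing.htm HTTP/1.0" 404 -',
    'ds.ucla.edu - - [02/Mar/1997:13:38:40 -0800] "GET /dice.gif HTTP/1.0" 304 0',
]
counts = tally_status_codes(sample_lines)
for code in sorted(counts):
    print(code, counts[code])
```

A large share of 404 entries in such a tally usually points to broken links or missing files worth tracking down.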

(c)1997-1998 William R. Stanek

All Rights Reserved