Odd Log Entries

Volume 6, Issue 28; 03 Jun 2003

Caching effects?

Being a new blogger, I'm still fascinated by access patterns in the server logs. I have a little window up on one of my desktops where a (slightly filtered) tail of the log drifts by throughout the day.

My filter highlights errors, so I tend to notice them even when I'm letting the rest of the log flow by uninspected. Most often these errors are obvious typos or artifacts of some sloppy forwarding on my part. Sometimes they're evidence that I've bungled something. But I've seen several that I can't explain. One such example is:

nnn.nnn.nnn.nnn - - [03/Jun/2003:11:14:41 -0400]
"GET /graphics/rssicon.jpg HTTP/1.1" 404 300
"http://norman.walsh.name/2003/05/19/learning"
"Opera/7.10 (Windows 98; U)  [en]"

I understand this entry to say that the machine with the IP address “nnn.nnn.nnn.nnn” attempted to GET /graphics/rssicon.jpg, giving http://norman.walsh.name/2003/05/19/learning as the referrer, and that the attempt generated a 404 error. All well and good, except that there isn't any reference to /graphics/rssicon.jpg in “learning” (or in any of the other referrers that turn up in the logs).
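To make that reading of the entry concrete, here is a minimal Python sketch that splits such a line into its fields. It assumes the log is in Apache's "combined" format and that the entry above is a single line wrapped for display. The names COMBINED, parse_entry, and is_error are made up for illustration; this is not the filter I actually run.

```python
import re

# Assumed layout: Apache "combined" log format, one entry per line.
# host ident user [time] "request" status size "referrer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_entry(line):
    """Return the fields of one log line as a dict, or None if it doesn't match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    return entry

def is_error(entry):
    """Treat any 4xx or 5xx status as an error worth highlighting."""
    return entry["status"] >= 400

# The entry from above, joined back into a single line.
line = (
    'nnn.nnn.nnn.nnn - - [03/Jun/2003:11:14:41 -0400] '
    '"GET /graphics/rssicon.jpg HTTP/1.1" 404 300 '
    '"http://norman.walsh.name/2003/05/19/learning" '
    '"Opera/7.10 (Windows 98; U)  [en]"'
)

entry = parse_entry(line)
if entry is not None and is_error(entry):
    print(entry["host"], entry["request"], entry["status"], entry["referrer"])
```

The referrer field is whatever the browser chose to send, so it records what the client believed was the referring page, not what that page contains today.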

The underlying cause of this problem is that I recently reorganized the web server a bit and moved that graphic. So there was an earlier version of the “learning” page that contained that reference, but that version doesn't exist anymore.

So how does this spurious GET come about? Are there browsers or caching proxies that store the pages but not the graphics? If so, the browser would attempt to get the base URI, find a copy in the cache, then attempt to get the graphics, fail to find them in the cache, and go back to the server for them. That's my best guess.