You asked to GET what!?

I glance at the server logs occasionally and (this is probably well known) I just noticed something, or rather its significance just dawned on me.

With disturbing frequency I see things like this in the access log:

GET /topics#Photography …
GET /dates#Y2002 …

This is broken. The client isn’t supposed to send the fragment identifier to the server. Luckily, Apache seems to do the right thing.
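To make the expected behavior concrete, here is a minimal Python sketch of what a well-behaved client does before it builds the request line: it splits off the fragment and keeps it to itself. The function name `request_target` and the `example.org` URLs are my own, purely for illustration; none of the clients listed below are implemented this way.

```python
from urllib.parse import urldefrag, urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that belongs on the request line.

    Hypothetical helper: the fragment is kept on the client side and is
    never included in what gets sent to the server.
    """
    # urldefrag separates "http://host/path#frag" into the URL and "frag".
    without_fragment, _fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


if __name__ == "__main__":
    # example.org is a placeholder host, not the site discussed here.
    print("GET", request_target("http://example.org/topics#Photography"))
    print("GET", request_target("http://example.org/dates#Y2002"))
```

The fragment is still useful to the client afterwards (it scrolls to that anchor once the page arrives); it just has no business in the GET.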

The culprits:

Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Mozilla/4.0 (compatible; MSIE 5.0; Windows 95; DigExt)
Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)
Mozilla/4.04 [en] (WinNT; I ;Nav)
Mozilla/4.06 [en] (Win98; I)
Mozilla/4.7 [en] (Win98; I)
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/124 (KHTML, like Gecko) Safari/125.1
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85.7 (KHTML, like Gecko) Safari/85.5
Mozilla/5.0 (Macintosh; U; PPC; en-US; rv:1.2.1) Gecko/20021130
Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.2.1) Gecko/20021204
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020408
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021130
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
P3P Client

Silly clients.

Now, I also see things like this:

GET /topics%23SelfReference
GET /dates%23Y2003

Which is just totally, totally wrong. The client has percent-escaped the hash mark before sending it! Apache sends back a 404 for that nonsense (though I suppose I could do better with a rewrite rule of some sort).

The culprits: Mozilla/4.0 and Mozilla/4.06 on Windows. Folks: upgrade your browsers!
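For what it's worth, the logic such a rewrite would need is simple enough. The following is a rough Python sketch of it, not an Apache rewrite rule and not something running on this server: cut the requested path at the first literal `#` or escaped `%23` and serve whatever is left. The name `strip_stray_fragment` is hypothetical, and the sketch assumes no legitimate URL on the site contains an escaped hash mark; if one did, it would be truncated as well.

```python
import re

# Matches a literal hash mark or its percent-escaped form (%23).
_FRAGMENT_MARK = re.compile(r"#|%23")


def strip_stray_fragment(path: str) -> str:
    """Drop everything from the first '#' or '%23' onward.

    Hypothetical helper; assumes the site has no real paths that
    contain an escaped hash mark.
    """
    return _FRAGMENT_MARK.split(path, maxsplit=1)[0]


if __name__ == "__main__":
    for requested in ("/topics%23SelfReference", "/dates#Y2002", "/topics"):
        print(requested, "->", strip_stray_fragment(requested))
```

Redirecting the client to the cleaned path, rather than silently serving it, would probably be the politer choice, since it at least gives the browser a chance to end up at a sane URL.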

Comments:

Safari is broken, as were early versions of Mozilla. But the ones claiming to be MSIE are almost certainly spambots lying about their User-Agent. I used to forbid all access to anyone asking for a hash, assuming they were all broken spambots... until Safari came out. *sigh*

Posted by Mark Pilgrim on 03 Apr 2004 @ 07:34pm UTC #

You're right, of course, that this browser behavior does not follow the HTTP/1.1 spec and is therefore broken.

However, I think that the spec *should* allow, even mandate, fragids to be sent (when the browser is displaying a page with a fragid), because it would allow new kinds of fragids (e.g. xpointer schemes) to be deployed with some server-side support. I see this as similar to how Citeseer looks at the referrer, and when you're coming from Google, highlights the words you've searched for like the Google cache does. (At least it used to; for some reason it didn't work when I tried right now.) IMHO, implementing new kinds of xpointers server-side by highlighting the linked document portion would be very helpful.

Posted by Benja Fallenstein on 04 Apr 2004 @ 12:58pm UTC #

Some sort of server-side mechanism for fragments would be nice, but doesn't the slash work just fine for that?

Posted by Norman Walsh on 07 Apr 2004 @ 12:12pm UTC #
Comments on this essay are closed. Thank you, spammers.