Who knows whether the best of men be known? or whether there be not more remarkable persons forgot, than any that stand remembered in the known account of time.
The admonition came from an exchange on #foaf. These chats are generally logged on the web, but I can find no record of a log for 05 May 2003 when this discussion [lightly edited] occurred:
...
<nwalsh> Ok, so I'm going to have to fiddle things so that I make a link for each [individual in my foaf file]. Which is preferable:
<nwalsh> Making the foaf:know entries appear literally in the foaf.rdf file, or making the foaf.rdf file contain a whole set of <foaf:knows rdf:resource="otherfile#id"/>
<mortenf> the first one - or the third option, the foafy one, saying <foaf:knows foaf:mbox_sha1sum="..."/>
<nwalsh> Hmmm.
<mortenf> it's generally considered bad foaf karma to assign uris to people...
<nwalsh> Hmmm again.
<nwalsh> I've already done that, for my own purposes, but I guess I don't have to expose it
<mortenf> if you do option two, you're assigning a uri to each person instead of using a blank node.
<nwalsh> Ok. Thanks. I'll have to think some more.
It's the last line that earned this problem a mention on my “tough nuts” page.
I can see three objections to assigning URIs to people:
Proliferation of URIs for the same resource. It's entirely consistent within the web architecture to have multiple URIs for the same resource, but fewer is better. It's easier for machines and people to tell that two URIs identify the same resource if the URIs are spelled the same.
Impertinence. It strikes me as a tiny bit impertinent for me to assign a URI for you. Sean, you are http://.../knows/who#sean-b-palmer. “Feh,” you might reply, “am not!”
As it turns out, I decided to expose the URIs that I assigned anyway. Sean, that really is you! But out of respect for FOAF karma, I didn't put those URIs in my FOAF file.
Instead, I've identified the people I foaf:know only by the SHA1 hash of their mailbox. If I've also got a persistent URI for them, I made that URI a
That's either a good compromise, or the worst of both worlds. I'm not sure which.
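For readers who haven't met the mailbox hash before, here is a minimal sketch of how such a value can be computed. It assumes the FOAF convention that the hash is taken over the full mailbox URI, `mailto:` prefix included; the function name `mbox_sha1sum` and the address `someone@example.org` are made up for illustration and are not taken from my actual FOAF file.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the hex SHA1 digest of the mailto: URI for an email address."""
    # Assumption: the hash covers the whole mailbox URI, scheme included.
    uri = email if email.startswith("mailto:") else "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Hypothetical address, used only to show the call.
digest = mbox_sha1sum("someone@example.org")
print(digest)
```

The point of the hash is that two FOAF files can agree they are talking about the same mailbox without either one publishing the address itself.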
With respect to the argument about proliferation of URIs, I don't see how “using a blank node” simplifies the problem. It strikes me that multiple blank nodes are no less confusing to machines and people than multiple URIs. In fact, multiple blank nodes are probably more confusing to people.
The central problem seems to be this: how do you merge these multiple identifiers so that you can tell that the various statements you've got are really about the same resource? Sean points to a good discussion by Dan Brickley of this “smooshing” problem.
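To make the merging idea concrete, here is a rough sketch of one way it could work. It is not a description of Dan Brickley's approach; it simply assumes that two descriptions carrying the same mailbox hash describe the same person, whether the node is blank or has a URI. The record layout (`id`, `mbox_sha1sum`, `props`), the function name `smoosh`, and every value in the sample data are invented placeholders.

```python
from collections import defaultdict


def smoosh(descriptions):
    """Merge person descriptions that share the same mailbox hash.

    Each description is a dict with:
      "id"           - a blank node label or a URI
      "mbox_sha1sum" - the mailbox hash, or None if unknown
      "props"        - a dict mapping property names to lists of values
    """
    merged = {}
    for d in descriptions:
        # With no hash there is nothing to merge on, so the node stays alone.
        if d.get("mbox_sha1sum"):
            key = ("hash", d["mbox_sha1sum"])
        else:
            key = ("id", d["id"])
        person = merged.setdefault(key, {"ids": set(), "props": defaultdict(set)})
        person["ids"].add(d["id"])
        for prop, values in d["props"].items():
            person["props"][prop] |= set(values)
    return list(merged.values())


# Placeholder data: the hashes are shortened fakes, not real SHA1 values.
people = smoosh([
    {"id": "_:a", "mbox_sha1sum": "0f3c", "props": {"foaf:name": ["Example Person"]}},
    {"id": "http://example.org/who#p1", "mbox_sha1sum": "0f3c", "props": {"foaf:nick": ["ep"]}},
    {"id": "_:b", "mbox_sha1sum": None, "props": {"foaf:name": ["Someone Else"]}},
])
print(len(people))
```

In this sketch a blank node and a URI end up in the same merged record as soon as their hashes match, which is why the choice between them matters less than it first appears, provided every description carries the hash.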
To quote Lewis Carroll for the second time in two days, “‘When I use a word,’ Humpty Dumpty said in a rather scornful tone, ‘it means just what I choose it to mean—neither more nor less.’”