Modeling names and addresses. No, not that old debate, the sort that appear in your address book.

Years ago, when I was using a Palm device for my address book, calendar, etc., I arranged to convert that data into RDF. I described that work in “Generalized Metadata in your Palm”, a paper that I presented at Extreme Markup Languages in 2002.

When I converted from the Palm to the Sidekick, I temporarily lost the RDF. I had no trouble, thanks to Dan Connolly and the T-Mobile XML/RPC interface, getting XML out, but I wasn't getting RDF. (I could have, as Dan does, but I didn't because it was quicker to get my local infrastructure running again just from the straight XML.)

Recently, I decided it was time to get the RDF back. I want to be able to combine the contacts in my address book with other data sources in ways that RDF makes easy, and I want to be able to do inference over contacts again. In addition, I now have a tool that will validate my RDF. Validation (does this instance conform to the model I've described?) was one of the first things I asked about when I started using RDF. Only after the publication of OWL does it seem that such tools have actually been widely deployed. (I'm using Pellet at the moment.)

Designing the ontology 

Given that I can now validate my RDF, I'm much more motivated to write a schema for my model. Designing an RDF schema isn't unlike other design exercises; it consists principally of dividing the world into classes, identifying properties on those classes, and defining the relationships between classes and properties.

My first instinct was to write my own ontology from scratch, defining a class for contacts and properties on resources of that class: first name, last name, email addresses, phone numbers, postal addresses, etc. In fact, that's just what I did. But there was a significant overlap with the FOAF vocabulary. One of RDF's strengths is the ability to easily aggregate different vocabularies, so I replaced many of my properties with appropriate FOAF properties.

In fact, I might propose to extend FOAF to cover more of this use case since it seems so closely related. Instead of just asserting my extensions, I've compromised and made some of my classes and properties subclasses and subproperties of the FOAF terms.

Classes or properties? 

Phone numbers, email addresses, postal addresses, and even, to some extent, instant messaging addresses have “labels” associated with them. That is, a “work” phone number is distinct from a “home” phone number, etc.

This distinction is significant and has to be preserved in the model. Let's consider phone numbers as a concrete example. Three possibilities occur to me.

  1. Model the label directly: make a phone number a class of resource that has two properties, a label and a phone number.

  2. Use classes: make a phone number a class of resources with subclasses for a work phone number, a home phone number, etc.

  3. Use properties: make a phone number property with subproperties for work phone number, home phone number, etc.

After some thought and some discussion on the #swig channel on irc.freenode.net, I don't think there's a compelling argument in favor of any one solution, except that the first seems less appealing than either of the others. The label isn't open-ended free text; it's a string that identifies the kind of phone number, and both the class and property solutions seem to do that more directly.

My personal inclination is to use classes, but I see that FOAF has already opted for the property approach (homepage, workplaceHomepage, etc.) in several places, so I decided to go that way too.
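
To make the trade-off concrete, here is a rough sketch of the same work number modeled all three ways, with triples written as plain Python tuples. Apart from c:workPhone, the c: names used here (c:phone, c:label, c:number, c:WorkPhone) are hypothetical and exist only for this illustration.

```python
# Triples are plain (subject, predicate, object) tuples. Apart from
# c:workPhone, every c: name below is hypothetical.
PERSON = "http://norman.walsh.name/knows/who#norman-walsh"
NUMBER = "tel:+1-413-303-1382"

# Option 1: an intermediate phone resource carrying a label string.
by_label = {
    (PERSON, "c:phone", "_:p1"),
    ("_:p1", "c:label", "Work"),
    ("_:p1", "c:number", NUMBER),
}

# Option 2: an intermediate phone resource typed with a subclass.
by_class = {
    (PERSON, "c:phone", "_:p1"),
    ("_:p1", "rdf:type", "c:WorkPhone"),
    ("_:p1", "c:number", NUMBER),
}

# Option 3: a subproperty that points straight at the number.
by_property = {
    (PERSON, "c:workPhone", NUMBER),
}


def objects(triples, subject, predicate):
    """Return every object of (subject, predicate, ?) in the triple set."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def work_phones_by_label(triples, person):
    found = set()
    for phone in objects(triples, person, "c:phone"):
        if "Work" in objects(triples, phone, "c:label"):
            found |= objects(triples, phone, "c:number")
    return found


def work_phones_by_class(triples, person):
    found = set()
    for phone in objects(triples, person, "c:phone"):
        if "c:WorkPhone" in objects(triples, phone, "rdf:type"):
            found |= objects(triples, phone, "c:number")
    return found


def work_phones_by_property(triples, person):
    return objects(triples, person, "c:workPhone")
```

All three lookups find the same number. The property form needs a single hop and no intermediate resource, which is one practical point in its favor.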

Lists or not? 

Another decision that has to be made is whether or not to model the various repeatable fields as lists. Certainly they're ordered in the XML and they appear ordered on the Sidekick display, but lists in RDF more-or-less suck, so I opted not to model them that way. It'll put a little more burden on any software I eventually write to synchronize from the RDF, but that seems better than dealing with the list problems everywhere. And really, the list nature of the properties isn't intrinsically important. If I want to call my friend's work phone number, I don't care if it's listed first or second, do I?

The “final” design 

Taking into account the choices above, and considering that I'm aiming to take advantage of FOAF as much as possible, let's consider how an entry in my address book gets translated to RDF. Here's an entry:

  <contact id="_950">
    <last_modified>2005-11-24T14:10:51Z</last_modified>
    <category>Family</category>
    <firstname>Norman</firstname>
    <middlename>David</middlename>
    <surname>Walsh</surname>
    <company>Sun Microsystems, Inc.</company>
    <title>XML Standards Architect</title>
    <birthday>1967-06-16</birthday>
    <uris>
      <uri label="ID">#norman-walsh</uri>
      <uri label="Blog">http://norman.walsh.name/</uri>
      <uri label="Home">http://nwalsh.com/</uri>
    </uris>
    <emails>
      <email label="Work">Norman.Walsh@Sun.COM</email>
      <email label="Home">ndw@nwalsh.com</email>
    </emails>
    <phones>
      <phone label="Work">+1-413-303-1382</phone>
      <phone label="Work">+1-413-256-xxxx</phone>
      <phone label="Home">+1-413-256-xxxx</phone>
      <phone label="Mobile">+1-413-949-xxxx</phone>
    </phones>
    <addresses>
      <address label="Home">
        <street>XX Xxxx Street</street>
        <city>Belchertown</city>
        <state>MA</state>
        <postcode>01007</postcode>
      </address>
      <address label="Work">
        <street>1 Network Drive, Building #2
  MS UBUR02-201</street>
        <city>Burlington</city>
        <state>MA</state>
        <postcode>01803</postcode>
      </address>
    </addresses>
    <notes>rdf:
  a g:Male
  geo:lat 42.3382
  geo:long -72.45
  foaf:page http://norman.walsh.name/foaf

  AccessLine is x53142</notes>
    <rdf:type rdf:resource="http://nwalsh.com/rdf/genealogy#Male"/>
    <geo:lat>42.3382</geo:lat>
    <geo:long>-72.45</geo:long>
    <foaf:page rdf:resource="http://norman.walsh.name/foaf"/>
  </contact>

And here's the resulting RDF.

  <rdf:Description rdf:about="http://norman.walsh.name/knows/who#norman-walsh">
    <rdf:type rdf:resource="http://nwalsh.com/rdf/contacts#Contact"/>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>

The “ID” URI is used to construct the URI for the resource. All contacts are members of the Contact class and contacts that have a first or last name are foaf:Persons. Contacts that have only company names are foaf:Organizations. I have a third class, c:Place, for geographic locations, but that's probably unique to my metadata collection.
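
The rules in that paragraph can be sketched roughly as follows. This is an illustration rather than the actual conversion code: the function names are made up, and the base URI is simply the part of the rdf:about value above that precedes the “ID” fragment. How c:Place gets assigned isn't described, so it is left out.

```python
import xml.etree.ElementTree as ET

# Base taken from the rdf:about value shown above (an assumption about
# how the real converter is configured).
BASE = "http://norman.walsh.name/knows/who"
CONTACT = "http://nwalsh.com/rdf/contacts#Contact"
PERSON = "http://xmlns.com/foaf/0.1/Person"
ORGANIZATION = "http://xmlns.com/foaf/0.1/Organization"


def resource_uri(contact):
    """Build the resource URI from the uri element labelled "ID"."""
    for uri in contact.iter("uri"):
        if uri.get("label") == "ID":
            return BASE + (uri.text or "").strip()
    return None  # no ID present; the caller has to decide what to do


def rdf_types(contact):
    """Every contact is a Contact; names decide Person vs. Organization."""
    types = [CONTACT]
    has_name = any(contact.findtext(tag) for tag in ("firstname", "surname"))
    if has_name:
        types.append(PERSON)
    elif contact.findtext("company"):
        types.append(ORGANIZATION)
    return types
```

For the entry above, `resource_uri` yields the rdf:about value shown in the RDF, and `rdf_types` yields the Contact and Person classes.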

    <c:lastModified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-11-24T14:10:51Z</c:lastModified>
    <c:category>Family</c:category>

The last modified date and the category are directly related to the contact data in the address book.

    <foaf:firstName>Norman</foaf:firstName>
    <c:middleName>David</c:middleName>
    <foaf:surname>Walsh</foaf:surname>
    <foaf:name>Norman David Walsh</foaf:name>

First and last names are available in FOAF. At the moment middle names aren't, so I've created a middleName property. Generating the full foaf:name is straightforward, so I do that as well.

    <c:associatedName>Sun Microsystems, Inc.</c:associatedName>
    <c:associatedTitle>XML Standards Architect</c:associatedTitle>

These properties associate a company name and title with a contact. (There's room for some additional modeling complexity in titles, as person, company, and title form a three-part relationship, but I've never seen an electronic address book that tried to handle that situation, so let's not worry about it.)

    <c:dateOfBirth rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1967-06-16</c:dateOfBirth>
    <foaf:birthday>06-16</foaf:birthday>

The FOAF birthday property, unfortunately, doesn't support full dates, so I've had to invent a property of my own, c:dateOfBirth, for the full date. But I can generate the FOAF version as well.
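
Deriving the month-and-day form from the full date is a small step. A minimal sketch, assuming the date arrives as an ISO 8601 string like the one in the address book entry (the function name is hypothetical):

```python
from datetime import date


def foaf_birthday(date_of_birth):
    """Return the month-day form ("MM-DD") of an ISO 8601 date string."""
    parsed = date.fromisoformat(date_of_birth)  # ValueError on bad input
    return f"{parsed.month:02d}-{parsed.day:02d}"
```

Parsing the string, rather than slicing off the first five characters, means a malformed date fails loudly instead of producing a bogus birthday.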

    <foaf:weblog rdf:resource="http://norman.walsh.name/"/>
    <foaf:homepage rdf:resource="http://nwalsh.com/"/>

The URIs convert naturally to FOAF properties. Some of my contacts have other sorts of URIs (entries in the Getty Thesaurus of Geographic Names®, the CIA World Factbook, etc.) for which I've invented additional properties.

    <c:workMbox rdf:resource="mailto:Norman.Walsh@Sun.COM"/>
    <foaf:mbox_sha1sum>9f5c771a25733700b2f96af4f8e6f35c9b0ad327</foaf:mbox_sha1sum>
    <c:personalMbox rdf:resource="mailto:ndw@nwalsh.com"/>
    <foaf:mbox_sha1sum>5ddcd862514c327945dca20446e11cb54ceec68b</foaf:mbox_sha1sum>

I've invented subproperties of foaf:mbox for various kinds of email addresses.
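
The foaf:mbox_sha1sum values that accompany each address are SHA-1 digests of the mailbox URI, including the mailto: prefix. A minimal sketch of that calculation follows; the function name is made up, and whether the real converter normalizes case before hashing isn't stated, so no normalization is done here.

```python
import hashlib


def mbox_sha1sum(address):
    """Hex SHA-1 digest of a mailbox URI, as used by foaf:mbox_sha1sum."""
    # Accept either a bare address or a full mailto: URI.
    mailto = address if address.startswith("mailto:") else "mailto:" + address
    return hashlib.sha1(mailto.encode("utf-8")).hexdigest()
```

Because the digest is computed over the exact string, differently capitalized forms of the same address hash to different values.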

    <c:workPhone rdf:resource="tel:+1-413-303-1382"/>
    <c:workPhone rdf:resource="tel:+1-413-256-xxxx"/>
    <c:homePhone rdf:resource="tel:+1-413-256-xxxx"/>
    <c:mobilePhone rdf:resource="tel:+1-413-949-xxxx"/>

Similarly, I've invented subproperties of foaf:phone for various kinds of phone numbers.
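
The label-to-property translation for email addresses and phone numbers amounts to a table lookup. The sketch below is one possible shape for it, not the actual converter: the tables list only the labels that appear in the example entry, and falling back to the plain FOAF property for an unrecognized label is an assumption.

```python
# Label-to-property tables, limited to the labels in the example entry.
PHONE_PROPERTIES = {
    "Work": "c:workPhone",
    "Home": "c:homePhone",
    "Mobile": "c:mobilePhone",
}
MBOX_PROPERTIES = {
    "Work": "c:workMbox",
    "Home": "c:personalMbox",
}


def labelled_triples(subject, entries, properties, fallback, scheme):
    """Map (label, value) pairs to (subject, property, URI) triples."""
    triples = []
    for label, value in entries:
        # Assumption: unknown labels fall back to the general property.
        prop = properties.get(label, fallback)
        triples.append((subject, prop, scheme + value))
    return triples


def phone_triples(subject, entries):
    return labelled_triples(subject, entries, PHONE_PROPERTIES, "foaf:phone", "tel:")


def mbox_triples(subject, entries):
    return labelled_triples(subject, entries, MBOX_PROPERTIES, "foaf:mbox", "mailto:")
```

Since every specific property is a subproperty of the general one, a consumer that understands only foaf:phone or foaf:mbox can still find all the values once subproperty inference has been applied.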

    <c:homeAddress rdf:parseType="Resource">
      <rdf:type rdf:resource="http://nwalsh.com/rdf/contacts#Address"/>
      <c:street>XX Xxxx Street</c:street>
      <c:city>Belchertown</c:city>
      <c:stateOrProvince>MA</c:stateOrProvince>
      <c:postcode>01007</c:postcode>
    </c:homeAddress>
    <c:workAddress rdf:parseType="Resource">
      <rdf:type rdf:resource="http://nwalsh.com/rdf/contacts#Address"/>
      <c:street>1 Network Drive, Building #2
  MS UBUR02-201</c:street>
      <c:city>Burlington</c:city>
      <c:stateOrProvince>MA</c:stateOrProvince>
      <c:postcode>01803</c:postcode>
    </c:workAddress>

Continuing to follow that pattern, I invented a class for postal addresses, an address property, and subproperties of it for various kinds of addresses.

    <c:notes>rdf:
  a g:Male
  geo:lat 42.3382
  geo:long -72.45
  foaf:page http://norman.walsh.name/foaf

  AccessLine is x53142</c:notes>

Finally, the notes property holds notes about the contact. I parse pseudo-N3 from the notes field to add additional properties to the record.
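
The exact grammar of the pseudo-N3 isn't spelled out, but the example suggests something like the sketch below. The assumptions are all read off that one example: the block opens with a line reading “rdf:”, each following line is a predicate and a value separated by a space, “a” abbreviates rdf:type, a blank line ends the block, and values are treated as resources when they are types or http(s) URIs and as literals otherwise.

```python
def parse_notes(notes):
    """Split a notes field into (statements, remaining free text).

    Each statement is a (predicate, value, kind) tuple where kind is
    "resource" or "literal". The rules are guesses based on one example.
    """
    lines = notes.splitlines()
    if not lines or lines[0].strip() != "rdf:":
        return [], notes  # no embedded statements

    statements = []
    index = 1
    # Read statements until the first blank line (or the end of the field).
    while index < len(lines) and lines[index].strip():
        predicate, _, value = lines[index].strip().partition(" ")
        value = value.strip()
        if predicate == "a":
            predicate = "rdf:type"
        is_resource = predicate == "rdf:type" or value.startswith(("http://", "https://"))
        statements.append((predicate, value, "resource" if is_resource else "literal"))
        index += 1

    rest = "\n".join(lines[index:]).strip()
    return statements, rest
```

Applied to the notes above, this produces four statements (the type, the two coordinates, and the foaf:page link) and leaves the AccessLine remark as ordinary note text.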

    <rdf:type rdf:resource="http://nwalsh.com/rdf/genealogy#Male"/>
    <geo:lat>42.3382</geo:lat>
    <geo:long>-72.45</geo:long>
    <foaf:page rdf:resource="http://norman.walsh.name/foaf"/>
  </rdf:Description>

The collected RDF for all my contacts is then augmented by additional inference rules to build a final, combined model for my “personal information manager”. One example of a rule is this one:

  { ?c a foaf:Organization .
    ?c c:associatedName ?t .
    ?p a foaf:Person .
    ?p c:associatedName ?t } => { ?p c:associatedWith ?c } .

This rule says that if there's an organization (for example, an entry in my address book with only a company name) and a person with the same association, then that person is associated with that organization. So, for example, when I format the address book entry for that company, I get pointers to all the people I know who work for that company. I also have a vocabulary for relationships inside Sun (employee numbers, department numbers, reporting structures, etc.) that I can “scrape” from the internal name finder. Rules associated with terms in that vocabulary allow me to generate appropriate cross-references between employees, departments, etc.
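
For readers who don't have an N3 rules engine at hand, the effect of that rule can be imitated with a few lines of Python over a set of triples. This is only a hand-rolled illustration of what the rule concludes, not how a rules engine evaluates it; the function name and the tuple representation are made up for the example.

```python
def infer_associations(triples):
    """Apply the associatedWith rule once and return the new triples."""

    def of_type(cls):
        return {s for s, p, o in triples if p == "rdf:type" and o == cls}

    organizations = of_type("foaf:Organization")
    people = of_type("foaf:Person")

    # Index organizations by their associated name.
    orgs_by_name = {}
    for s, p, o in triples:
        if p == "c:associatedName" and s in organizations:
            orgs_by_name.setdefault(o, set()).add(s)

    # Every person sharing a name with an organization gets linked to it.
    inferred = set()
    for s, p, o in triples:
        if p == "c:associatedName" and s in people:
            for org in orgs_by_name.get(o, ()):
                inferred.add((s, "c:associatedWith", org))
    return inferred
```

Note that the match is on the literal company name, so two entries are linked only when the name is spelled identically in both.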

The Ontology 

The resulting ontology is:

  # -*- N3 -*-

  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  @prefix xs: <http://www.w3.org/2001/XMLSchema#> .
  @prefix c: <http://nwalsh.com/rdf/contacts#> .
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

  <http://nwalsh.com/rdf/contacts> a owl:Ontology;
      rdfs:comment "Norm's ontology for his address book." .

  # ------------------------------------------------------------

  # A contact in an address book
  c:Contact a owl:Class;
      rdfs:subClassOf
          [
               a owl:Restriction;
               owl:cardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:lastModified ] .

  # Timestamp of address book entry
  c:lastModified a owl:DatatypeProperty;
      rdfs:domain c:Contact;
      rdfs:range xs:dateTime .

  # Category in address book
  c:category a owl:DatatypeProperty;
      rdfs:domain c:Contact .

  # A middle name (other name properties come from FOAF)
  c:middleName a owl:DatatypeProperty .

  # Company and title
  c:associatedName a owl:DatatypeProperty .
  c:associatedTitle a owl:DatatypeProperty .

  # Birthday (a full date, matching the xs:date values in the instance data)
  c:dateOfBirth a owl:DatatypeProperty;
      rdfs:range xs:date .

  # Email addresses
  c:personalMbox a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:mbox .

  c:workMbox a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:mbox .

  c:pagerMbox a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:mbox .

  c:obsoleteMbox a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:mbox .

  # Phone numbers
  c:dataPhone a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  c:fax a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  c:homePhone a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  c:workPhone a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  c:mobilePhone a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  c:pagerPhone a owl:ObjectProperty;
       rdfs:subPropertyOf foaf:phone .

  # Notes
  c:notes a owl:DatatypeProperty .

  # Postal address
  c:Address a owl:Class;
      rdfs:subClassOf
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:street ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:city ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:stateOrProvince ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:postcode ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:country ] .

  # Addresses
  c:address a owl:ObjectProperty;
       rdfs:range c:Address .

  c:workAddress a owl:ObjectProperty;
       rdfs:subPropertyOf c:address .

  c:homeAddress a owl:ObjectProperty;
       rdfs:subPropertyOf c:address .

  # Fields of an address
  c:street a owl:DatatypeProperty;
     rdfs:domain c:Address;
     rdfs:range xs:string .

  c:city a owl:DatatypeProperty;
     rdfs:domain c:Address;
     rdfs:range xs:string .

  c:stateOrProvince a owl:DatatypeProperty;
     rdfs:domain c:Address;
     rdfs:range xs:string .

  c:postcode a owl:DatatypeProperty;
     rdfs:domain c:Address;
     rdfs:range xs:string .

  c:country a owl:DatatypeProperty;
     rdfs:domain c:Address;
     rdfs:range xs:string .

I have a few additional constraints that I think are limited to my particular address book:

  # -*- N3 -*-

  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  @prefix xs: <http://www.w3.org/2001/XMLSchema#> .
  @prefix c: <http://nwalsh.com/rdf/contacts#> .
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

  c:Contact
      rdfs:subClassOf
          [
               a owl:Restriction;
               owl:cardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:category ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty foaf:firstName ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty foaf:surname ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:middleName ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:associatedName ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:associatedTitle ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty c:dateOfBirth ],
          [
               a owl:Restriction;
               owl:maxCardinality "1"^^xs:nonNegativeInteger;
               owl:onProperty foaf:birthday ] .

  foaf:Organization
      rdfs:subClassOf
          [
               a owl:Restriction;
               owl:cardinality "0"^^xs:nonNegativeInteger;
               owl:onProperty foaf:firstName ],
          [
               a owl:Restriction;
               owl:cardinality "0"^^xs:nonNegativeInteger;
               owl:onProperty foaf:surname ],
          [
               a owl:Restriction;
               owl:cardinality "0"^^xs:nonNegativeInteger;
               owl:onProperty foaf:name ],
          [
               a owl:Restriction;
               owl:cardinality "0"^^xs:nonNegativeInteger;
               owl:onProperty foaf:nick ] .

  c:Place a owl:Class;
    rdfs:subClassOf foaf:Agent .

  c:gettyTGN a owl:ObjectProperty;
     rdfs:subPropertyOf foaf:page .

  c:ciaFactbook a owl:ObjectProperty;
     rdfs:subPropertyOf foaf:page .

  c:weather a owl:ObjectProperty;
     rdfs:subPropertyOf foaf:page .

  c:associatedWith a owl:ObjectProperty;
      rdfs:domain foaf:Person;
      rdfs:range foaf:Organization .

  c:hasAssociated a owl:ObjectProperty;
      rdfs:domain foaf:Organization;
      rdfs:range foaf:Person .
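
To give a feel for what the cardinality restrictions on c:Contact ask for, here is a deliberately naive check written as a plain counting exercise. It is a closed-world approximation and not what an OWL reasoner such as Pellet does: under OWL's open-world semantics a missing value is not by itself an inconsistency. The function name and the tuple representation are hypothetical; the property lists are copied from the restrictions above.

```python
from collections import Counter

# Properties restricted on c:Contact in the two ontology listings.
EXACTLY_ONE = {"c:lastModified", "c:category"}
AT_MOST_ONE = {
    "foaf:firstName", "foaf:surname", "c:middleName",
    "c:associatedName", "c:associatedTitle",
    "c:dateOfBirth", "foaf:birthday",
}


def cardinality_problems(subject, triples):
    """List closed-world cardinality problems for one contact."""
    # set() removes exact duplicate statements before counting.
    counts = Counter(p for s, p, o in set(triples) if s == subject)
    problems = []
    for prop in sorted(EXACTLY_ONE):
        if counts[prop] != 1:
            problems.append(f"{prop}: expected exactly 1, found {counts[prop]}")
    for prop in sorted(AT_MOST_ONE):
        if counts[prop] > 1:
            problems.append(f"{prop}: expected at most 1, found {counts[prop]}")
    return problems
```

A check like this is handy as a quick sanity test on converter output before handing the data to a real reasoner.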

Open Questions 

Should some of this be incorporated into FOAF? Should I have tried to use the vCard schema instead? And of course, which bits could be modeled better?