The Web Semantic

data for humans and computers and the tools that make it available

Using ARQ and regexp to refine data ranges

ARQ is able to update RDF graphs, and I’m just beginning to realize how powerful that can be. I’ve been playing with geonames and found that its ontology defines the population property with a domain but no range.

(Geonames.org has a fantastic set of data and public APIs, by the way. If anything points the way to a semantic web, it’s them.)

I was using their ontology to experiment with binding, and it shows one of the problems with doing so: vague data ranges (and cardinalities, but that’s another discussion). We know the property and its domain, but not its range. As humans we can infer the type by looking at the data, but going only by the RDF as given, we know nothing about the range. If this is common in public data sources, we need automated ways to clean things up. Here is the property as geonames defines it:

<owl:DatatypeProperty rdf:about="#population">
  <rdfs:domain rdf:resource="#Feature"/>
  <rdfs:label xml:lang="en">population</rdfs:label>
</owl:DatatypeProperty>

What I’ve attempted is to use ARQ to query this data, find all population values that could be typed as ^^xsd:integer, and then assert a new property that extends the geonames property, one that does have a range:

 

<!-- NOTE: we’re in a different namespace of course -->
<owl:DatatypeProperty rdf:ID="population">
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#int"/>
  <rdfs:subPropertyOf rdf:resource="http://www.geonames.org/ontology#population"/>
</owl:DatatypeProperty>

So we first need to find all places whose population value consists of digits, i.e. matches [0-9]+. Since some values might be spelled out in words (“one thousand”), the all-digits case is the easiest one to re-assert with our new property:

WHERE { ?place gn:population ?population .
        FILTER regex(str(?population), "^[0-9]+$") }
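One detail worth checking with a pattern like this: SPARQL’s regex succeeds when the pattern matches anywhere in the value, so a bare [0-9]+ also accepts strings that merely contain digits. Here is a minimal sketch of that difference in plain Java (no Jena involved), using Matcher.find() as a stand-in for the SPARQL behaviour. The class name, method names, and sample values are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

// Illustration only: compares an unanchored and an anchored digit pattern.
final class RegexAnchorDemo {
    // Unanchored: find() succeeds if any substring is a run of digits.
    static final Pattern CONTAINS_DIGITS = Pattern.compile("[0-9]+");
    // Anchored with ^ and $ so the run of digits has to span the value.
    static final Pattern ALL_DIGITS = Pattern.compile("^[0-9]+$");

    static boolean containsDigits(String value) {
        return CONTAINS_DIGITS.matcher(value).find();
    }

    static boolean isAllDigits(String value) {
        return ALL_DIGITS.matcher(value).find();
    }

    public static void main(String[] args) {
        // Hypothetical population values, not taken from the geonames data.
        for (String value : List.of("563374", "about 1000", "one thousand")) {
            System.out.printf("%-14s contains=%b all=%b%n",
                    value, containsDigits(value), isAllDigits(value));
        }
    }
}
```

A value such as "about 1000" passes the unanchored check but not the anchored one, which is the case that matters if the matched values are later bound to an integer type.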

The filter selects all statements whose object matches the integer mask and whose predicate is the base “population” property. To assert new information I used the CONSTRUCT clause. I really wasn’t sure what to do there; there seem to be many ways to do it in ARQ, and I had trouble with INSERT, so I just went with the first method that compiled and ran:

CONSTRUCT
{ ?place ext:population ?population }

This constructs a new assertion with the extended population property for each statement whose value we know is a valid integer. Now we are back to a more structured representation of population that can be bound to an integer type in some other programming language. I would rather assert the statements back into the original model; however, I’m still not clear on how to use named graphs.
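As a rough sketch of what binding to an integer type could look like on the Java side, the helper below converts a literal’s lexical form to an int. It is plain Java with no Jena calls, and the class and method names are hypothetical. It also guards a case the digit pattern alone does not cover: a string of digits can still be too large for a 32-bit int, which is the range the extended property declares via XMLSchema#int.

```java
import java.util.OptionalInt;

// Illustration only: binds a population literal's lexical form to a Java int.
final class PopulationBinding {
    // Hypothetical helper: returns the value as an int, or empty if it cannot be bound.
    static OptionalInt toInt(String lexicalForm) {
        // Same idea as the SPARQL filter: accept only values made up entirely of digits.
        if (lexicalForm == null || !lexicalForm.matches("[0-9]+")) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(lexicalForm));
        } catch (NumberFormatException tooLarge) {
            // All digits, but larger than Integer.MAX_VALUE, so it does not fit a 32-bit int.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from capitals.rdf.
        System.out.println(toInt("563374"));
        System.out.println(toInt("99999999999"));
        System.out.println(toInt("one thousand"));
    }
}
```

Returning an OptionalInt rather than throwing keeps the "could not bind" case explicit for the caller, which seems to fit data that is known to be inconsistent.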

Original ontology:

http://jenabean.googlecode.com/svn/trunk/src/example/geonames.owl

Extended ontology:
http://jenabean.googlecode.com/svn/trunk/src/example/geonamesext.owl

Java source file with the complete SPARQL query:
http://jenabean.googlecode.com/svn/trunk/src/example/ConstructQuery.java

Data used for the example:
http://jenabean.googlecode.com/svn/trunk/src/example/capitals.rdf

Taylor
