First RDF Schema for a Semantic Web enabled Drupal

As a semantic web researcher and developer, my goal is to bring semantic web technologies to
lay people. The main problem is the classic chicken-and-egg dilemma:
semantic web technologies need semantic data to become truly
useful and powerful, but nobody wants to produce such data until they can see how powerful the semantic web is.

There is an immense amount of data available on the internet, spread over millions of
HTML pages, PDF documents, et cetera. These formats were designed to
make documents understandable for people, not for machines. This is where RDF
comes in, as a language to describe data and the relationships within that data. From a web of documents we evolve to a web of pieces of
data, i.e. concepts, items, ideas, events, people, you name it. Each of
them can be identified by its own Uniform Resource Identifier (URI), and the web becomes a global database.
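
As a rough illustration of this idea, an RDF statement can be thought of as a (subject, predicate, object) triple in which things are named by URIs. The sketch below uses plain Python tuples rather than an RDF library; the example.org URIs and the people in them are made up for illustration, while the FOAF namespace is the one used later in this post.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Each statement is a (subject, predicate, object) triple.
# The example.org URIs are hypothetical identifiers for two people.
triples = [
    ("http://example.org/people/alice", FOAF + "name", "Alice"),
    ("http://example.org/people/alice", FOAF + "knows", "http://example.org/people/bob"),
    ("http://example.org/people/bob", FOAF + "name", "Bob"),
]

def describe(uri, triples):
    """Return every (predicate, object) pair stated about the given URI."""
    return [(p, o) for s, p, o in triples if s == uri]

alice = describe("http://example.org/people/alice", triples)
for predicate, obj in alice:
    print(predicate, obj)
```

Because Bob is referred to by his URI rather than by a copy of his data, any other document on the web can add statements about the same identifier, which is what makes the "global database" picture possible.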

To meet this challenge, data producers need to adapt their
applications so that they supply RDF data in a standard fashion which
will be understood on a global scale; it is the role of the W3C
to coordinate this effort. Drupal and other CMSs are
important data producers and as a result could set the standard for weaving the Web.

I'm glad that Dries, in his keynote at Drupalcon Boston 2008, understood the potential behind the semantic web and decided to push this effort into Drupal. I'm thrilled to see this happening in Drupal 6 and in Drupal 7 core. For those interested, here is the semantic web part of Dries' keynote, taken from the original audio recording. The full video of the keynote is also online.

Semantic Web in Drupal Video
An interesting mashup demo video was shown during the keynote. It demonstrates what the semantic web is capable of and is a good example of how it can be used in Drupal. When I first saw the video, I felt the explanation of the SPARQL query was confusing. Some lines are highlighted with the following explanation:


It can even do this across multiple datagraphs at the same time, which
means that creating complex queries across disconnected data sources
can be very simple.

While it's true that SPARQL can query multiple sources, the query presented in the video is not the right example. The highlighted lines are just namespace prefix definitions, commonly used in SPARQL to make queries more concise and borrowed from the Turtle format. A query across multiple data sources would look more like the following (note the FROM NAMED and GRAPH lines):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?who ?g ?mbox
FROM <http://site1.org/dft.ttl>
FROM NAMED <http://site2.org/alice>
FROM NAMED <http://site3.org/bob>
WHERE
{
  ?g dc:publisher ?who .
  GRAPH ?g { ?x foaf:mbox ?mbox }
}
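
To make the roles of the FROM and FROM NAMED clauses more concrete, here is a minimal Python sketch that mimics how the query above would be evaluated. It uses plain tuples and a dictionary rather than a real SPARQL engine, and the triples, publisher names, and mailbox addresses are invented for illustration; only the graph URIs and the two property URIs come from the query.

```python
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"
DC_PUBLISHER = "http://purl.org/dc/elements/1.1/publisher"

# Default graph (the FROM clause): records who published each named graph.
# The publisher names are made-up sample data.
default_graph = [
    ("http://site2.org/alice", DC_PUBLISHER, "Alice"),
    ("http://site3.org/bob", DC_PUBLISHER, "Bob"),
]

# Named graphs (the FROM NAMED clauses): one set of triples per source.
# The mailbox addresses are made-up sample data.
named_graphs = {
    "http://site2.org/alice": [("_:a", FOAF_MBOX, "mailto:alice@site2.org")],
    "http://site3.org/bob": [("_:b", FOAF_MBOX, "mailto:bob@site3.org")],
}

def run_query(default_graph, named_graphs):
    """Mimic: ?g dc:publisher ?who . GRAPH ?g { ?x foaf:mbox ?mbox }"""
    results = []
    # The first pattern is matched against the default graph.
    for g, predicate, who in default_graph:
        if predicate != DC_PUBLISHER:
            continue
        # GRAPH ?g matches the inner pattern only inside the named graph
        # whose URI is bound to ?g.
        for _x, inner_predicate, mbox in named_graphs.get(g, []):
            if inner_predicate == FOAF_MBOX:
                results.append((who, g, mbox))
    return results

rows = run_query(default_graph, named_graphs)
for who, g, mbox in rows:
    print(who, g, mbox)
```

The point of the sketch is that ?g links the two sources: it is bound from data in the default graph and then selects which named graph the mailbox pattern is matched against.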

RDF Schema for Drupal - Why bother?

As I emphasized at the beginning of this post, the basic principle to remember is data portability. It's important to make sure that the way you express your data is machine readable and understandable. Most platforms need to describe their users and the relations between them, as well as the data these users post and how those pieces of data link to each other. Describing what a piece of data is about with taxonomy terms, and how these terms are related, is useful to many systems. These cases are not specific to Drupal, and standards have been put in place in coordination with the W3C.

I proposed a first RDF schema for Drupal on groups.drupal.org, and it's great to see that some knowledgeable people from the Linking Open Data project, such as Frédérick Giasson, have already started to give their feedback! Let's see what we can build on it.


Ok, Ok, I botched it.

That's my voice in the video, my script, and mea culpa. I've had my wrist slapped about the inaccuracy, but you're the first to blog about it that I've seen. Good eye.

The truth is that the SPARQL module could only do multi-sourced queries about 4 hours before we got the video to Dries, and there just wasn't enough time to dig up a use case, query it, and film it in a hurry. What you've posted here is exactly the kind of thing we wanted to do and didn't have time for. We wanted to stress the distributed nature of SPARQL, and it was being worked on until the last minute before we simply had to stop and edit. It just didn't come together in time.

I'm hoping to post some examples of the SPARQL module doing exactly the kind of thing you show here sooner rather than later, but I can't say exactly when that will happen. You're probably in one of the groups I'll post it to when that happens.

I saw your schema when you posted it last week; good job getting the discussion going. I hope the core folks are listening as much as the Semweb folks are watching.
