Background research work leading to RDF in Drupal 7 released as part of my Master's thesis

Oct 22nd, 2010 NUIG graduation leaflet (Photo credit: Anna Dabrowska)

Today it's about time to make my M.Sc. thesis available online. The review process took a long while, but I finally graduated from DERI, National University of Ireland, Galway last October. That's more than a year after I submitted my thesis and actually left Galway to move to Boston.

The Open Graph protocol and Drupal

Administration interface of the Open Graph protocol Drupal module

A couple of days after Dries Buytaert gave his keynote at DrupalCon San Francisco and reaffirmed his support for the Semantic Web in Drupal, Facebook co-founder Mark Zuckerberg announced at the f8 developer conference the brand new Open Graph protocol, a technology to turn webpages into social objects and capture them in a social graph. The announcement was backed up by a lot of PR and according to Facebook, 50,000 websites have already implemented OGP including IMDb, NHL, Posterous, Pandora, Rotten Tomatoes, Yelp and more. There is plenty to read about the marketing around this announcement, but I'm going to keep this post at a technical level only.
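
To make the idea of a "social object" concrete, here is a minimal sketch of how a consumer might read Open Graph metadata from a page. It assumes the metadata is published as `<meta property="og:..." content="...">` tags in the page head, and it uses only Python's standard library. The sample page, its URLs and the helper names (`OpenGraphParser`, `extract_open_graph`) are made up for illustration and are not taken from the Drupal module.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            self.properties[name] = content


# Hypothetical page: every value below is invented for this example.
SAMPLE_PAGE = """
<html prefix="og: http://ogp.me/ns#">
  <head>
    <title>Example film</title>
    <meta property="og:title" content="Example film" />
    <meta property="og:type" content="article" />
    <meta property="og:url" content="http://example.org/films/1" />
    <meta property="og:image" content="http://example.org/films/1.jpg" />
  </head>
</html>
"""


def extract_open_graph(html):
    """Return a dict of og:* properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

A real consumer would fetch the page over HTTP first and would also have to handle repeated properties, which this sketch simply overwrites.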

RDFa in Drupal 7: last call for feedback before alpha release

The first alpha release of Drupal 7 will be created next Friday, Jan 15th. We've already incorporated most of the feedback we received from the semweb community so far, but I wanted to give the community a last chance to review the RDFa markup and the default RDF mappings we use before it's too late. I should emphasize that all the markup and default RDF mappings that we ship in core will be pretty much set in stone after the stable release of Drupal 7, hence this call for feedback. Site administrators who care about semantics will be able to alter these mappings by installing extra modules, but many people (read: tens of thousands of sites) will just install Drupal 7 and not care about the semantics it generates. Therefore we want to make sure the RDFa generated by Drupal out of the box is reasonably correct and does not make folks from the semantic/pedantic web community angry :) We've tried to keep the semantics as generic as possible for that reason.

RDF mappings

I've created a diagram representing the default semantics of the core data structure, which has been committed, and I would appreciate feedback on the RDF terms we've used.

Drupal 7 core RDF schema
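
For readers who have not seen a field-to-term mapping before, the sketch below shows the general idea: each field of a piece of content is associated with one or more RDF terms, and the mapping is then used to emit triples (here as N-Triples lines). The terms in `EXAMPLE_MAPPING`, the node values and the function names are assumptions chosen for illustration; they are not necessarily the mappings shown in the diagram or shipped in core.

```python
# Namespace prefixes used by the illustrative mapping below.
NAMESPACES = {
    "dc": "http://purl.org/dc/terms/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# Hypothetical mapping: field name -> list of RDF terms (as prefix:local).
# "rdftype" holds the classes of the node itself rather than a field.
EXAMPLE_MAPPING = {
    "rdftype": ["sioc:Item"],
    "title": ["dc:title"],
    "created": ["dc:created"],
}


def expand(curie):
    """Expand a prefix:local name into a full URI using NAMESPACES."""
    prefix, local = curie.split(":", 1)
    return NAMESPACES[prefix] + local


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def node_to_ntriples(uri, node, mapping):
    """Return N-Triples lines describing `node` according to `mapping`."""
    lines = []
    for rdf_type in mapping.get("rdftype", []):
        lines.append(f"<{uri}> <{expand('rdf:type')}> <{expand(rdf_type)}> .")
    for field, predicates in mapping.items():
        if field == "rdftype" or field not in node:
            continue
        for predicate in predicates:
            literal = escape_literal(str(node[field]))
            lines.append(f'<{uri}> <{expand(predicate)}> "{literal}" .')
    return lines


if __name__ == "__main__":
    # Invented node data, for illustration only.
    node = {"title": "Hello RDF", "created": "2010-01-11T09:00:00Z"}
    for line in node_to_ntriples("http://example.org/node/1", node, EXAMPLE_MAPPING):
        print(line)
```

The sketch treats every field value as a plain literal; a fuller version would also handle datatypes and fields that point to other resources, such as the author.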

RDFa markup

To make the RDFa markup review process easier, I've updated the usual testing site at http://drupalrdf.openspring.net/. It features a blog post with some comments which represents a typical Drupal 7 page annotated with RDFa. Some other pages have been randomly generated to be able to test the tracker which acts as a very simple sitemap in RDFa.

Status of RDF in Drupal (November 09) and wrap up of ISWC2009

ISWC 2009 background - http://www.flickr.com/photos/kasei/4055714142/

I had the pleasure of presenting the paper "Produce and Consume Linked Data with Drupal!" at ISWC2009, and I was very honored that we won the Best Semantic Web in Use Paper award! The 30 minutes of presentation and Q&A passed very quickly, and after describing the inner workings of the modules we developed I didn't have much time to expand on the status of RDF in Drupal 7 vs. Drupal 6. I'm sure this will also interest people who did not attend.

First of all, the current stable version of Drupal is Drupal 6 (the latest release at the time of this writing being Drupal 6.14). This is the version on which we started to implement the contributed modules presented at ISWC2009, namely RDF CCK, RDF external vocabulary importer (Evoc), SPARQL Endpoint and RDF SPARQL Proxy. Contributed modules are not included in the core Drupal package, but people can download them from drupal.org for free and drop them on their server to extend Drupal core. These 4 modules work pretty well on Drupal 6: you can get RDF export in RDF/XML, N-Triples, Turtle and JSON. However, generating RDFa is not very easy, as it requires patching CCK, the module on which we rely to generate the content pages and store the various field data. We made sure this would not be a problem in the next version of Drupal (Drupal 7), which is still under development and due to be released sometime next year.

Produce and Consume Linked Data with Drupal!

Drupal in the Linked Data Cloud

"Produce and Consume Linked Data with Drupal!" is the title of the paper I will be presenting next week at the 8th International Semantic Web Conference (ISWC 2009) in Washington, DC. I wrote it at the end of my M.Sc. at DERI, in partnership with Harvard Medical School and Massachusetts General Hospital, where I am now working.

It presents an approach for using Drupal (or any other CMS) as a Linked Data producer and consumer platform. Some parts of this approach were used in the RDF API that Dries committed to Drupal core a few days ago. I have attached the full paper, and here is the abstract:

Currently a large number of Web sites are driven by Content Management Systems (CMS) which manage textual and multimedia content but also - inherently - carry valuable information about a site's structure and content model. Exposing this structured information to the Web of Data has so far required considerable expertise in RDF and OWL modelling and additional programming effort. In this paper we tackle one of the most popular CMS: Drupal. We enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge on Semantic Web technologies. Our modules create RDFa annotations and - optionally - a SPARQL endpoint for any Drupal site out of the box. Likewise, we add the means to map the site data to existing ontologies on the Web with a search interface to find commonly used ontology terms. We also allow a Drupal site administrator to include existing RDF data from remote SPARQL endpoints on the Web in the site. When brought together, these features allow networked RDF Drupal sites that reuse and enrich Linked Data. We finally discuss the adoption of our modules and report on a use case in the biomedical field and the current status of its deployment.

RDFa in Drupal: Bringing Cheese to the Web of Data

"RDFa in Drupal: Bringing Cheese to the Web of Data" is the title of our short paper which was recently accepted at the 5th Workshop on Scripting and Development for the Semantic Web. It seems that the topic of food on the semantic web is the new black as this paper comes out at the same time as Boris Mann's announcement about the Open Restaurants aka "BaconPatioBeer".

This paper illustrates how a CMS like Drupal can be used on the Semantic Web and make every Drupal site part of the growing Web of Data. We created a cheese review site as a use case. It relies on the RDF API and the RDF CCK modules.

The good news is that we are working to get this RDF goodness into Drupal core! We are organizing an RDF code sprint. This sprint builds on Dries' ideas expressed in his recent posts Drupal, the semantic web and search and RDFa and Drupal. With RDF in the core of Drupal and RDFa output by default, tens of thousands of websites will all of a sudden start publishing their data as RDF.

So far, Stéphane Corlosquet, Florian Loretan, Benjamin Melançon and Rolf Guescini have signed up. How about you?

Some others are willing to come but cannot afford the trip until some funding is secured. To help us fund the sprint and bring more Drupal rockstars on board, please consider making a donation using the ChipIn widget on this page. The money will be used to cover flight, food and hotel costs for the sprinters. All sprinters are generously donating their time to make this happen. It would also be great to fly in a few additional people with extensive testing and Fields experience. Any excess money will be used to add more people, or will be donated to the Drupal Association.

Report on my recent trip to the US: Harvard, DrupalCon...

During my 5-week stay in the US, I was based at Harvard's Initiative in Innovative Computing, where I worked on the Drupal-based Science Collaboration Framework (SCF) project with Tim Clark, Sudeshna Das and Benjamin Melançon. I had the chance to meet Tim last year when he visited DERI and presented the SCF project. Our goal was to align the efforts put into SCF with the efforts of the Drupal community in terms of RDF, to see what requirements are emerging from a project such as SCF, and to contribute them back to the Drupal community. Tim and Benjamin had arranged for me to present the latest RDF module developments at the semantic web interest group gatherings in Cambridge, Mass. and New York. Many more presentations popped up while I was there. They are detailed below.

New York for 2 days

Feb 26th: presentation at the meetup.com NYC semantic web user group organized by Marco Neumann. This was the description of the presentation:

Screencast on RDFa in Drupal - examples and use cases

This is the video which was presented during DrupalCon DC 2009 at the Practical Semantic Web and Why You Should Care session.


The Semantic Web strikes again

Exciting times for the Semantic Web in Drupal...

Harvard IIC and SCF

Today is my first day at Harvard's Initiative in Innovative Computing, where I'll work on the Drupal-based Science Collaboration Framework (SCF) project with Tim Clark, Sudeshna Das and Benjamin Melançon. I had the chance to meet Tim last year when he visited DERI and presented the SCF project. We'll work on aligning the efforts put into SCF with the efforts of the Drupal community in terms of RDF. We will see what requirements are emerging from a project such as SCF and contribute them back to the Drupal community.

Talking about the Semantic Web and Drupal next week at DrupalCon Szeged 2008

Following up on the interest of the Drupal Semantic Web group, I'll present my ideas on the Semantic Web, in an updated version of the talk I gave in Barcelona. I will also present a project which I co-started a few months ago: Neologism. The session is scheduled for Saturday 30th at 3pm.

Neologism is a lightweight web-based vocabulary editor and publishing tool built with Drupal. It makes vocabulary authoring easy and fun. Just create a vocabulary, add classes and properties to it, and your vocabulary is instantly published and available online! Several formats are supported via content negotiation: HTML, RDF/XML and N3. All the term URIs are dereferenceable and point to their human readable description.
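
As a rough illustration of what content negotiation means here, the sketch below picks one of the three formats from a request's `Accept` header. It is not Neologism's actual code: the media types in `SUPPORTED` (in particular `text/n3` for N3), the HTML fallback and the function names are assumptions made for this example.

```python
# Assumed media types for the three formats mentioned above.
SUPPORTED = {
    "text/html": "HTML",
    "application/rdf+xml": "RDF/XML",
    "text/n3": "N3",
}


def parse_accept(header):
    """Return (media_type, quality) pairs, highest quality first."""
    entries = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0  # Default weight when no q parameter is given.
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        entries.append((media_type, quality))
    # sorted() is stable, so equal weights keep their order in the header.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_format(accept_header, default="text/html"):
    """Pick the best supported media type, falling back to `default`."""
    for media_type, quality in parse_accept(accept_header):
        if quality <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return default


if __name__ == "__main__":
    for header in ("application/rdf+xml", "text/html;q=0.5, text/n3;q=0.9", "*/*"):
        chosen = choose_format(header)
        print(header, "->", SUPPORTED[chosen])
```

Falling back to HTML when nothing matches keeps term URIs readable in a browser; a stricter server could answer with 406 Not Acceptable instead.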

See the session details on the DrupalCon site.

Others are invited to tag along and present their Drupal semweb application as well!

There will also be some semantic web related BoF:

See you next week!

