The Open Graph protocol and Drupal

Administration interface of the open graph protocol Drupal module

A couple of days after Dries Buytaert gave his keynote at DrupalCon San Francisco and reaffirmed his support for the Semantic Web in Drupal, Facebook co-founder Mark Zuckerberg announced at the f8 developer conference the brand new Open Graph protocol (OGP), a technology to turn webpages into social objects and capture them in a social graph. The announcement was backed up by a lot of PR and, according to Facebook, 50,000 websites have already implemented OGP, including IMDb, NHL, Posterous, Pandora, Rotten Tomatoes, Yelp and more. There is plenty to read about the marketing around this announcement, but I'm going to keep this post at a technical level only.

The good news is that the Open Graph protocol is built atop existing Semantic Web standards like RDF and RDFa, the same standards which have been integrated into Drupal 7. Facebook joins Yahoo! SearchMonkey and Google Rich Snippets, which all now consume RDFa. Although it was designed and created by Facebook, OGP can be used by anyone; Facebook is simply the first to consume the data produced by sites carrying the right OGP markup. In fact, there is little information about Facebook on the main OGP documentation page: it even refers the reader to the Facebook documentation as "their documentation", keeping OGP as generic as possible. Any web application is free to mark up its webpages with Open Graph protocol markup, and any web application is free to consume this data as Facebook does today. In essence, this is no different from tackling the Semantic Web chicken-and-egg issue: making the data available in a machine-readable format (RDFa in this particular case) so that other peers can consume it. Kudos to Facebook for the right intention. Note that the Like button is not part of the Open Graph protocol; it is a Facebook-specific implementation which is detailed in the Facebook developer documentation.

I've created an opengraphprotocol module for Drupal 7 which takes advantage of new core functionality such as the use of namespaces in RDFa. OGP requires adding the og and fb namespaces to the HTML output. This would have required users to hack their theme in Drupal 6, but it takes only a couple of lines in a Drupal 7 module thanks to hook_rdf_namespaces:

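<?php
/**
 * Implements hook_rdf_namespaces().
 */
function opengraphprotocol_rdf_namespaces() {
  // Prefix => namespace URI pairs added to the HTML output.
  return array(
    'og' => 'http://opengraphprotocol.org/schema/',
    'fb' => 'http://www.facebook.com/2008/fbml',
  );
}
?>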

The rest of the module adds the Open Graph protocol RDFa markup to the head HTML element of the page: og:title, og:type, og:url and og:image. Most importantly, taking full advantage of Drupal's content types, the module offers a basic mapping interface to define which type of social object each content type maps to; the choice is then reflected in the page markup via the og:type property. With fields now in core, the module also outputs any field whose name matches one of the Open Graph protocol properties, such as description, image, latitude, longitude, locality, region, email, phone_number or fax_number. For instance, if you create a field 'description' (machine name field_description), its content will be marked up with OGP. Similarly, you can create a field of type integer 'phone_number' and it will be exported as well. Finally, the module adds the Like button for convenience and automatic integration with Facebook. You can see the module in action on this site and note the Like button below this article.
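To make the head markup concrete, here is a minimal sketch in plain PHP of how a set of Open Graph properties could be rendered as meta elements. It is not the module's actual code: the function name example_ogp_meta_tags() and the sample values are hypothetical, and no Drupal APIs are used so that the snippet runs on its own.

```php
<?php
/**
 * Hypothetical helper (not part of the opengraphprotocol module): renders
 * Open Graph protocol properties as <meta> elements for the page head.
 *
 * @param array $properties
 *   Values keyed by OGP property name without the "og:" prefix.
 *
 * @return string
 *   One <meta> element per non-empty property, separated by newlines.
 */
function example_ogp_meta_tags(array $properties) {
  $tags = array();
  foreach ($properties as $name => $value) {
    // Skip empty values so no empty content attribute is emitted.
    if ($value === NULL || $value === '') {
      continue;
    }
    // Escape both the property name and the value for use in attributes.
    $tags[] = sprintf(
      '<meta property="og:%s" content="%s" />',
      htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
      htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
    );
  }
  return implode("\n", $tags);
}

// Sample values invented for illustration only.
echo example_ogp_meta_tags(array(
  'title' => 'The Open Graph protocol and Drupal',
  'type'  => 'article',
  'url'   => 'http://example.com/node/1',
  'image' => 'http://example.com/logo.png',
)), "\n";
```

In a real Drupal 7 module the values would come from the node and its fields rather than a hand-written array, and the elements would be added to the head through Drupal's own mechanisms rather than echoed.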

The Open Graph protocol is not perfect, but neither Google nor Yahoo! got it right the first time, and I believe OGP will align with best practices. I wish the Open Graph protocol were less specific and encouraged developers to write richer RDFa markup like what we have in Drupal 7.

  • The Open Graph protocol doesn't promote the "Don't Repeat Yourself" (DRY) pattern which RDFa enables: OGP asks developers to repeat information which is likely to exist in the page already. For example, when a field is marked up with RDFa in Drupal, the related semantic markup is added directly to the HTML markup surrounding the field data. I take it that OGP is targeting applications which might not have a flexible rendering engine like the one we have in Drupal, but how about those which do?
  • OGP redefines vocabulary terms which have been around for many years:
    og:image        -> foaf:depiction
    og:latitude     -> geo:lat
    og:postal-code  -> vcard:postal-code
    og:email        -> foaf:mbox
    og:phone_number -> foaf:phone

    The problem is that existing RDF data which might already be using legacy vocabularies needs to add OGP's specific terms in order to be included in the Open Graph. This is a recurring problem which arises every time a new big player adopts RDF; it happened with Yahoo! and Google too. RDF datasets end up with duplicate terms for the same semantics and have to add, say, og:postal-code and google:postal-code even though they have already annotated their data with vcard:postal-code.
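The duplication described above can be illustrated with a short sketch that takes data already annotated with the established vocabularies and adds the corresponding OGP terms. The function names are hypothetical and the mapping only covers the five correspondences listed above. Copying values verbatim is a simplification: an existing foaf:mbox value, for instance, may be a mailto: URI rather than a plain address.

```php
<?php
/**
 * Hypothetical lookup table: established vocabulary term => OGP term,
 * limited to the correspondences listed in this post.
 */
function example_ogp_equivalents() {
  return array(
    'foaf:depiction'    => 'og:image',
    'geo:lat'           => 'og:latitude',
    'vcard:postal-code' => 'og:postal-code',
    'foaf:mbox'         => 'og:email',
    'foaf:phone'        => 'og:phone_number',
  );
}

/**
 * Adds the OGP term next to every established term found in $annotations.
 *
 * Existing OGP values are left untouched; values are copied as-is, which
 * is an assumption made for the sake of the illustration.
 */
function example_add_ogp_terms(array $annotations) {
  foreach (example_ogp_equivalents() as $existing => $ogp) {
    if (isset($annotations[$existing]) && !isset($annotations[$ogp])) {
      $annotations[$ogp] = $annotations[$existing];
    }
  }
  return $annotations;
}

// Sample data invented for illustration only.
print_r(example_add_ogp_terms(array('vcard:postal-code' => '94301')));
```

The output array carries the same postal code twice, once per vocabulary, which is exactly the redundancy a consumer of the existing terms would avoid.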

It also has some limitations which would not exist if more standard RDFa markup was used. More specifically:

  • The Open Graph protocol cannot distinguish a webpage from the resources it might describe. In OGP's eyes, the social objects are the pages (HTML documents) and not the real concepts or physical objects people are likely to show an interest in. Let's look at some examples:
    • Take a user profile page (typically of type sioc:UserAccount) and the real person it describes (foaf:Person): what do you mean when you hit the "like" button? Is it that you like that person, or only that particular profile page of that person (say, because it has a funny picture)? Drupal distinguishes between the two entities in its RDFa markup, but OGP cannot capture that.
    • What if you want to like a particular comment on a page, and not the whole page?
    • Same goes for a page about a music album and all the songs it contains.

    The Web of Data that Tim Berners-Lee and the Semantic Web community have been advocating for years is not what the Open Graph protocol enables; we're still at the old document-linking stage here.

  • The Open Graph protocol introduces og:type, an alternative to the widely used rdf:type. The rationale behind it is to keep the markup consistent with their <@property> <@content> syntax. However, because the @content attribute is used, the type of an object has to be a string. The first consequence is a limitation in OGP: it is not possible to specify several types for the same object. For example, you cannot say that someone is both an actor and a director, something which could easily be specified using RDFa's typeof attribute if only we had proper URIs instead of strings. Compare the following snippets. Here is what OGP promotes:
    <meta property="og:type" content="actor" />

    and this is what a more RDF friendly markup would look like:
    <meta about="" typeof="og:Actor og:Director" />

    By using the @typeof attribute (a shortcut in RDFa to specify the type of the object you're talking about), you get rid of the single-type limitation, and you get to use real RDF classes which look like strings thanks to the CURIE syntax. Another benefit of using RDFa's typeof is that you are not limited to the types defined by OGP: you can use any type from any namespace such as foaf: or sioc:, and that's exactly what we do in Drupal 7.

I understand that many of the points above are justified by decisions the Facebook team took to keep the markup very simple for developers. David Recordon explained them in his talk The Open Graph Protocol Design Decisions at the Linked Data Camp at WWW2010, which was followed by a breakout session during which many people worked on the Open Graph protocol RDF schema, proof that Facebook seems to be fairly open to following the standards, or at least acknowledging them. I trust many of the issues I highlighted above will be fixed in the future.

Comments

Very nice write-up and also a good critique of the OGP in general! I implemented the OGP for the SW Dog Food Site recently, and since it is Drupal 6-based, I did indeed have to hack the theme a bit to get it done. Looks like we should move over to Drupal 7 after all...

I'm wondering if you have looked at options for monitoring the activity of the Like buttons, i.e., get notified whenever someone likes an object on your site. It seems that one way to go is to use the Open Graph API to get information about the fan count of an object, e.g. like this:

https://graph.facebook.com/?ids=http://openspring.net/blog/2010/05/26/the-open-graph-protocol-and-drupal

However, this requires authentication, and I'm currently a bit stumped how to achieve this in a non-application context. Do you have any experience with that?

Knud

Hardcoding Open Graph properties on mere field names sounds a bit harsh. There should be a more flexible way to assign meta properties to a given field (or field instance?).

Excellent writeup, good to see more players supporting open vocabularies for the web.

Stephane, this question is maybe more about your Drupal work. I've now installed Drupal 7 and read up on the ways in which RDF is integrated into the core and in particular with CCK. But I'm not sure if a tool exists yet to do what I would like, and if it would be useful to write one. The functionality I am looking for is to allow a user to easily create new content (in Drupal, this should be a node) with a user-specified relationship to an existing URL, and also to create a user-specified relationship between any two existing pages, in a simple one-step process. It should be possible for the relation itself to be any available predicate AND taggable by terms in any language.

That is, I should be able to say that a lesson plan was ab:inspired_by a book, without necessarily having control over the type of either the lesson plan or the book page, as long as they each have URLs. Of course I can assert this in a triple by simply coding it in any RDF format, but I am trying to find out if this tool already exists, or if it might be worth writing for Drupal and WordPress.

It seems to me, that this functionality of adding arbitrary predicates between nodes does not yet exist in Drupal 7, even with the RDF support? I would like to be able to do it outside of the CCK system, as the predicates I have in mind are somewhat general and are not consistently associated with a specific type. (One can say a user 'recommends' an item, one can also assert that a book 'recommends' a movie if it is mentioned positively in the book. But most books will not have any 'recommends' properties attached to them, 'recommends' is not generally a property of a book. ) Basically I have a vocabulary in mind that could be applied to any media that expresses ideas, whether that media is a person, a web page, a raw chunk of text, a book, movie, or youtube clip. But I want to be able to make assertions without having control over the subject url.

Sorry if this question isn't very targeted at the Open Graph protocol. I was hoping that OGP would enable the functionality I'm looking for, but it seems it's heading in a different direction, providing more markup for pages themselves rather than making it easy for third-party users to assert relations between them. We do get 'like', but I am looking for much more than that.

Thanks for any clue

--Golda

Hi Stéphane,

One further thing to add to your points on @typeof is that in RDFa 1.1 you can use a URI mapping as a term.

You can declare a term just as you would declare any other URI mapping, like this:

  <html
   xmlns:Actor="http://opengraphprotocol.org/schema/Actor"
   xmlns:Director="http://opengraphprotocol.org/schema/Director"
  >

You can also use the new @prefix attribute, like this:

  <html prefix="
    Actor: http://opengraphprotocol.org/schema/Actor
    Director: http://opengraphprotocol.org/schema/Director
   "
  >

But perhaps the most interesting technique is to use the new @profile attribute, and point to an external document in the cloud, that contains a list of these definitions.

Whatever technique is used to declare the terms, once defined they can be used like this:

  <meta about="" typeof="Actor Director" />

Best regards,

Mark
