Dean Allemang

(870) Semantic Mash-ups using RDF, RSS and Microformats

Peer-Refereed Talk

Tuesday, 2007-06-26, 13:40 - 14:20, Arena 6

  • Dean Allemang - TopQuadrant Inc. (speaker)
  • Holger Knublauch
  • Willie Milnor

Abstract

The name "mash-up" started out as a reference in music, where two or 
more music sources were brought together into a single work. All too often, a 
web mash-up refers to a site that takes information from a single web site and 
displays it in a novel way, like having Google Maps display all the coffeeshops 
in a certain zip code. While this is a useful thing to do, it seems odd to call 
it a mash-up, since there is only one source for the information. Combining 
information from multiple sources and displaying it, even through an open API 
like Google Maps, requires a program that translates from each source into 
the API. The mashing-up is under program control, that is, under the control of 
the programmer.

In this presentation we outline the idea of a semantic mash-up, where the 
mash-up program is a model-driven architecture. This puts the structure of the 
mash-up under model control, rather than program control. It is still necessary 
to translate each information source into a semantic structure (i.e., RDF), but 
once that has been done, the structure of the mash-up is specified by a model, 
rather than by program code. 

Seen this way, Semantic Mash-ups are an example of model-driven architecture 
(MDA), where a program is specified by a model rather than by program code. The 
advantage of MDA versus conventional system construction by programming is that 
modeling is ostensibly more accessible to a wider class of users than 
programming. The holy grail of MDA is to empower "business users" to 
construct systems the way they want to, without a need for intervention by a 
programmer. 

While MDA is an attractive idea in principle, often it turns out that the 
process of modeling, which is supposed to be a non-technical activity, is just 
as technical as writing a program in a general-purpose language. If Semantic 
Mash-ups were just MDA in disguise, there would be no reason to pay them much 
heed. 

We argue, however, that because Semantic Mash-ups do not attempt to provide 
anything close to a general-purpose system construction capability, it is a 
lot easier for the process of modeling to be accessible to a general audience. 
Describing how to combine information in a mash-up is a fairly simple modeling 
task, one that actually can be done by users with fewer technical skills than 
are required by a more ambitious general-purpose programming language. 

The W3C standard language for sharing information on the semantic web, RDF, is 
the ultimate mash-up language. It provides an elegant framework to describe 
information based on the global naming convention that is already in use 
throughout the World Wide Web, the URI. Merging information from multiple 
sources is a simple merging process in which all information about a particular 
"resource" (URI) is brought together from multiple sources into one 
place. 
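
The merging step described above can be illustrated with a minimal sketch. It 
assumes that a graph is simply a set of (subject, predicate, object) triples; 
the URIs, property names and helper functions are hypothetical and are not part 
of any tool mentioned in this abstract.

```python
from collections import defaultdict

# Hypothetical URIs and property names, invented for illustration only.
CAFE = "http://example.org/place/cafe-42"

source_a = {
    (CAFE, "http://example.org/vocab/name", "Cafe Example"),
    (CAFE, "http://example.org/vocab/zip", "8001"),
}
source_b = {
    (CAFE, "http://example.org/vocab/rating", "4"),
    (CAFE, "http://example.org/vocab/name", "Cafe Example"),  # same fact again
}


def merge_graphs(*graphs):
    """Merge RDF-style graphs: the result is the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, resource):
    """Collect every property/value pair known about one resource (URI)."""
    description = defaultdict(set)
    for subject, predicate, obj in graph:
        if subject == resource:
            description[predicate].add(obj)
    return dict(description)


merged = merge_graphs(source_a, source_b)
cafe_description = describe(merged, CAFE)
```

Because both sources use the same URI for the resource, no matching or 
translation logic is needed: the fact stated by both sources appears once in 
the merged graph, and the description of the resource draws on both.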

The idea of using RDF as the basis for a mash-up was pioneered by the MIT Simile 
project in 2005. Simile tools could be used to convert information from a number 
of sources into RDF, merge them into a single source, then display them in a 
number of ways, including maps (using the Google Map API), faceted search, 
timelines and graph displays. 

The Simile tools do not take advantage of the other layers of the W3C semantic 
web stack, in particular, RDFS and OWL. As we shall see, these layers are 
indispensable for allowing a semantic model to describe how to combine 
information from multiple sources, i.e., for making a semantic mash-up model. 

In this presentation, we will describe a system we have built for enabling 
semantic mash-ups. The system, called TopBraid, is based on the W3C standard 
languages for semantic modeling (RDF, RDFS and OWL) and built using the Eclipse 
platform. The system supports the following user tasks for constructing a 
semantic mash-up:

1) Semantic Mash-up designers use the desktop interface TopBraid Composer to 
describe what information should be combined, and in what way, for a 
semantic mash-up

2) Plug-in programmers build simple interfaces from RDF to a display plug-in 
(e.g., Maps, Calendars, spreadsheets, timelines, etc.)

3) Content providers mark up information in a way that makes it more amenable to 
mashing up (using microformats or RDFa)
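
As a rough illustration of task 2, the sketch below shows what a simple 
interface from RDF to a map display might look like. It is not the TopBraid 
plug-in API (the real plug-ins are Eclipse plug-ins); the property names 
`ex:name`, `ex:lat` and `ex:long`, the function `to_map_markers` and the 
marker layout are all assumptions made for this example.

```python
# Hypothetical property names; a real mash-up would use full URIs.
NAME = "ex:name"
LAT = "ex:lat"
LONG = "ex:long"


def to_map_markers(graph):
    """Turn (subject, predicate, object) triples into map marker records.

    Resources without both coordinates are skipped, since a map display
    has nowhere to put them.
    """
    by_subject = {}
    for subject, predicate, obj in graph:
        by_subject.setdefault(subject, {})[predicate] = obj

    markers = []
    for subject in sorted(by_subject):
        props = by_subject[subject]
        if LAT in props and LONG in props:
            markers.append({
                "label": props.get(NAME, subject),
                "lat": float(props[LAT]),
                "lng": float(props[LONG]),
            })
    return markers


graph = {
    ("ex:shop1", NAME, "Cafe Example"),
    ("ex:shop1", LAT, "47.3769"),
    ("ex:shop1", LONG, "8.5417"),
    ("ex:event1", NAME, "Reading"),  # no coordinates, so not shown on a map
}
markers = to_map_markers(graph)
```

The adapter knows nothing about where the triples came from, which is the 
point of the division of labour above: the plug-in programmer writes it once, 
and the mash-up designer decides which resources are fed into it.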

The system is deployed using the Eclipse framework server-side; all plug-ins and 
models that are available in the desktop environment are also available in the 
delivery system. Any display that is available in the Eclipse rich client is 
also available for presentation on a web browser (thin client). 

Data sources that are already in RDF-compliant forms (e.g., RDFa, RSS 1.0, 
GRDDL-enabled microformats or even RDF/XML itself) are imported directly into 
TopBraid Composer at mash-up design time. Other data sources can be converted 
into RDF using an automated mark-up strategy pioneered by the Simile project's 
Solvent tool, whereby web page fragments are selected and marked up with 
semantic metadata. 

Once these sources have been imported into TopBraid Composer, it is a simple 
matter to describe various combinations of information using the constructs of 
RDFS and OWL. RDFS provides a class structure in which sets of individuals are 
described as Classes; a mash-up of multiple sets (each set representing 
information from a different source) is specified as a common superclass of the 
classes to be mashed up.  The semantics of RDFS imply that the display of a 
class includes all members of its subclasses. This allows a modeler to define 
several layers of mash-ups, depending on the level of detail that is useful for 
a particular display. 
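
The superclass mechanism can be sketched in a few lines. The example below 
assumes a graph held as a set of triples and uses hypothetical class and 
instance names (`ex:CoffeeShop`, `ex:Venue` and so on); it computes class 
membership by following `rdfs:subClassOf` links transitively, which is the 
RDFS inference the paragraph above relies on. It is an illustration only, not 
how TopBraid implements inference.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical model: two source classes mashed up under ex:Venue, and a
# second layer that adds museums under ex:PointOfInterest.
triples = {
    ("ex:CoffeeShop", SUBCLASS_OF, "ex:Venue"),
    ("ex:Bookstore", SUBCLASS_OF, "ex:Venue"),
    ("ex:Venue", SUBCLASS_OF, "ex:PointOfInterest"),
    ("ex:Museum", SUBCLASS_OF, "ex:PointOfInterest"),
    ("ex:shop1", RDF_TYPE, "ex:CoffeeShop"),
    ("ex:store1", RDF_TYPE, "ex:Bookstore"),
    ("ex:museum1", RDF_TYPE, "ex:Museum"),
}


def subclasses_of(graph, cls):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if predicate == SUBCLASS_OF and obj == current and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def members_of(graph, cls):
    """Members of a class include the members of all its subclasses."""
    classes = subclasses_of(graph, cls)
    return {s for s, p, o in graph if p == RDF_TYPE and o in classes}


venue_members = members_of(triples, "ex:Venue")
poi_members = members_of(triples, "ex:PointOfInterest")
```

Displaying `ex:Venue` shows the coffee shop and the bookstore together, while 
displaying `ex:PointOfInterest` adds the museum. The two layers differ only in 
the model, with no change to the code.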

The system has been deployed with a small number of display plug-ins for maps, 
spreadsheets, calendars and timelines, and forms the focus of a semantic web 
training course that TopQuadrant runs at regular intervals. During the two-day 
hands-on session, course participants create their own semantic mash-up by 
finding and creating information sources, merging them together, and displaying 
them using the plug-ins provided with TopBraid. 

Future work includes incorporating more display modes in the form of Eclipse 
plug-ins, providing more capabilities to support markup of unstructured data, 
and translators of information from other structured data sources.