fishDelish Species Pages

Now that we have the FishBase data in our triple store, I took a look at how we could generate some nice-looking species pages. FishBase currently offers pages presenting information about species (for example, the Whale Shark).

Whale Shark on FishBase

I wanted to replicate (some of) this presentation in as simple and lightweight a way as possible. The solution I adopted involves a single SPARQL query that pulls out relevant information about a species, and an XSL stylesheet that transforms the results of that query into an HTML page. The whole thing is tied together with a simple bit of PHP code that executes the SPARQL query (using RAP, which is a bit long in the tooth but does this job) and requests the results as XML. It then uses PHP’s DOMDocument to add a link to the XSL stylesheet into the results. The HTML rendering is actually handled by the web browser, which applies the stylesheet. The resulting species pages (e.g. the Whale Shark again) are not, to use the words of David Flanders, our JISC Programme Manager, as information rich as the original FishBase pages, but they are sexier :-) .
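To illustrate the "add a link to the stylesheet" step, here is a minimal sketch in Python rather than the PHP used for these pages. It assumes the SPARQL results have already been fetched as XML; the sample result document, the `add_stylesheet_link` helper and the `species.xsl` filename are all made up for the example.

```python
from xml.dom import minidom

# A tiny stand-in for the XML a SPARQL endpoint would return (illustrative only).
SPARQL_RESULTS = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="name"/></head>
  <results>
    <result>
      <binding name="name"><literal>Rhincodon typus</literal></binding>
    </result>
  </results>
</sparql>"""


def add_stylesheet_link(results_xml, xsl_href):
    """Return the results XML with an xml-stylesheet instruction added.

    The instruction is placed before the root element, which is where a
    browser looks for it when deciding how to render the document.
    """
    doc = minidom.parseString(results_xml)
    link = doc.createProcessingInstruction(
        "xml-stylesheet", 'type="text/xsl" href="%s"' % xsl_href
    )
    doc.insertBefore(link, doc.documentElement)
    return doc.toxml()


# "species.xsl" is a hypothetical stylesheet name.
styled = add_stylesheet_link(SPARQL_RESULTS, "species.xsl")
```

Served with an XML content type, a document like `styled` is transformed to HTML by the browser itself, so the server-side code never has to run the XSL.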


To a certain extent, that’s simply down to styling (I’m a big fan of Georgia), but the exercise did help to explore the usage of SPARQL and XSL on the FishBase dataset. The SPARQL queries and stylesheets developed will also be useful in conjunction with the mysparql libraries developed in fishDelish.

The first problem I faced was trying to understand the structure of the data in the triplestore. The property names produced by D2R are not always entirely, ermm, readable. As Bijan has already discussed in his fishing expeditions, the state of linked data “browsers” is mixed. I ended up using Chris Gutteridge’s Graphite “Quick and Dirty” RDF Browser to help navigate around the data set.

A second question was how to approach the queries. The species pages have a simple structure. They have a single “topic” (i.e. the species), and then display characteristics of that species. So constructing a species page can be seen as a form filling process where the attributes are predetermined. It’s possible to write a SPARQL query to get information about a species with a single row in the results. The stylesheet (e.g. for species) can grab the values out of those results and “fill in the blanks” as required. An alternative would be to use some kind of generic s-p-o pattern in the query and pull out all the information about a particular URI (i.e. the species). In the species case though, we already know what information we’re interested in getting out so the “canned” approach is fine.
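The "fill in the blanks" idea can be sketched outside XSL too. The Python below is only an illustration under assumptions: the variable names (`sciName`, `commonName`, `family`) and the page template are invented, and the real pages use an XSL stylesheet rather than this code. It reads a single-row result in the standard SPARQL XML results format and drops the values into a fixed template.

```python
import xml.etree.ElementTree as ET
from string import Template

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Illustrative single-row result; the variable names are hypothetical.
ONE_ROW = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="sciName"/>
    <variable name="commonName"/>
    <variable name="family"/>
  </head>
  <results>
    <result>
      <binding name="sciName"><literal>Rhincodon typus</literal></binding>
      <binding name="commonName"><literal>Whale shark</literal></binding>
      <binding name="family"><literal>Rhincodontidae</literal></binding>
    </result>
  </results>
</sparql>"""

# The "form": the attributes to show are fixed in advance.
PAGE = Template("<h1>$commonName</h1><p><i>$sciName</i>, family $family</p>")


def row_to_dict(result):
    """Map each binding name in one <result> row to its text value."""
    row = {}
    for binding in result.findall("sr:binding", NS):
        value = binding[0]  # the <literal>, <uri> or <bnode> child
        row[binding.get("name")] = value.text or ""
    return row


def fill_species_page(results_xml):
    """Fill the template from the first (and only expected) result row."""
    root = ET.fromstring(results_xml)
    result = root.find("sr:results/sr:result", NS)
    # safe_substitute leaves a placeholder in place if a value is missing.
    return PAGE.safe_substitute(row_to_dict(result))


html = fill_species_page(ONE_ROW)
```

Because the template names every field it needs, a missing binding shows up as an unfilled placeholder rather than silently disappearing, which is handy when exploring unfamiliar data.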

I also produced some pages for Orders and Families (e.g. Rhincodontidae or Rajiformes). The SPARQL query here returns a number of rows, as the query asks for all the families in an order or species in a family. There is redundancy in the query result as the first few columns in each row are identical. A cleaner solution here might be to use more than one SPARQL query — one pulling out the family information, one requesting the family/species list. That would require more sophisticated processing though, rather than my lightweight SPARQL query + XSL approach. Again, this is something that mysparql would help with.
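To make the redundancy concrete, here is a small Python sketch of one way to post-process such a multi-row result: take the repeated columns once and collect the varying column into a list. The column names (`order`, `className`, `family`) and the `split_shared_and_members` helper are hypothetical, and this is not what the pages described here do; they leave the redundancy in place and let the stylesheet cope.

```python
def split_shared_and_members(rows, member_key):
    """Separate the columns repeated on every row from the one that varies.

    rows is a list of dicts, one per result row. Returns a dict holding the
    shared columns (read from the first row) and a list of the values found
    under member_key, in row order.
    """
    if not rows:
        return {}, []
    shared = {key: value for key, value in rows[0].items() if key != member_key}
    members = [row[member_key] for row in rows]
    return shared, members


# Rows as they might come back for an order page: the order-level columns
# are identical on every row and only the family changes.
rows = [
    {"order": "Orectolobiformes", "className": "Elasmobranchii", "family": "Rhincodontidae"},
    {"order": "Orectolobiformes", "className": "Elasmobranchii", "family": "Ginglymostomatidae"},
    {"order": "Orectolobiformes", "className": "Elasmobranchii", "family": "Orectolobidae"},
]

shared, families = split_shared_and_members(rows, "family")
```

Running two separate queries would give the same two pieces directly, at the cost of an extra round trip to the triple store and code to combine the results.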

Overall, this was an interesting experiment and exercise in understanding the FishBase RDF data. Harking back to Bijan’s earlier blog post, as I’m already familiar with SPARQL and XSL, it was probably easier for me to produce these pages using the converted data, but it’s not clear whether that would be true in general. It did help to illustrate the kinds of things we can now begin to do with the RDF data though, and puts us in a situation where we can look at further integration of the data with other data sets. For example, it would be nice to hook into resources like the BBC Wildlife Finder pages, which are also packed with semantic goodness.


2 Comments

  1. Posted March 7, 2011 at 11:23 am | Permalink

    Hi again.

    Great to see the data out. I’ve just been having a browse, not that I know anything about fish data, but looks good :)

    I wondered if you’d thought about the possibility of using W3C ‘cool URIs for the semantic web’ and possibly the UK Govt patterns advice at all? Maybe D2R out-of-the-box behavior makes this tricky?

    Cheers, Adrian
    JiscEXPO Synthesis Liaison

  2. Posted March 8, 2011 at 1:58 pm | Permalink

    We are building LOD into our species database and I was looking into how to create links from our species to Fishbase when I came across your FishDelish site.

    Perhaps we can create some cooperation? You can grab our current RDF dumps at http://eunis.eea.europa.eu/void.rdf

    Best regards,
    Søren Roug
    European Environment Agency
