Thursday, January 29, 2009

Perfection Is An Unrealistic Goal

Linda Rising gave a presentation on a topic that is still fairly misunderstood. She tries to address the question "What is the best way for us to work as individuals?" She talks about not deceiving yourself and understanding that everything is a journey without a destination. It is better to build something now that is known to be imperfect than to wait for clarification, as more knowledge can be gained from a system that doesn't work than from any theoretical discussion. Software development is like a series of experiments, a series of learning cycles. She then uses our regular sleep cycles to identify our optimal work cycles and emphasises the importance of taking breaks throughout the day to improve productivity.

Monday, January 26, 2009

Modelling Objects In RDF

Also at The Semantic Technology Conference this year, I will again be speaking on persisting objects in RDF. This is an introductory session on building object-oriented applications with the Sesame RDF repository. This year we will be going into more detail on the advantages of using RDF as a persistence layer, including forward and backward data compatibility, OWL's representational richness, longer-running optimistic transactions, and the use of graphs for auditing purposes.


Thursday, January 22, 2009

Unifying RDF Stores

In June I will be speaking at The Semantic Technology Conference with Paul Gearon about some of the integration work we did with Sesame and Mulgara. We will also be comparing a handful of existing RDF stores and demonstrating how they can be used interchangeably or as a unified federation.

The RDF storage market has become much more diverse recently, with many providers tailoring their stores to specific environments and data patterns. This talk will cover some considerations to help you identify which RDF store implementation is best for your data and environment. We will discuss common features found with many providers, unique features found in only a few, and demonstrate some of the new features in Mulgara. We will also be demonstrating a unified RDF API that allows RDF stores to be swapped into applications post-development, and showing how a provider-independent API enables diverse RDF storage nodes to be federated together, so that each node can be tailored to the unique shape of the data stored within it.

Hope to see you there!


Thursday, January 15, 2009

Validating RDF

RDFS/OWL is criticized for its weak ability to validate documents, in contrast to XML, which has many mature validation tools.

A common source of confusion in RDF is the rdfs:range and rdfs:domain properties. They are not constraints: a property value can always be assumed to have the type given by rdfs:range. This is very different from XML, which has rules to validate tags but cannot conclude anything from them. Many of the predicates in RDF are used for similar inferencing, but they lack any way to validate or check whether a statement really is true. Such validation is a critical feature for data interchange, which RDF is otherwise well suited for.
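To make the distinction concrete, here is a minimal sketch of the rdfs:domain and rdfs:range entailment rules over plain Python tuples. The `ex:` names and the `infer_types` helper are made up for illustration, and the sketch assumes at most one domain and one range per property. Notice that a statement that looks like a data-entry mistake is not rejected; it simply produces a new type statement.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def infer_types(triples):
    """Apply the rdfs:domain and rdfs:range rules once; return only the new triples."""
    # Simplification (assumption): one domain and one range per property.
    domains = {s: o for s, p, o in triples if p == RDFS_DOMAIN}
    ranges = {s: o for s, p, o in triples if p == RDFS_RANGE}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            # The subject of the property is inferred to be in the domain class.
            inferred.add((s, RDF_TYPE, domains[p]))
        if p in ranges:
            # The value of the property is inferred to be in the range class.
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred - set(triples)


# Hypothetical data: the author of a book is "expected" to be a person.
graph = {
    ("ex:author", RDFS_DOMAIN, "ex:Book"),
    ("ex:author", RDFS_RANGE, "ex:Person"),
    ("ex:moby_dick", "ex:author", "ex:nantucket"),  # probably a mistake
}

new_facts = infer_types(graph)
for fact in sorted(new_facts):
    print(fact)
```

Nothing in the rules flags `ex:nantucket` as an invalid author. The graph just grows a statement saying it is an `ex:Person`, which is why a separate validation step is needed.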

To address this limitation, an RDF graph can be sorted and serialized into RDF/XML. With a little organization of statements, such as grouping by subject, and controlled serialization, common XML validation tools can be applied to a more formal RDF/XML document. Our validation was done with relatively small graphs and we restricted the use of BNodes to specific statements to ensure similarly structured data would produce similar XML documents.
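The grouping and ordering step might look something like the sketch below. It is not the serializer we used, just an illustration built on Python's standard `xml.etree.ElementTree`; the function names, the four-element tuple layout, and the example.org IRIs are assumptions, and blank nodes are left out entirely.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def split_iri(iri):
    """Split an IRI into (namespace, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/")) + 1
    return iri[:cut], iri[cut:]


def to_canonical_rdfxml(triples):
    """Serialize (subject, predicate, object, is_literal) tuples in a fixed order."""
    ET.register_namespace("rdf", RDF_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")

    # Group statements by subject so each resource gets exactly one element.
    by_subject = {}
    for s, p, o, is_literal in triples:
        by_subject.setdefault(s, []).append((p, o, is_literal))

    # Sort subjects, then statements, so the same graph always yields the same XML.
    for subject in sorted(by_subject):
        node = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
        )
        for p, o, is_literal in sorted(by_subject[subject]):
            ns, local = split_iri(p)
            prop = ET.SubElement(node, f"{{{ns}}}{local}")
            if is_literal:
                prop.text = o
            else:
                prop.set(f"{{{RDF_NS}}}resource", o)
    return ET.tostring(root, encoding="unicode")


# Hypothetical statements, deliberately listed out of order.
triples = [
    ("http://example.org/b", "http://example.org/vocab#name", "Bob", True),
    ("http://example.org/a", "http://example.org/vocab#knows", "http://example.org/b", False),
    ("http://example.org/a", "http://example.org/vocab#name", "Alice", True),
]

xml_doc = to_canonical_rdfxml(triples)
print(xml_doc)
```

Because the output no longer depends on the order in which statements arrive, a schema written against one document applies to every similarly structured graph.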

Although TriX could also have been used (it is a more formal XML serialization of RDF), it was considered that the format produced would not be as easy to work with for validation tools.

With a controlled RDF/XML structure we were able to apply RNG to provide structure validation before accepting foreign data, and to automate the export into more controlled formats using XSLT. (We used a rule engine for state validation.) Although RDF is a great way to interchange data against a changing model, XML is still better over the last mile to restrict the vocabulary of the data accepted.


Monday, January 12, 2009

Did Google Just Expose Semantic Data in Search Results?

Google has been hesitant in the past to employ semantic technology, citing trust issues and the lack of quality metadata on the Web today. However, it would appear Google is warming to the idea of semantics in the area of natural language processing. I have written a couple of articles on the subject previously, but it appears that Google is exposing its own text analysis, according to this blog entry.


Thursday, January 8, 2009

Building Super-Scalable Web Systems with REST

I came across this blog posting that I thought was an interesting example of how the services should be oriented around the data and not the other way around.

The idea here is that the data (in this case weather info) needs to be partitioned in a meaningful way (by location). REST services can then be created around this and utilize the caching available in HTTP.

By creating and organizing services around the domain's data model, more efficient services can be created. This reinforces why SOA often ends up failing: there is not enough emphasis on the data. It also highlights REST's support for caching and cache validation, which is built into HTTP. Other service/message specifications (like SOAP) would have more difficulty identifying and implementing a caching mechanism.
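As a rough illustration of cache validation, here is a small sketch of a handler for a weather resource partitioned by location. It is not taken from the linked post: the `/weather/{location}` path, the `get_forecast` function, the sample data, and the ten-minute `max-age` are all assumptions. Only the `ETag`, `If-None-Match`, and `Cache-Control` headers and the 304 status come from HTTP itself.

```python
import hashlib
import json

# Hypothetical in-memory data, partitioned by location.
FORECASTS = {
    "toronto": {"high_c": -4, "low_c": -11},
    "vancouver": {"high_c": 6, "low_c": 2},
}


def get_forecast(location, if_none_match=None):
    """Return (status, headers, body) for GET /weather/{location}."""
    data = FORECASTS.get(location)
    if data is None:
        return 404, {}, b""

    body = json.dumps(data, sort_keys=True).encode("utf-8")
    # The ETag is derived from the representation, so it changes only when the data does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "public, max-age=600"}

    if if_none_match == etag:
        # The client's cached copy is still valid; send no body.
        return 304, headers, b""
    return 200, headers, body


# First request: full response with a validator.
status, headers, body = get_forecast("toronto")

# Later request: the client (or an intermediary cache) revalidates its copy.
revalidated, _, empty = get_forecast("toronto", if_none_match=headers["ETag"])
print(status, revalidated)
```

Because each location is its own resource, a change to one forecast invalidates only that resource's cached copies, and any HTTP cache between client and server can take part without knowing anything about weather.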
