Talis Consultancy
World leading expertise in Linked Data and the Semantic Web

Category: Consuming Data

The Tyranny of Time

A guest post by Lawrence Serewicz, Principal Information Management Officer, Durham County Council

I came across the following reference to time within the retail sector and it made me consider how my world of local government, or any business for that matter, thinks about time.

An old saying in the retail industry is that: ‘If information is available monthly, then decisions taken will take 6 months to have an effect. If it is available weekly, then decisions take a month to influence outcomes; if daily, it takes a week; and if hourly, the decisions can have an impact the next day’ (p.13)

(Source : Valuing Information as an Asset http://www.sas.com/reg/gen/uk/valuing-information )

How often do we collect data? In many organisations, there are quarterly returns, but is that enough for today’s services? In some cases, councils collect real-time data, but are their reporting systems ready for it? For example, management or cabinet committee meetings may be held once a month, but is that enough to have a strategic view of what is happening within an organisation?

At one level, the timeframe for Council Members is different because their work is strategic: they are trying to shape the organisation’s future and where it will be over the long term, not determining whether the recent refuse collection achieved 99% or 98% effectiveness. Even if we discount the members’ need to have real time data (at least at a strategic level) and focus on the manager/officer’s role, we still see the tyranny of time.

How often do we see, use, or for that matter analyse, real-time data? Do our performance management systems display a disconnect between the timescale on which data is collected and the timescale on which it is reported? We may have refuse bin collection rates measured every day, but if our performance reporting within the organisation is quarterly, how well does that serve the organisation? At the same time, is that performance information available to the services, such as customer service desks?

In this example, if the real-time performance is being reported to the customer service desk, they can see that the bin collection rate on a snowy day (for example) is lagging in some areas, but is still robust in other areas. Thus, a call from an area with good collection (say 99%) is going to be a different issue from a call about a missed collection in an area with an 80% collection rate because of the snow conditions. Yet how many performance management systems or performance information systems are designed to capture and analyse real time data? Even weekly data can be considered real time data depending on the service, which brings us back to the point made at the start. If the information is only available quarterly, what is the impact rate? If you collect each quarter, is the final impact seen yearly or in two years? If that is the timescale, is it going to be effective?
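To make the bin collection example concrete, here is a minimal sketch in Python. The area names and daily figures are invented for illustration, and the two helper functions are hypothetical rather than taken from any real performance system. It compares the single averaged figure a periodic report would show with the latest reading a service desk would see in real time.

```python
# Hypothetical daily collection rates (%) for two areas; all figures are invented.
daily_rates = {
    "Area A": [99, 99, 98, 99, 99],
    "Area B": [99, 98, 99, 99, 80],  # final reading taken on a snowy day
}


def period_average(rates):
    """What a periodic report shows: one averaged figure for the whole period."""
    return round(sum(rates) / len(rates), 1)


def latest(rates):
    """What a real time view shows: the most recent reading."""
    return rates[-1]


for area, rates in daily_rates.items():
    print(area, "period average:", period_average(rates), "latest:", latest(rates))
```

In this made-up data the averaged figure for Area B still looks respectable, while the latest reading shows the problem the caller is actually ringing about.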

What does this have to do with open data? If data is being collected and made available to customers and the public, are they getting real time data or is there an organisation-influenced lag in the data? One of the main themes within the UK government’s open data collection consultation (http://data.gov.uk/opendataconsultation), as well as its overall transparency agenda, is to open service performance information to the public.

The service performance information will inform the public’s choice of services and also help them to hold the provider to account. Yet if there is a lag between when service information is collected and when it is published, can the public hold a local authority to account effectively? How much information is released, and when, can have a large influence on whether an organisation is accountable. If it only has to report once a year, how much accountability can be achieved? If a change in performance is required, how will it be demonstrated in such a long reporting cycle?

If, however, real time data is released, will that have a destabilizing effect on the political process? If the political process is relying on quarterly performance reporting and the public are getting the information in real time, how will elected members be able to respond? Moreover, if members, as residents, are consuming the information in real time as well, what is the role of a quarterly performance reporting system?  To be sure there will be different reports for different issues, but the underlying question is how to make open data respond to real time demand.  Do I need to know the car park was full last week if I am trying to get parked now?

The issue of time is also about how and where information is released. If an organisation releases its performance statistics in a paper report, and not as a spreadsheet, can external scrutiny be achieved?  In that sense, the format for publication will show the timescales. Such reporting has an immediate and direct effect on the ability of the public, and members, to hold the organisation to account.

At the same time, there is the question of whether real time reporting fits your strategy. If one company is working on day-to-day reporting and another is taking a ten year strategy to grow, they will have different understandings of time.  Moreover, their reporting mechanisms will be different.  Yet, can the ten year plan work without taking care of the day to day? In that sense, can anyone escape the tyranny of time?  The more your competitors harness real time data, the more you will need to adapt or adopt.

From an accountability perspective, the issue may be simply finding a way to reconcile monthly or quarterly performance reporting with the real time data.

What effect will this have on the way we operate in the public and private sectors?  Only time will tell.

Schema.org Déjà vu

The Web has been around for getting on for a couple of decades now, and massive industries have grown up around the magic of making it work for you and your organisation.  Some of it, it has to be said, can be considered snake-oil.  Much of it is the output of some of the best brains on the planet.  Where, on the hit parade of technological revolutions to influence mankind, the Web is placed is oft disputed, but it is definitely up there with fire, steam, electricity, computing, and of course the wheel.  Similar debates rage, and will continue to rage virtually, around the hit parade of web features that will in retrospect have been most influential.  Pick your favourites: HTTP, XML, REST, Flash, RSS, SVG, the URL, the href, CSS, RDF; the list is a long one.

I have observed a pattern as each of the successful new enhancements to the web has been introduced, and then generally adopted.  Firstly there is a disconnect between the proponents of the new approach/technology/feature and the rest of us.  The former split their passions between focusing on the detailed application, rules, and syntax of its use, and broadcasting its worth to the world, not quite understanding why the web masses do not ‘get it’ and adopt it immediately.  This phase is then followed by one of post-hype disillusionment from the creators, especially when others start suggesting simplifications to their baby.  Also at this time back-room adoption by those who find it interesting, but are not evangelistic about it, starts to occur.  The real kick for the web comes from those back-room folks who just use this next thing to deliver stuff and solve problems in a better way.  It is the results of their work that the wider world starts to emulate, so that they can keep up with the pack and remain competitive.  Soon this new feature is adopted by the majority, because all the big boys are using it, and it becomes just part of the tool kit.

A great example of this was RSS.  Not a technological leap but a pragmatic mix of current techniques and technologies mixed in with some lateral thinking and a group of people agreeing to do it in ‘this way’ then sharing it with the world.  As you will see from the Wikipedia page on RSS, the syntax wars raged in the early days.  I remember it well: 0.9, 0.91, 1.0, 1.1, 2.0, 2.01, etc.  I also remember trying, not always with success, to convince people around me to use it, because it was so simple.  Looking back it is difficult to say exactly when it became mainstream, but this line from Wikipedia gives me a clue: “In December 2005, the Microsoft Internet Explorer team and Microsoft Outlook team announced on their blogs that they were adopting the feed icon first used in the Mozilla Firefox browser. In February 2006, Opera Software followed suit.”  From then on, the majority of consumers of RSS were not aware of what they were using and it became just one of the web technologies you use to get stuff done.

I am now seeing the pattern starting to repeat itself, with structured and linked data.  Many, including me, have been evangelising the benefits of web friendly, structured, linked data for some time now, preaching to a crowd that has been slow in growing, but growing it is.  Serious benefit is now being gained by organisations adopting these techniques and technologies, as our selection of case studies demonstrates.  They are getting on with it, often with our help, using it to deliver stuff.  We haven’t hit the mainstream yet.  For instance, the SEO folks still need to get their heads around the difference between content and data.

Something is stirring around the edge of the Semantic Web/Linked Data community that has the potential to give structured web enabled data the kick towards mainstream that RSS got when Microsoft adopted the RSS logo and all that came with it.  That something is schema.org, an initiative backed by the heavyweights of the search engine world, Google, Yahoo, and Bing.  For the SEO and web developer folks, schema.org offers a simple attractive proposition: embed some structured data in your HTML and, via things like Google’s Rich Snippets, we will give you a value added display in our search results.  Result: happy web developers with their sites getting an improved listing display.  Result: lots of structured data starting to be published by people whom you would have had an impossible task convincing that it would be a good idea to publish structured data on the web.
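To give a flavour of what “embed some structured data in your HTML” can look like, here is a small sketch in Python that renders a schema.org Event using the RDFa Lite attributes vocab, typeof and property. The function name and the event details are hypothetical, chosen only for illustration; it is one possible way of generating such markup, not a recommended or official pattern.

```python
from html import escape


def event_markup(name: str, start_date: str) -> str:
    """Render a schema.org Event as an HTML fragment with RDFa Lite attributes.

    Hypothetical helper for illustration only.
    """
    return (
        '<div vocab="http://schema.org/" typeof="Event">\n'
        f'  <span property="name">{escape(name)}</span>\n'
        f'  <time property="startDate" datetime="{escape(start_date)}">'
        f'{escape(start_date)}</time>\n'
        '</div>'
    )


# Invented example values
print(event_markup("Linked Data Meetup", "2011-11-01"))
```

The visible page content barely changes; the extra attributes are what turn it from content into data that a search engine can read.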

I was at Semtech in San Francisco in June, just after schema.org was launched, and it caused a bit of a stir.  The complaints ran along these lines: they’ve over-simplified the standards that we have been working on for years, dumbing down RDF and diluting its capability with too small a set of attributes, etc., etc.  When you get under the skin of schema.org, however, you see that, with its support for RDFa, including RDFa 1.1 Lite, they are not that far from the RDF/Linked Data community.

Schema.org should be welcomed as an enabler for getting loads more structured and linked data on the web.  Is their approach perfect now? No.  Will it influence the development of Linked Data? Yes.  Will the introduction be messy? Yes.  Is it about more than just rich snippets?  Oh yes.  Do the webmasters care at the moment? No.

If you want a friendly insight into what schema.org is about, I suggest a listen to this month’s Semantic Link podcast, with their guest from Google/schema.org, Ramanathan V. Guha.

Now where have I seen that name before?  Oh yes, back on the Wikipedia RSS page: “The basic idea of restructuring information about websites goes back to as early as 1995, when Ramanathan V. Guha and others in Apple Computer’s Advanced Technology Group developed the Meta Content Framework.”  So it probably isn’t just me who is getting a feeling of Déjà vu.

Ontologies won’t make you rich: or will they?

This post sets out some discussion points that arose in response to a conversation with +Aaron Bradley on Google+. The conversation was prompted by Kendall Clark’s post which started by suggesting “an OWL ontology is like a public API for your data”. Aaron suggested that his OWL ontology may need to remain private in order to retain competitive advantage.

There is no value in writing ontologies that are not shared. If you describe your own data in your own way without sharing that ontology, how will you ever find other data that you could mix into yours at a later date?

The counter argument is that the data within your organisation is disparate and needs to be organised, but you don’t want to give away your secrets as to how you have organised your data. I am not about to claim that Linked Open Data is the only way to do Linked Data. Linked Data within an organisation will allow data integration across departments to happen more easily.

But the ontology is not core to this. It is the way you can combine data with shared URIs that use open ontologies that is the killer feature. So if you want to protect anything, then you may want to protect those URIs. Now that we are talking about URIs we have already moved the discussion into the data layer rather than the ontology layer, and you’re still able to protect your data even if people know what ontologies you’re using.

An ontology is not going to give you a competitive advantage. Your advantage will be what you do with the data, not how the data is described. No-one to my knowledge has made a business out of trading database schemas; but when they trade well curated data, there is money to be made.

If more than one organisation uses the same ontologies to describe two different datasets, then that ontology has started to create a data market where those two organisations can trade their data without prohibitive data integration overheads. Sharing your ontology helps you to grow your market.
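A rough illustration of why shared URIs and shared ontologies keep integration costs down: in the Python sketch below, two invented datasets describe resources as plain (subject, property, value) triples, both using Dublin Core Terms property URIs. The example.org URIs, the values and the describe helper are all assumptions made for the sake of the example.

```python
# Dublin Core Terms namespace, used by both (hypothetical) organisations
DCT = "http://purl.org/dc/terms/"

# Invented data from two organisations; they share a URI for book 1
library = {
    ("http://example.org/book/1", DCT + "title", "An Example Book"),
}
bookshop = {
    ("http://example.org/book/1", DCT + "creator", "A. N. Author"),
    ("http://example.org/book/2", DCT + "title", "Another Example"),
}


def describe(graph, subject):
    """Collect every property/value pair known about one subject URI."""
    return {prop: value for subj, prop, value in graph if subj == subject}


# Because the URIs and property names agree, merging is just a set union
merged = library | bookshop
print(describe(merged, "http://example.org/book/1"))
```

No mapping tables or schema translation are needed here; the integration work was done up front by agreeing on the vocabulary. Note too that knowing which ontology is in use reveals nothing about the data itself.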

If you are interested in having your data easily available via a public API, you will find that publishing your data as Linked Data transforms your website into your API, because it can be published with both a human-friendly HTML face and a machine-friendly RDF face. There are standard techniques that can then be applied to monetize your data streams, and this may even include a paywall.
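One common way to give the same URI both faces is HTTP content negotiation, where the server looks at the request’s Accept header to decide which representation to return. The Python sketch below is a deliberately simplified assumption of how that choice might be made: the function name is hypothetical, it takes the first recognised media type in the order listed, and it ignores the q-value weighting that a real server should honour.

```python
RDF_TYPES = ("text/turtle", "application/rdf+xml")


def choose_representation(accept_header: str) -> str:
    """Pick a media type from a simplified reading of the Accept header.

    Simplification: media types are tried in the order listed and any
    q-value parameters are ignored.
    """
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in accepted:
        if media_type in RDF_TYPES:
            return media_type  # machine-friendly RDF face
        if media_type in ("text/html", "*/*"):
            return "text/html"  # human-friendly HTML face
    return "text/html"  # default to the human-readable page


print(choose_representation("text/html,application/xhtml+xml"))
print(choose_representation("text/turtle"))
```

A browser asking for HTML gets the web page, while a program asking for Turtle or RDF/XML gets the data, both from the same address.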

Of course you might use OWL, or some part of OWL, to describe how your data is structured, but if you need APIs built on top, then a Linked Data approach is proving to be a simple way to achieve both those aims in one go. Surely that is more cost effective?

In summary: The data layer is where your competitive advantage sits. The ontology layer is the bit of the Linked Data ecosystem that is going to add value to your data through ontology re-use, making your data easier to integrate, both internally and externally, and growing your market. Your API (either internal or external) can be built easily using a Linked Data approach.

If you want to know more about how re-usable ontologies can grow your market, then talk to us.