The Open Data Institute, set to launch next month, aims to turn digital information into economic growth. Matt Ross meets its chief, Gavin Starks, who sees data as the raw material for an important new British primary industry.
“There are two cadences,” says Gavin Starks in his gentle Scottish lilt. “Things always take longer than you expect them to; but actually, with hindsight, a lot of things have happened much faster and at a greater scale than anybody had anticipated.” And yes, he knows that sounds paradoxical: technological advances lead people to anticipate rapid changes in the way we live, he argues, but in fact it takes some time for societies to adapt and infrastructures to develop – then when they’ve done so, things develop very fast indeed.
When Starks was pioneering web-based businesses at Virgin in the mid-’90s, he recalls, “we thought everything was going to change overnight. Of course it took a decade to get decent broadband penetration and web services, but once you’ve reached a tipping point the change is much more rapid than any business or government can move”.
When progress suddenly accelerates in this way, any organisation that hasn’t embraced the new world is in deep trouble – as Starks has witnessed in the media, publishing and music industries. After he put Virgin’s radio and retail operations online, he says, “there was about five years of farting around” while people worked out how to do business on the web. Then things took off: “A whole range of companies went bust because they didn’t adapt to an emerging world. In fact, they tried to stop it; some of them still are,” he says. “But the people who transformed, 80 per cent of their revenue is now coming from digital.”
The lesson here for all “incumbents” – in both the business and public sector worlds – is that “by the time the change is happening, it’s too late for larger organisations to adapt,” says Starks, “so they need to take a step into what may feel like a very unknown space to begin with. If they don’t, a lot of their core value could be completely undermined in 10 years”.
Data mining
Starks is a veteran in a very new industry: after stints as a professional astronomer, and an academic with a line in ambient electronic music – quiet, contemplative careers that seem a good match for his character – he became one of the people who helped to spin the worldwide web as we know it, and has spent the last 15 years founding businesses in the emerging digital industries. Now he’s been recruited as the chief executive of the new Open Data Institute (ODI): an independent body established with Cabinet Office cash to help turn the UK’s ever-growing stock of digital data into economic growth and social benefits. Working with the ODI’s chairman, the academic Nigel Shadbolt, and its president, Tim Berners-Lee, who invented the worldwide web, Starks aims to catalyse and shape the release of public data, to raise awareness of the possibilities, and to incubate new businesses at the heart of a brand new industry.
That last aim isn’t hyperbole, Starks believes: while the web has transformed many forms of communication and business, he sees “data as a new primary industry; an entirely new asset class” – and one that will be central to our economic and social development. To realise the potential of the data we hold, he argues, information must be released and different datasets combined, producing insights and generating products that “create value for everyone involved: the taxpayer, government departments, start-ups and private companies.”
As an example, Starks cites a start-up that’s combining health data with information from other sources, then applying analytics techniques to produce findings that “could result in substantial financial savings for the NHS”. Another new company monitors the changing energy supplies of IT ‘cloud’ providers, then directs its clients’ data-processing jobs to the places with the lowest carbon emissions: this not only minimises pollution, Starks points out, but also boosts greener businesses and encourages others to decarbonise. Then there’s Starks’ own business, AMEE – the ‘Avoiding Mass Extinctions Engine’ – which a few years ago “wrestled information out of” the Department for Environment, Food & Rural Affairs and created a tool for measuring the carbon footprint of every human activity.
“Within three months of launching, we were the back end of Defra’s campaign ‘Act on CO2’. We ended up engaging with two million homes across the UK, and we had hundreds of people signing up for access – ranging from Google to Morgan Stanley to Radiohead,” Starks recalls. “We opened up a channel so they could get continuous, reliable, structured information and build their own solutions on top of it. Radiohead measured the footprint of their global tour, Morgan Stanley built an intranet tool for use within their organisation, and Google built a gadget that mapped everybody’s carbon footprint. None of those organisations would have gone to Defra and downloaded the data spreadsheet.”
AMEE is about to launch a service that combines private and public sector information to produce an environmental score for every company in the UK – so “we’ve now come full circle,” says Starks. “We’re providing value back into [Government Procurement Service chief] David Shields, who has a new metric to measure a company’s environmental performance.”
Bring out your data
This is ground-breaking stuff, says Starks – and the UK is “leading the world in terms of our open data policies”; something that could give us a head start in the emerging data industry. It’s the ODI’s job to ensure that we capitalise on that opportunity, in part by drawing a “reliable, structured, continuous supply of useable data” out of public sector organisations. “We need to help people understand what we mean by addressable information; by provenance or traceability; by structure; by continuity of supply,” he adds.
To produce this pipeline of useable data, he continues, “the ODI will provide expertise and resources to work with the public sector to help them unlock their own assets”. This isn’t just a technological task, he adds: it’s also about demonstrating the agenda’s importance. His staff will support government bodies that have “ambition but no resource to unlock that information, help connect them with the ability to publish,” and demonstrate how the published data could be used for public benefit. Soon the ODI will be running training courses targeted at civil servants, and it’s pulling together “teams that will go into organisations, show them what’s possible, and ideally leave them with the knowledge to do it themselves.”
To date, it’s been awkward for businesses to use much of the information published under the open data label: too many public bodies have simply put raw info online in inaccessible formats, or without providing crucial supporting details such as how it was collected or calculated. Part of the solution, says Starks, is the use of ‘application programming interfaces’ (APIs): specifications that provide a common baseline for the categorisation and interpretation of data, enabling data users’ computers to locate, understand and compare pieces of information.
To the uninitiated, selecting an API looks like a fairly daunting task: it’s like the early days of electricity, says Starks, when “London was covered with different electricity suppliers, all working at different voltages, with different plugs etcetera.” But back in the early 1900s, electricity suppliers quickly realised they should converge on a common approach – and the digital world, similarly, is coalescing around a set of freely-available open standards. “Eventually, everybody worked out that we need a common way of doing this,” he explains. “And the comparison is that if the government says: ‘We need things to be machine-readable,’ that drives a certain set of behaviours. It doesn’t say: ‘The data must be in this format,’ it just asks for it to be machine-readable – and that translates into: ‘Please have an API’, basically.” The ODI, says Starks, can help public bodies build or choose an appropriate API for their data.
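To make the distinction between a downloadable spreadsheet and machine-readable data concrete, here is a minimal sketch in Python. It is an illustration only, not the ODI’s or AMEE’s method: the column names, figures, and the `to_machine_readable` and `lookup` helpers are all hypothetical. It shows spreadsheet-style rows being turned into structured records that carry their own provenance and can be addressed individually.

```python
import csv
import io
import json

# Hypothetical spreadsheet export; column names and figures are invented.
RAW_CSV = """activity,unit,kg_co2_per_unit
car_petrol,km,0.2
rail,km,0.05
"""


def to_machine_readable(raw_csv, source, method):
    """Turn spreadsheet rows into typed records with provenance attached."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        records.append({
            "activity": row["activity"],
            "unit": row["unit"],
            # Store the factor as a number, not as spreadsheet text.
            "kg_co2_per_unit": float(row["kg_co2_per_unit"]),
        })
    return {
        # Provenance: where the data came from and how it was calculated.
        "provenance": {"source": source, "method": method},
        "records": records,
    }


def lookup(dataset, activity):
    """Address a single record by its key, as an API client might."""
    for record in dataset["records"]:
        if record["activity"] == activity:
            return record
    return None


dataset = to_machine_readable(
    RAW_CSV,
    source="example department",
    method="illustrative figures only",
)
payload = json.dumps(dataset, indent=2)  # what a hypothetical API could return
```

A client can then call `lookup(dataset, "rail")` and receive a typed record, rather than opening a spreadsheet and guessing which column holds the figure.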
Asked whether APIs can be easily applied to existing datasets, Starks baulks. “I’d love to say ‘yes’, but having seen some of the legacy systems, you need in some cases a bucket and spade to get the data out,” he says. “Still, transforming that information and creating a cut that you can make public is not rocket science.” The rapidly developing commodity market in cloud services and a looming price war between Amazon and Google – “I think there’s going to be a huge battle,” he says – are fast driving down costs for this kind of work, and the Government Digital Service “has done a spectacular job of demonstrating how feasible it is. The toolkits they use are a complete mix of open source and cloud services and a whole range of tools that the web start-ups are using.”
There are traps for the unwary, Starks warns – including the risk that ostensibly ‘anonymised’ data published separately by different organisations can be combined to reveal people’s identities – but “we can bring in expertise that will do that kind of data analysis. We have the skills in the UK to do this.”
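The re-identification risk Starks describes can be sketched in a few lines of Python. Everything here is invented for illustration: the two releases, the field names and the `link` helper are assumptions, not real datasets or an ODI tool. The point is that neither release names a patient on its own, but joining them on shared fields such as postcode district, birth year and sex can single one out.

```python
# All records below are invented for illustration.
health_release = [
    {"postcode_district": "EH1", "birth_year": 1975, "sex": "M", "condition": "asthma"},
    {"postcode_district": "EH1", "birth_year": 1982, "sex": "F", "condition": "diabetes"},
    {"postcode_district": "G12", "birth_year": 1975, "sex": "M", "condition": "hypertension"},
]
membership_release = [
    {"name": "A. Example", "postcode_district": "EH1", "birth_year": 1982, "sex": "F"},
    {"name": "B. Sample", "postcode_district": "G12", "birth_year": 1990, "sex": "M"},
]

# Fields that appear in both releases and can act as a join key.
QUASI_IDENTIFIERS = ("postcode_district", "birth_year", "sex")


def key(record):
    """Build the join key from the shared fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymised, named):
    """Return (name, condition) pairs where the join key matches exactly one record."""
    by_key = {}
    for record in anonymised:
        by_key.setdefault(key(record), []).append(record)
    matches = []
    for person in named:
        candidates = by_key.get(key(person), [])
        if len(candidates) == 1:  # a unique match re-identifies the person
            matches.append((person["name"], candidates[0]["condition"]))
    return matches


reidentified = link(health_release, membership_release)
```

Running the same check before publication, and coarsening or dropping fields until no key matches a single record, is one simple way a publisher might test for this kind of leak.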
To free, or not to free?
Of course, some parts of the public sector make a living by charging for access to their data – and Starks notes that “there’s nothing wrong with charging for data, even though it’s open.” However, he argues that in some cases, free publication would create economic benefits far greater than the loss of public revenue. “If you have a dataset that only one company or individual wants, there isn’t a really good economic case for investing in opening up that data unless they’re going to pay for it,” he says. But where thousands of businesses could use a dataset to underpin their products and services, Starks would in many cases favour free publication.
Many of the public bodies that generate revenue by selling access to data “are not charged with innovating with it,” Starks points out – and the potential value of datasets only becomes clear after people have had the opportunity to spend time playing with the info and exploring possible uses. Many entrepreneurs starting up a new business “have one idea, but you never end up [pursuing] the first idea,” he notes. “You have to transform it and have a real culture of exploration.”
“I’m not criticising people for doing their jobs,” he adds, “but our view here is more about how we can unlock more value.” The UK’s Postcode Address File (PAF) is a good example of a dataset that should be thrown open, he argues: the Danish government quadrupled its return on investment by releasing its PAF free of charge. “It’s such a fundamental building block for pretty much every business that needs an address that it’s kind of crazy that we don’t have that as a properly open file.” (See news, p3.)
Private organisations too should be releasing their digital archives, Starks believes, enabling third parties to combine them with other datasets in ways that would help businesses understand and serve their customers. And there’s another reason for favouring open data, he argues: if organisations collect data without publishing it, people quickly become suspicious.
Tell me about myself
Asked whether schemes such as Labour’s National Identity Database heightened public fears over how the government would use data, Starks replies that “the absence of transparency is usually the root cause of anxiety. If you know that information is being collected, you know that somebody is watching what you do.” The potential for such data to be used in ways which are against individuals’ interests – for example, a health insurance company using supermarket loyalty card data to judge people’s risk of suffering a heart attack – creates “a huge amount of latent fear, which I think is going to increase. I don’t have any doubt that the anxiety around this surveillance will become stronger and stronger.” After all, he says, the amount of data – and thus the potential for such surveillance – is growing all the time.
The only solution, Starks argues, is for the owners of data to “become very transparent about what exists, what’s being measured, and how you can access it yourself. Then you address who else can access it, and what control others have.” Tesco has just released its own Clubcard data to customers, he adds approvingly: “As a Clubcard user, I’d quite like access to my own data, thank you!”
As both public and private organisations release more data, Starks says, the opportunities for businesses will continue to grow – and the ODI is running ‘hackathon’ brainstorming sessions, start-up incubation spaces, and mentoring schemes in order to help entrepreneurs generate ideas and support their development into concrete products. A key element of its work will be signposting businesses to potentially valuable datasets, and helping them to interpret the ever-rising tidal wave of information. “Innovation doesn’t happen in a vacuum. You can’t just say: ‘If we build it, they will come’,” he explains. “You have to, in some cases, drag people to the feeding trough and say: ‘Look, here’s the information. We’ve worked out some ideas around this’.”
Many people already understand the potential value of the data their organisations hold, Starks believes, but they lack the time or tools to move forwards. “Typically, you’ll find a champion within a department, a corporate organisation or start-up,” he points out. “How do we unlock that potential energy? There’s a lot of energy in the system, a lot of enthusiasm. People know in most cases what they need to do, but they can’t get there.”
For the ODI, then, “a key message is to say: ‘This is happening. We’re here to help it happen and to catalyse it, and we’re here to support people who are very time- and resource-limited to understand the future and ensure that they don’t get left behind’.”
The organisation is well equipped for the task, Starks believes. “We’ve got great seed funding; an amazing space where we’re already holding events; the momentum we’ve got already is spectacular; our connections with the industry are exceptional,” he says. “Our positioning as a non-partisan, non-profit independent body that can enable and act as a neutral space for companies, start-ups and the public sector to come together is incredibly exciting.”
This former astronomer, musician, lecturer, media professional, green businessman and digital entrepreneur has seen a few things in his time, but Gavin Starks seems genuinely – if characteristically gently – excited about his new job.
“I’ve been running start-ups for nearly 15 years, and when I look at this area the components you need to make something scale and work are embodied in the mission of the Open Data Institute,” he says. “We tick all the boxes you could possibly want – and then create a new set of boxes, and tick them too.”
Are there any datasets held by other government departments or agencies that you’d like to get access to? The ODI is trying to identify data that should be published, and to find civil service organisations that want help with releasing their own data. Contact them at events@theodi.org