
I’ve started working on a project to aggregate and visualise data relating to Ireland from 1980-2010. Finding this data has been difficult. Putting the data into formats I can use efficiently is going to be painful. It turns out, unsurprisingly, that most data I was trying to find about Irish public life in the last 30 years either isn’t available, or isn’t available in formats that make it easy to analyse. For example:
Private media companies are obviously under no obligation to share their data in accessible formats, but it would be nice. The Irish Times and Irish Independent’s archives are pretty woeful; light-years behind what is being done by The New York Times, the Guardian, NPR, Current, BBC and other media organisations.
This might sound like an engineering problem, but I think it’s much more fundamental than that. When the world was run by slower, linear, paper-based decision making processes it was perfectly reasonable to create and maintain records in silos, send them around at a leisurely pace, and then make them available when and if they were needed - or not - because who do you think you are?
We don’t live in this world, and you could argue that we haven’t lived there for a long time - and it’s pretty clear that the way the Irish government, public bodies and media organisations share information hasn’t kept pace with reality. Having a website is seen as doing enough.
Good democracy requires that we are able to hold government, and those who exercise power, to account. The friction involved in getting the most basic information from Irish public institutions acts against informed democracy. We assume that getting insight into how the government spends money is impossible, so we don’t try. It’s a feedback loop that encourages secrecy, lets bad people away with doing bad things, keeps good people from getting the credit they deserve - and promotes a lack of understanding of how difficult it actually is to run a country.
We are living through a bruising period in Irish history - and the next government doesn’t just need to take painful economic decisions, it also needs to begin to restore trust in the idea that government works in all our interests. We have an information problem.

Soon after its own decline into economic chaos, Iceland (above - doesn’t it look lovely?) realised it also had an information problem - there were just too many places for bad decisions to hide, so their parliament passed strong press freedom laws that would enable difficult questions to be asked in the future. It’s embarrassing that in Ireland we are still bringing journalists to court to force them to reveal their sources. (I am not saying the economic meltdown could have been avoided if Irish financial journalists had tools to get at more information - they were asleep at the wheel - but it would be a start, and would give them no excuses.)
The US government required that raw data on fiscal stimulus spending be made available in both human and machine-readable formats. They also created Data.gov - a central source for federal data. We have PDFs.
Both the Icelandic and US governments recognised that a free exchange of timely information has a role in avoiding crises in the future, and is required for real democracy. What do we have to do to get on board with, you know, the present?
One of the first things the next government should do is to require all government departments to make all substantive data collected or created in the course of their work available in machine-readable file formats, and also through a freely available API. This information should be gathered together in one place where people can access it on their own terms, and should be extended out to all bodies who receive public money over a certain threshold within a short period of time. If citizens are paying for the clock, citizens should be able to see the clockwork.
The government should also prove that it is a supporter of active inquiry into how it works - and should put in place strong protection for whistle-blowers, emulate Iceland’s new press freedom laws, and remove the fee for freedom of information requests. Make it clear that Ireland has changed, changed utterly.
First of all don’t make this a project. This isn’t a project - a temporary get-together to create instantly obsolete artifacts - this is a set of new understandings. The wrong way to do this would be to hire a massive, international consultancy firm to carry out an epic, exhausting, expensive project - whereby they would first identify every piece of data that bodies have, engineer new processes to make all that data easier to collect, procure servers, put teams of programmers to work etc. etc. We’ve tried that, and it sucks. Really. Please don’t do this.
First, just agree - in one sentence - what expectations Irish people should have of their government in relation to information; a sort of Data Elevator Pitch. Match the need for information in an active democracy with a set of principles that should apply across Government.
Next, carry out a one week consultation to identify the 10 points of data that would be useful for other parts of government, experts (researchers, media), and members of the public. Shocking stuff I know - ask people what they want.
For each of these data points define how it’s created, by whom and how often. If it isn’t currently created, or isn’t available, flag this publicly - and move on to the next data point. This isn’t a project remember, so you can think longer term - so long as you actually come back and get that data point very, very soon.
Now, just get the bloody data and stick it in a database; none of this million Euro consultancy shite - cheap or free tools on bulletproof servers. Take everything currently in the CSO’s databases and make sense of that too. Write a simple API that returns a particular data point for a particular time period - get to the release of a minimal viable product within 4 weeks of the end of your consultation period. Yes! 4 weeks! Holy real-world-timeframes!
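To make "a simple API that returns a particular data point for a particular time period" a bit more concrete, here is a minimal sketch of what the core of it might look like. It assumes SQLite as the cheap/free database; the `data_points` table, the `get_data_point` function, the indicator names and the numbers are all hypothetical placeholders invented for illustration, not real statistics or an existing government schema.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory database holding made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_points ("
        "indicator TEXT, year INTEGER, value REAL, "
        "PRIMARY KEY (indicator, year))"
    )
    # Placeholder values for illustration only; these are not real figures.
    rows = [
        ("example_indicator", 1980, 1.0),
        ("example_indicator", 1981, 2.0),
        ("example_indicator", 1982, 3.0),
        ("other_indicator", 1981, 10.0),
    ]
    conn.executemany("INSERT INTO data_points VALUES (?, ?, ?)", rows)
    return conn


def get_data_point(conn, indicator, start_year, end_year):
    """Return one indicator's values for an inclusive year range, as JSON."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    cursor = conn.execute(
        "SELECT year, value FROM data_points "
        "WHERE indicator = ? AND year BETWEEN ? AND ? ORDER BY year",
        (indicator, start_year, end_year),
    )
    values = [{"year": year, "value": value} for year, value in cursor]
    return json.dumps({"indicator": indicator, "values": values})


if __name__ == "__main__":
    conn = build_demo_db()
    print(get_data_point(conn, "example_indicator", 1980, 1981))
```

Putting this behind a URL is then a thin layer on top: one endpoint that takes an indicator name and two years and hands back the JSON. An indicator that isn't in the database simply returns an empty list, which is also a cheap way of flagging publicly that the data point doesn't exist yet.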
Gather feedback from the people who use these tools constantly - make your users the most important people in your world - and create a roadmap for 1, 3, 6 months. Release improvements every two weeks and keep moving your roadmaps forward.
This doesn’t have to be like pulling teeth; it is the type of project that could use one or two helpful (non-gold-digging) advisors, and a merry band of 3-5 folks (youngsters) in each department. They just have to be very good with technology, very good with people, humble enough to think small, and have - and I’m not joking here - 100% support from internal masters and the government. Just let them do it, keep the shit out of their way.
If you create this new reality, build the tools, give us the information we need in a timely fashion, don’t charge us for the “privilege” and make the project look straightforward you will have succeeded in doing what no other government has ever done. If you make it easy to get at data, and we see how limited your choices are, we will understand when you make sensible decisions which turn out not to bear fruit. At least you will have made it easy for us to see the context in which you’re operating.
And if you don’t take this opportunity; well, we will continue to have little useful insight into how decisions are made. When things don’t turn out for the best we will assume you are secretive and/or incompetent and we will elect the other people. Soon the time to act will have passed, the inertia will overcome you, and we citizens will put our hopes away for whomever is next - because, of course, this is all Ireland could ever have been.
Blah blah blah, just do this;
The rest of us should try to copy these people:

Paul,
Thanks for posting this. I’ve been talking to a number of people about this issue over the last few months. For all the talk of Smart Economy, it’s astounding that there hasn’t been a government-led Open Data initiative.
I think this is something that can happen quickly if given a bit of a push from the web / tech community.
There are already people giving up their spare time to make requests under Freedom of Information to various government departments—Gavin Sheridan for one, http://thestory.ie/about/. For the most part, the data that is eventually returned needs considerable cleaning up, never mind putting it into a format that is usable such as an API.
My company, echolibre, has created an Open Source API Framework called FRAPI - http://getfrapi.com/ - which would be ideal for quickly making large government datasets available in a usable, documented manner.
In our spare time we’ve looked at how it could be used with government data. Just this past week we have prototyped taking a large CSV and layering an API over it.
I’d like to continue this conversation in the real world, with anyone who has the connections or influence to get buy-in from someone in government or the public service.
Again, great post and I know there are many people in the Irish tech community who want to do something along these lines.
And I spend a great deal of time converting PDFs to spreadsheets, too :-)
Thanks for the feedback; it’s good to see others are thinking along the same lines, and I don’t think there could be a better time to raise this issue. If we’re serious about not ending up here again then things need to change.
In the short term, it’s highly likely that I’m going to need to build some sort of database of statistics wrapped in an API to power the visualisation project that I’m working on. Eamonn - I will definitely check out FRAPI.
Cool stuff. What sort of visualisation were you thinking of? Drop me a mail?
Hi Gavin - project will juxtapose social, economic, environmental and personal data over time from 1980 to 2010. I want to use data from Irish datasets, and also non-Irish data from EU/International bodies, foreign media agencies. I’m not going to be thinking too much about the design/format of the visuals until I have narrowed down the scope of the data I want to use/can get at; which is obviously taking some time. I will keep you posted - maybe you can point me in the right direction on some data - I will definitely mail you.
I really enjoyed your article Paul, clear and to the point.
The insight and value that big/open data provides would quickly give a ROI that should make it a no-brainer. However the transparency and accountability that comes with the territory is probably the reason for any reluctance in adoption of same, along with the perceived costs/labour overhead.
The US Data.gov initiative is a fantastic example of what we should be doing ourselves, machine readable data which is easily manipulated and used to create meaning; not confusing PDFs designed to frustrate.
Data doesn’t lie, it clarifies, but it also removes layers of obfuscation which all might not be entirely comfortable with.
Dan Pink recently highlighted the fantastic idea of ‘A Taxpayers Receipt’, this could be easily implemented if access were provided to structured/usable data.
http://www.danpink.com/archives/2010/10/idea-of-the-day-a-taxpayer-receipt
Imagine this type of information being available in a National Dashboard, marry open data with the visual simplicity/beauty of a Feltron or McCandless project and Ireland would be onto a winner. What went wrong and why it went wrong would be a lot easier to discern, we could learn from mistakes and more importantly avoid making similar ones in the future…
Data visualisation & infographics are becoming a crucial part of journalism/national storytelling, as nothing can tell a story better than a picture - case in point being the groundbreaking work being done by NYT’s Interactive Newsroom team.
Sounds like a great project, looking forward with great interest to the fruits of your labour.
Darren
Cool! :-) I already have some stuff like Central Bank PSC stats 2000-2010 done… and have just got my hands on some substantial datasets via FOI. More are on the way.
Great stuff Gavin - I will get back to you.
Great post Paul and really buy into the simple approach to solving a problem I face daily as a commercial researcher. Data can only clarify matters and if it is social or economic, it can only improve society or make things transparent and people accountable.
If I can help, shout. In the meantime, best of luck with the project
I agree a thousand percent. And completely disagree.
There’s exactly one way to ensure we get any kind of government data API that is within our control and definitely won’t result in anybody’s time being completely wasted: build it ourselves.
It’ll suck, compared to the ‘real thing’ that should replace it. But it will enable some actual uses in Ireland in new apps and tools that don’t currently exist, and that’s the kind of thing that generates real traction. And I’m getting pretty good at mechanically-stripping PDFs apart these days. In fact there’s one application of crowbar-ing data out into a new service I can think of where non-PDF data is actually less preferable. (But, granted, only one.)
I’m a little biased - I’m using the only structured government data that I think is produced by any part of the government for KildareStreet. And I’m here to tell you that data sucks out loud. It’s riddled with errors every single day. We can and should do better than asking them to do something right and then trust them not to screw it up.
I think the main point is: Don’t tell, show. I’m in if you are. :)
I would tend to agree with John Handelaar above. Go ahead and scrape/build. Pick one thing, and do it. Relying on the powers-that-be to do the job quickly or properly is probably just inviting disappointment. In Northern Ireland, OpenDataNI was announced to great fanfare last year, but has been left to slowly die ever since the initial media buzz faded, with emails and other messages just ignored.
All of the difficulties in this type of project returned to mind last night after reading Agitprop Àgogo which covers some of the issues nicely. It also helped me clarify my thoughts on a few Northern Ireland data issues I was looking at.
Just start building things. Clearly a number of people are interested and some will get involved.
“We are living through a bruising period in Irish history - and the next government doesn’t just need to take painful economic decisions, it also needs to begin to restore trust in the idea that government works in all our interests. We have an information problem.” This line sums it up for me. Everyone should copy and paste this to their local TD and other public representatives and get a commitment to honour this after the next election. Best of luck with your campaign. Brendan
I get what you mean though, there’s nothing worse than pdf’d excel tables spread across two pages with text entries that break rows, explain that a bit more. (really really long narrow blog posts are nearly as bad :P)
you’re copying Carl Malamud and his campaign to be a modern national xml printer http://public.resource.org/ http://www.youtube.com/watch?v=9E43_fdhu-o
sounds like a grand proposal, if you have the technical ability to do some of it, go ahead, I don’t think even optimistically it’ll be as easy as you describe, or that you can expect governments to act like agile startup companies, they won’t, they can’t, we’ll have to do most of it ourselves.
the government getting a request for data that’s in digital format and then printing it out and posting it isn’t by mistake, it’s deliberate petty hampering of the flow of info, i watched a talk by the information commission officer, who had a stink of ‘it’s not a problem and it’s not my job’ off him. they don’t see the problem.
to engage people and to effect change it might be good to think locally, http://countculture.wordpress.com/ who created http://openlylocal.com/ has been very good at scraping websites and also encouraging them to put rdf tags on their info, and campaigning to avoid costly consultants getting in between us and our data. again great, although few ppl could make openlylocal.
the sunlight foundation has a campaign, that’s http://publicequalsonline.com/, and i would add in libraries too
the CSO release more usable stats recently http://www.irisheconomy.ie/index.php/2010/10/08/the-csos-detailed-public-finance-data/
impressed to see our universities including TCD open up their research archives to public reading, that’s smart, let it not sit unused on their shelves, I hadn’t realised they’d got that far with it
http://www.rian.ie
and TASC has written on An Economic Argument for Stronger Freedom of Information Laws in Ireland http://www.tascnet.ie/showPage.php?ID=3143 or none at all
to add to the point about transparency being good for the government and its employees, when the government aircraft travel data was released, did they release the reason for each trip alongside, which would give context to the thousands of Euros spent? no. and when they released the details of the St Patrick’s trips around the world by ministers it was only one line of info _given to the press_, I asked for it and was ignored. government still confuse being public with giving the info to the press, although maybe http://merrionstreet.ie might help with things like that and make it easier to find than on the department and state body websites
the lack of commons and detailed geographical data is a reason we can’t copy a lot of what the US and UK are doing though. need to battle the OSI as well, postcodes will help if we can access their api.
see my blog for my amateur exhibits, with thestory’s and ken foxe’s FOIs, etc, there are a number of pioneers, need more sharing of what ppl are doing
we have to do it ourselves but also actively challenge the gov
I would be willing to assist. Probably where I could help would be speaking/writing to Government departments.
Have you seen http://data.worldbank.org/ ?
This is a problem in a lot more places than Ireland… Would it not be possible to create a system that made it easy for anyone, anywhere who had access to public/freedom of information data to publish it to a database that could be accessed by anyone else via an API or visualisations? Kind of like a wikipedia for government data worldwide?
How do we centralise this conversation? Mailing list? CC list?
publicly accessible mailing list, start as you mean to go on
Eamon Leonard has created a Google Group to start a conversation about this: http://groups.google.com/group/open-data-ireland and I’m happy to help out with a community initiative in whatever way I can. My own focus in the short term will be to build a dataset to power my own project, but the wider conversation is definitely really cool to see. You guys are all great, it’s cool to see this is something people have been thinking about.
I think we really need some kind of legal backing for open data to really prosper. Does EFF Ireland still exist? We really need a lobby group to try push it through. I think the Greens could be suggestible on this issue.
I think you might be interested in the recently launched AIRO initiative from NUI Maynooth. I think it has some of the framework and data that you may find interesting.