Should Citizen Generated Data be official? If not, does it make it any less essential?

For the past few years there has been a tension between official statistics on the one hand and citizen generated data on the other. The development community continues to grapple with the “official-ness” (and therefore usefulness) of citizen generated data in delivering on the Sustainable Development Goals, for example.

As you may be aware, the organisation for which I work, Open Institute, has a particular passion for finding ways in which citizens can use data to better participate in their governance. We have contended many times that official statistics – the sole preserve of National Statistical Offices (NSOs) – do not suffice as a means of achieving or even monitoring development goals. For many citizens, official statistics are often so far removed from their understanding that they might as well be in another language.

Most importantly, I share the view of Elizabeth Stuart, who was recently quoted on Devex as saying, “Governments do not adequately know their own people,” and I add, “because they only rely on official statistics.”

I am in Accra, Ghana, one of my favorite places in the world. This time, I am here for work – meeting data and policy wonks who especially care about how data can be used for Sustainable Development, at a peer learning workshop organised by two international organisations with almost incomprehensible names.

The first, Deutsche Gesellschaft für Internationale Zusammenarbeit, is more commonly known as GIZ, and is in fact the German Society for International Cooperation. The second is the Global Partnership for Sustainable Development Data (GPSDD), essentially a strong partnership of governments, civil society organisations and private sector companies who care about how the Sustainable Development Goals shall be achieved by the year 2030.

Today, the focus of the three-day workshop turned out to be the reason that I came to Ghana, after all. The participants, who represent Kenya’s and Ghana’s national statistics offices, civil society organisations from the two countries and private sector companies like Safaricom, turned their attention to thinking about how data from non-official sources could be made official or at least worthy of recognition and use by governments for planning and resource allocation.

The day started with me making a presentation about our citizen generated data work (I focused on the Lanet Umoja location case), which we have thoroughly documented on our learning website, datalocal.info. Once we were done, the team retreated into two groups to consider the work that we do, and the work that others like us do. The brief they were given was to figure out how data such as that collected in Lanet could be made official or acceptable to government.

This was key for me. The discussions brought to the fore some interesting considerations. First: is all data statistics, or should it be? I suppose it is important to establish the distinction that I am making, for purposes of this conversation.

There are official statistics, which are used by government for monitoring and tracking a country’s progress in a standardised, internationally comparable manner, the collection of which is necessarily strictly scientific. Then there’s the rest of data, which is collected by all other players for many different purposes – in our case for driving citizen participation and engagement with government.

The second consideration that came to the fore is the dichotomy between official statistics and administrative data, which, for purposes of this conversation, is the data that government needs in order to act (especially at subnational level). Must national statistics offices find use for – or even accept – this administrative data? Does it gain or lose credibility depending on whether it has their stamp of approval?

Well, the one thing that I did learn is that even for citizen generated data, the pursuit of accuracy is paramount, and non-traditional data producers must at least aim for as high a level of rigour as they can. Specifically, I sat with Samuel Kipruto, the Senior Manager for Data Processing at KNBS, and he explained to me that there are some key issues one must consider in order to be rigorous enough in collecting data for it to be acceptable to them.

  • Metadata and commonly agreed definitions. When we collected data in Lanet, our definition of household was the everyday English sense of the word – the people who live in that house at the time – whereas a more rigorous definition (by OECD in this case) is “A household is a small group of persons who share the same living accommodation, who pool some, or all, of their income and wealth and who consume certain types of goods and services collectively, mainly housing and food.” Such definitions, I learnt, are found in a compendium of definitions and concepts that one must develop or obtain from the national statistics offices (for national uniformity).
  • Documentation of the process and training of the data collectors. It is important, I learnt, that the data collectors fully understand the data collection concepts and definitions, and in addition are trained to minimise bias and to record only what respondents tell them, even if they know the respondents personally. “Imagine you are collecting your neighbour’s data,” explained Samuel. “You ask him how many cars he has and he says he has six. But you only ever see 2 cars in his house and have never seen the other four. You might be tempted in the data to write 2 and not 6, assuming that he is lying. But he might be having the other four parked elsewhere!”
  • Quality is dictated by the process. I learnt that having a robust quality assurance mechanism that is well documented goes a long way in assuring credibility of the data. It is also crucial that we have mechanisms for independent supervision and validation of the data so that we can assure others not present that the data has gone through a thoughtful enough process of collection.

On the side of the statistical agencies, I found that there are a number of nascent opportunities that we in civil society and the private sector could work with them on.

  1. Work together to make the statistical laws more inclusive, so that they define a statistical system that goes beyond just the NSO.
  2. Statistical offices could recognise that non-official sources offer them access to a whole new world of data that they do not have the power to collect themselves, and that can be accurate enough to analyse alongside their own work and to improve the quality of the eventual output. This would require them to be proactive in reaching out to non-state players and working with them.
  3. A practical deliverable that I identified in the conversation, and that I hope we can work on together with Kenya’s and Ghana’s NSOs, is the development of an online platform that has guidelines for anyone wanting to collect data that is acceptable to them. The platform would make the necessary principles, checklists and the compendium of agreed concepts and definitions freely and readily accessible to all. This way, all players have a chance to collect data that has the necessary rigour to be acceptable to all, especially the NSOs.
  4. There is an opportunity to change perceptions: non-official data does not necessarily mean non-essential data.

All in all, I think this was one of the most useful workshops that I have attended in quite a while and I am very energised.