Open Source Data Analytics as a Digital Public Good
Unicef experts need analytics and visualization tools to create a snapshot of current needs and responses in the countries where they work. Asha Pun, for instance, is a specialist in the Maternal and Newborn Health program at the Sierra Leone country office. She needs generic datasets such as population, as well as the locations of critical resources, to develop equity-focused planning and to support her advocacy and policy advice. She also needs a way to upload her own data* and to create and share narratives.
In this short article, I’ll describe a solution for Asha and other experts throughout Unicef that is composed of several open source software tools, point out bottlenecks, and suggest how to work around them. I’ll also suggest how public entities can offer the solution as a digital public good. Finally, I’ll end with a live use case from the Unicef Sierra Leone country office.
Open Source Tools
The two open source business analytics tools I’ve worked with are:
Apache Superset (originally developed at Airbnb):
- Pro: fairly easy to set up, and it allows CSV upload and chart / dashboard sharing
- Con: charts can be difficult to create
Metabase:
- Pros: very easy to deploy; charts are easy to create and share
The datasets
Procuring the datasets is relatively straightforward, though it can be time-consuming. Some useful datasets include:
- High Resolution Population Density Maps + Demographic Estimates
- Water point data
- Vaccination coverage: two sources are the Spatial Data Repository and WorldPop.
- Health clinics
- Schools
- MICS
Services requested as digital public goods
In addition to global organizations like Unicef, these tools and datasets could be useful to policy-focused government agencies in developing countries. In a previous article, I described a hackathon designed for people with limited tech experience and resources, who might use this stack to learn while working on problems around improving society and the environment.
Here are three services that public entities might provide as a digital public good:
- Create a pipeline to ingest and enrich each of the public datasets listed above
- Create a docker-compose or Kubernetes file to spin up an instance of Superset, Metabase, and Postgres, seeded with datasets specific to a country and a program (e.g., nutrition, health, WASH, HIV)
- Offer cloud resources to host instances of analytics stacks to be used by small teams of digital do-gooders
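To make the first service more concrete, here is a minimal sketch of an ingest-and-enrich pipeline in Python. It is only an illustration under stated assumptions: the sample rows, the column names, and the `health_facilities` table are hypothetical, and an in-memory SQLite database stands in for the Postgres instance in the stack.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; a real pipeline would download a public dataset.
SAMPLE_CSV = """district,facility_name,lat,lon
Bo,Clinic A,7.96,-11.74
 bo ,Clinic B,7.95,-11.73
Kenema,Clinic C,7.88,-11.19
"""


def ingest(csv_text):
    """Parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def enrich(rows):
    """Normalize district names and cast coordinates to floats."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "district": row["district"].strip().title(),
            "facility_name": row["facility_name"].strip(),
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
        })
    return cleaned


def load(rows, conn):
    """Write rows to a table that an analytics tool could then query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS health_facilities "
        "(district TEXT, facility_name TEXT, lat REAL, lon REAL)"
    )
    conn.executemany(
        "INSERT INTO health_facilities "
        "VALUES (:district, :facility_name, :lat, :lon)",
        rows,
    )
    conn.commit()


# SQLite in memory is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
load(enrich(ingest(SAMPLE_CSV)), conn)
counts = conn.execute(
    "SELECT district, COUNT(*) FROM health_facilities "
    "GROUP BY district ORDER BY district"
).fetchall()
print(counts)
```

Keeping ingest, enrich, and load as separate steps means each public dataset can reuse the same loading code with its own cleaning rules.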
Here is a diagram of the stack:
The use case
During the past week, the Unicef Health team in the Sierra Leone country office has been preparing a presentation on community health workers. The objective, in the words of Health Manager Yuki Suehiro, is:
“…to advocate with key stakeholders, including the government agencies, for the need to reconsider the primary health care strategy from equity and efficiency perspectives — by looking at distribution of health facilities and community health workers vs. distribution of population.”
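The comparison described in that objective can be sketched in a few lines of Python. This is not the team's actual analysis: the district names and figures below are placeholders, and `coverage_per_10k` is a hypothetical helper that computes resources (health facilities or community health workers) per 10,000 people.

```python
def coverage_per_10k(population, resources):
    """Return (district, resources per 10,000 people), least served first."""
    ratios = []
    for district, people in population.items():
        # Districts with no recorded resources count as zero coverage.
        count = resources.get(district, 0)
        ratios.append((district, count / people * 10_000))
    return sorted(ratios, key=lambda pair: pair[1])


# Placeholder figures for illustration only, not real Sierra Leone data.
population = {"District A": 200_000, "District B": 50_000}
facilities = {"District A": 20, "District B": 10}

ranking = coverage_per_10k(population, facilities)
for district, ratio in ranking:
    print(f"{district}: {ratio:.1f} facilities per 10,000 people")
```

Sorting from lowest to highest ratio puts the least served districts at the top, which is the kind of equity view the presentation argues for.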
Here is the dashboard home page that features three basic datasets we prepared for them.
Finally, below are diagrams created using Metabase and Kepler.gl, which were included in a presentation titled “A Deep Dive into Primary Health Care Service” by Health Manager Yuki Suehiro and Health Specialist Hailemariam Legesse:
Footnote:
* Data ingested from Special Baby Care Units by Dr. Pun will be integrated with Unicef’s DHIS platform sometime next year.