Open Source Data Analytics as a Digital Public Good

Mike Fabrikant
Oct 16, 2019

Unicef experts require analytics and visualization tools to create a snapshot of current needs and responses in the countries where they work. Asha Pun, for instance, works at the Sierra Leone country office and is a specialist in the Maternal and Newborn Health program. She needs generic datasets such as population, as well as the locations of critical resources, to develop equity-focused planning and to support her advocacy and policy advice. She also needs a way to upload her own data* and to create and share narratives.

In this short article, I’ll describe a solution for Asha and other experts throughout Unicef, composed of several open source software tools, point out bottlenecks, and suggest how to work around them. I’ll also suggest how public entities can create the solution as a digital public good. Finally, I’ll end with a live use case from the Unicef Sierra Leone country office.

Open Source Tools

The two open source business analytics tools I’ve worked with are:

Apache Superset (originally developed at Airbnb):

  • Pros: fairly easy to set up, and allows CSV upload and chart / dashboard sharing.
  • Cons: charts can be difficult to create.

Metabase:

  • Pros: super easy to deploy, charts are easy to create and share.

The datasets

Procuring the datasets is relatively straightforward, though it can be time-consuming. Some useful datasets include:

Services requested as digital public goods

In addition to global organizations like Unicef, these tools and datasets could be useful to government agencies in developing countries that focus on policy. In a previous article, I described a hackathon designed for people with limited tech experience and resources, who might use this stack to learn while working on problems around improving society and the environment.

Here are three services that public entities might provide as a digital public good:

  • Create a pipeline to ingest and enrich each of the public datasets listed above.
  • Create a docker-compose or Kubernetes file to spin up an instance of Superset, Metabase, and Postgres, seeded with datasets specific to a country and a program (e.g. nutrition, health, WASH, HIV, etc.).
  • Offer cloud resources to host instances of analytics stacks to be used by small teams of digital do-gooders.
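To make the first service more concrete, here is a minimal sketch of an ingest-and-enrich step in Python. Everything specific in it is an assumption made for illustration: the column names (`district`, `population`, `health_facilities`), the made-up sample rows, the derived `people_per_facility` column, and the table name `district_health`. It also loads into an in-memory SQLite database as a stand-in for the Postgres instance in the stack, so that it runs with the standard library alone.

```python
import csv
import io
import sqlite3

# Made-up sample rows for illustration only; a real pipeline would
# download one of the public datasets instead.
SAMPLE_CSV = """district,population,health_facilities
District A,120000,12
District B,80000,4
District C,50000,0
"""


def ingest(csv_text):
    """Parse CSV text into dicts with typed numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "district": row["district"].strip(),
            "population": int(row["population"]),
            "health_facilities": int(row["health_facilities"]),
        })
    return rows


def enrich(rows):
    """Add a derived people-per-facility value (None when there are no facilities)."""
    enriched = []
    for row in rows:
        facilities = row["health_facilities"]
        ratio = row["population"] / facilities if facilities else None
        enriched.append({**row, "people_per_facility": ratio})
    return enriched


def load(rows, conn):
    """Write enriched rows to a table that a BI tool could then query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS district_health ("
        "district TEXT PRIMARY KEY, population INTEGER, "
        "health_facilities INTEGER, people_per_facility REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO district_health VALUES "
        "(:district, :population, :health_facilities, :people_per_facility)",
        rows,
    )
    conn.commit()


def run_pipeline(csv_text, conn):
    """Ingest, enrich, and load in one call; returns the enriched rows."""
    rows = enrich(ingest(csv_text))
    load(rows, conn)
    return rows


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for Postgres
    for record in run_pipeline(SAMPLE_CSV, connection):
        print(record)
```

Keeping ingest, enrich, and load as separate functions means the same enrichment could be reused for each country or program, with only the source file and the target database changing.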

Here is a diagram of the stack:

A diagram of the stack, with kepler.gl thrown in.

The use case

During the past week, the Unicef Health team in the Sierra Leone country office has been preparing a presentation on community health workers. The objective, in the words of Health Manager Yuki Suehiro, is:

“…to advocate with key stakeholders, including the government agencies, for the need to reconsider the primary health care strategy from equity and efficiency perspectives — by looking at distribution of health facilities and community health workers vs. distribution of population.”

Here is the dashboard home page that features three basic datasets we prepared for them.

The basic datasets

Finally, below are diagrams created using Metabase and kepler.gl, which were included in a presentation titled “A Deep Dive into Primary Health Care Service” by Health Manager Yuki Suehiro and Health Specialist Hailemariam Legesse:

Male and female community health workers who are over 5km from the nearest health facility. The size of each dot represents the number of people in that person’s coverage area.
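For readers who want to reproduce this kind of filter outside a BI tool, here is a rough Python sketch of the "over 5km from the nearest health facility" check. It is not how the team produced the map, which was built with Metabase and kepler.gl. The function names, the `lat` / `lon` record fields, and the use of straight-line (haversine) distance rather than travel distance are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def workers_beyond(workers, facilities, threshold_km=5.0):
    """Return workers whose nearest facility is farther than threshold_km.

    Both arguments are lists of dicts with hypothetical "lat" and "lon"
    keys; `facilities` is assumed to be non-empty.
    """
    remote = []
    for worker in workers:
        nearest = min(
            haversine_km(worker["lat"], worker["lon"], f["lat"], f["lon"])
            for f in facilities
        )
        if nearest > threshold_km:
            remote.append({**worker, "nearest_facility_km": nearest})
    return remote
```

This brute-force comparison checks every worker against every facility, which is fine for a few thousand points; a larger dataset would likely call for a spatial index or a PostGIS query instead.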

Footnote:

* Data ingested from Special Baby Care Units by Dr. Pun will be integrated with Unicef’s DHIS platform sometime next year.
