When global health policymakers can’t pinpoint causes of death in developing countries, how can they make the best policy and spending decisions? Abraham Flaxman, an assistant professor at the Institute for Health Metrics and Evaluation at the University of Washington, is working to bridge the gap. He develops algorithms that fill in missing health data. The methods, once primarily used for research purposes, are scaling up. Flaxman’s techniques could soon become routine in nations throughout the world.
I spoke recently with Flaxman about garbage codes on death certificates, what Mongolia tells us about health in Central Asia and why this work keeps him up at night. Below are excerpts from our interview.
When it comes to global health, what don’t we know and why don’t we know it?
I did a post-doc at Microsoft Research, where I was working on theoretical computer science. I came to the Institute for Health Metrics and Evaluation at the University of Washington, which is where I learned what I know about global health. I came in thinking we’d be working to determine what people are doing and how that’s changing, but baseline things like basic descriptive epidemiology were topics of current research there. How many people have malaria and how many are dying from it? That’s right now a topic of scientific debate. Everyone agrees it’s more than there should be, but the numbers published by the World Health Organization and the numbers published by some of my colleagues are different enough that there’s still effort to resolve these differences.
And that’s happening within other fields of global health, as well?
Absolutely. In global health, we’ve recognized some of these big killers like HIV, TB and malaria. We’ve put a lot of effort into understanding their epidemiology. But there are many other maladies that haven’t received the same amount of attention, so there’s even less known.
Why do you care about this problem? What motivates you?
I came to it through the methods. My background is as a mathematician and computer scientist. At Microsoft, they had these wonderful data sets available. That was something I hadn’t had access to or opportunity to work with in my graduate studies. This I found tremendously interesting. Coming out of the post-doc work at Microsoft Research, I was saying to myself, ‘I’ve got to keep doing this. What big data is available that’s interesting, that’s going to be accessible and that’s going to be important?’ I ended up finding a great match with global health metrics. There is some data available, but the methods are important because of the noisy, sparse nature of the available data. It’s something that matters. People are spending huge amounts of resources every year in order to improve health. This information can help them make good choices.
How do your models and algorithms fill in missing information in health data sets?
We often describe it in terms of borrowing strength. To understand the way Hepatitis C has affected people in Central Asia, we had only information from a handful of studies in Mongolia. We had to look at those and, for all the other Central Asian countries, fill in the blanks somehow. Then we looked at Central Europe. Here we had studies from several countries, so we could see how much variation there was country-to-country. By combining the amount of variation in a different region with the patterns we saw in the limited data available in the first region, we tried to come up with a best estimate. There’s a huge amount of uncertainty because of the limited data, but for what’s available this was our best guess. This is the standard method we applied in our Global Burden of Disease Study. For every disease, injury and risk factor on a list of more than 300, we had to fill in the blanks one way or another.
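The idea of borrowing strength can be illustrated with a minimal sketch. It is not the Global Burden of Disease method itself, only a toy version under stated assumptions: the function name borrow_strength, the choice to work on a log scale, and every number below are invented for illustration. The center of the estimate comes from the few studies in the data-poor region, and the width of the uncertainty interval comes from the country-to-country spread seen in the data-rich region.

```python
import math
import statistics


def borrow_strength(sparse_region_obs, rich_region_country_obs, z=1.96):
    """Toy estimate for a country with no data of its own.

    sparse_region_obs: prevalence values from the few studies available
        in the data-poor region (for example, several studies from one country).
    rich_region_country_obs: one prevalence value per country in a
        data-rich region, used only to learn how much countries differ.
    Returns (estimate, lower, upper) as proportions.
    """
    # Center: average of the sparse observations, on the log scale.
    log_center = statistics.mean([math.log(p) for p in sparse_region_obs])
    # Spread: country-to-country variation borrowed from the rich region.
    log_sd = statistics.stdev([math.log(p) for p in rich_region_country_obs])

    estimate = math.exp(log_center)
    lower = math.exp(log_center - z * log_sd)
    upper = min(1.0, math.exp(log_center + z * log_sd))
    return estimate, lower, upper


# Invented numbers, for illustration only (not real prevalence data).
studies_in_one_country = [0.09, 0.11, 0.10]
country_values_elsewhere = [0.010, 0.018, 0.025, 0.014]

estimate, lower, upper = borrow_strength(studies_in_one_country,
                                         country_values_elsewhere)
print(f"best guess {estimate:.3f}, plausible range {lower:.3f} to {upper:.3f}")
```

The wide interval is the point: with so little local data, the honest answer is a best guess with a large range around it.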
Talk about your model that determines cause of death.
There’s more data collected about how many people are dying from a disease, but still there are a lot of gaps in that information. In the United States, if you want to know how many 55-year-olds are dying from heart disease, the Centers for Disease Control and Prevention has put together information from death certificates and can give you a very accurate count. You can look at how that’s changing over time and make a good estimate of how many lost years of life were caused. If you wanted to do the same thing in many other countries, you wouldn’t be able to do this. In Thailand, they have death certificates. But a lot of the causes put on the death certificates — at least before this work was pushing people to improve accuracy — were what my colleagues call garbage codes. They’re coded to causes of death that can’t actually be the underlying cause. In other parts of the world, there’s no vital registration system in the first place, so you can’t even check the quality of the cause of death certifications.
There’s been a 30-year effort to create an alternative approach called the verbal autopsy. In the verbal autopsy, a trained interviewer asks the family some questions about the signs and symptoms their loved one had before their death. Was your uncle coughing and for how long? Did he smoke? Did the doctor ever tell you he had a heart condition? The part I’ve been involved in researching is how you go from the result of that interview to a cause of death. The traditional approach for years has been to hire doctors who can read the results of the interview and determine the underlying cause. There are major problems with that. First, it’s expensive. You need to get doctors familiar with the diseases in the region you’re working on. You can’t just hire any doctor with spare time to look at the results. Second, it takes a long time. It’s hard to find the doctors who have the expertise you need. Then there’s a problem with accuracy. If you get two doctors to look at the same results, they might come up with different diagnoses. Because I have the tools of a computer scientist, I’ve been involved in developing automated methods for computers to go from the results of these interviews to underlying cause.
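To make "going from the interview to a cause" concrete, here is a minimal sketch using a naive Bayes classifier over yes/no answers. This is a generic textbook technique swapped in for illustration, not the algorithm Flaxman's group uses; the symptom names, cause labels and training records are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(records):
    """records: list of (answers, cause), where answers maps symptom -> bool."""
    cause_counts = Counter(cause for _, cause in records)
    yes_counts = defaultdict(Counter)  # cause -> symptom -> number of "yes"
    symptoms = set()
    for answers, cause in records:
        for symptom, present in answers.items():
            symptoms.add(symptom)
            if present:
                yes_counts[cause][symptom] += 1
    return cause_counts, yes_counts, sorted(symptoms)


def predict(model, answers):
    """Return the cause with the highest naive Bayes score for one interview."""
    cause_counts, yes_counts, symptoms = model
    total = sum(cause_counts.values())
    best_cause, best_score = None, -math.inf
    for cause, n in cause_counts.items():
        score = math.log(n / total)  # prior: how common the cause is
        for symptom in symptoms:
            # Laplace smoothing keeps probabilities away from 0 and 1.
            p_yes = (yes_counts[cause][symptom] + 1) / (n + 2)
            said_yes = answers.get(symptom, False)
            score += math.log(p_yes if said_yes else 1.0 - p_yes)
        if score > best_score:
            best_cause, best_score = cause, score
    return best_cause


# Hypothetical interviews with known causes (invented for illustration).
training = [
    ({"cough": True, "smoked": True, "heart_condition": False}, "lung disease"),
    ({"cough": True, "smoked": False, "heart_condition": False}, "lung disease"),
    ({"cough": False, "smoked": True, "heart_condition": True}, "heart disease"),
    ({"cough": False, "smoked": False, "heart_condition": True}, "heart disease"),
]
model = train(training)
new_interview = {"cough": True, "smoked": True, "heart_condition": False}
print(predict(model, new_interview))
```

Unlike a panel of physicians, the same interview always produces the same answer, and the cost of coding one more interview is close to zero.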
How does that work? And how do you know whether you’re accurate?
When I showed up at the Institute for Health Metrics and Evaluation, they already had underway this wonderful study for determining how accurate all these methods were. They made strict lists that said: ‘If someone shows up in the hospital and complains about chest pain and you run tests and see elevated levels of proteins in their blood and they die, they died of a heart attack.’ Then we know how they died, so we’ll conduct this verbal autopsy interview for the purposes of studying the methods. My colleagues collected thousands of these interviews where they’re sure of the underlying cause of death. Anything a person can dream up, we can apply on this data set and measure how well it does.
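Measuring "how well it does" against interviews with a known cause can be sketched as follows. This is only an illustration of the general idea of scoring a method against gold-standard labels; the toy rule, the symptom name chest_pain, the cause labels and the records are invented, and the real study's metrics may differ.

```python
from collections import Counter


def accuracy(predict, gold_records):
    """Fraction of gold-standard deaths where the method names the true cause."""
    correct = sum(1 for answers, true_cause in gold_records
                  if predict(answers) == true_cause)
    return correct / len(gold_records)


def per_cause_recall(predict, gold_records):
    """For each true cause, the share of its deaths the method gets right."""
    totals, hits = Counter(), Counter()
    for answers, true_cause in gold_records:
        totals[true_cause] += 1
        if predict(answers) == true_cause:
            hits[true_cause] += 1
    return {cause: hits[cause] / totals[cause] for cause in totals}


def toy_rule(answers):
    """Hypothetical stand-in for any method someone might dream up."""
    return "heart attack" if answers.get("chest_pain") else "other"


# Invented gold-standard records: (interview answers, confirmed cause).
gold = [
    ({"chest_pain": True}, "heart attack"),
    ({"chest_pain": True}, "other"),
    ({"chest_pain": False}, "other"),
    ({"chest_pain": False}, "other"),
]

print(accuracy(toy_rule, gold))
print(per_cause_recall(toy_rule, gold))
```

Because the scoring functions take any predict function, a physician's coding, a rule list or a machine learning model can all be compared on the same records.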
When I saw the data set they were collecting, I said, ‘That’s machine learning.’ We don’t know the best way to solve it. The data set itself is a great resource, but what’s going to be useful in understanding what people are dying from is the research into which algorithms work best.
How widely used are your tools now?
We’re in a time of scale up. These methods have been used primarily for research until now. Now that we can say, ‘Here’s how well it works,’ it’s time we can make a case. This should be part of the system. Since getting to the point where these discussions can get started, it’s been pointed out that even where death certificates are pretty much perfect, like in the United States, there already are elements of this in place. It’s not routine for every death, but in special cases. If a woman dies in childbirth, they do one of these structured interviews specialized for that situation. They’re asking a set of standard questions to find out what went wrong.
This isn’t some kind of second-best technology. We’re not cutting corners because low- or middle-income countries can’t afford it. This is something everyone can benefit from. The state this is in now is getting the research out the door. There are already developments on the application side to make this easy when people want to do it and there are discussions about how people can use it in routine settings.
As you continue to work on this problem, what keeps you up at night?
The thing that makes it interesting and exciting is also what keeps me up at night. These numbers matter. They represent some real human tragedy a lot of times and they’re going to influence policies. People will make decisions based on them. It’s important they be right. And it’s important they be communicated well. Sometimes being right means saying, ‘I don’t know.’
What’s next for you and this work?
There’s so much exciting stuff. Now that we’ve completed the Global Burden of Disease, which goes region-by-region, there’s a lot of demand to build out to even more specific geographic areas. I’ve already seen this in some of my colleagues’ work. They might say, ‘Here’s what life expectancy looks like county-by-county in the United States.’ This is the kind of thing that planners at the county level can use to determine changes they want to make.
Photo: Abraham Flaxman