Last June, Jake Porway typed up a blog post that he expected only friends and fellow data scientists to read. Why not, he asked, find a way for data scientists to partner with social organizations, which are overrun with fascinating data, to help them solve societal problems?
The idea exploded, drawing attention from the White House to the United Nations. And so Data Without Borders was born, bringing together data scientists with social groups -- and even governments -- hoping to discover the details within their data.
Last week I spoke with Porway, who by day is a data scientist in the New York Times Research & Development Lab. Below are excerpts from our interview.
You're a data scientist by trade. Did the idea for Data Without Borders come from being frustrated that there was so much data out there you couldn't reach?
It was related. Without the benefit of being a data scientist, I'm not sure I would have put all the pieces together. It was less about the frustration of not having the data and more about the frustration of not feeling I had a way to use my talents to help the world. When I became a data scientist, I felt I had skills to offer the world that weren't being satisfied. That was in conjunction with data science blowing up as an important field. There's such power in the skills data scientists have for changing the world. There was so much lost potential in not being able to hook that up to social causes.
Talk about the disconnect between groups that have data -- like nonprofits and social change organizations -- and the data scientists who could tell them what those numbers mean. Why aren't they connecting without the help of groups like yours?
Those [data] skills are obviously marketable. [Data scientists] are immediately snapped up by companies that recognize the power of that and have resources for it. Google and Amazon are built on good data. Businesses siphon off a lot of talent. It's where there's money to be made and where a lot of exciting problems live.
Traditionally, the social sector has not had as much capacity for technology or data. There are fewer groups that have an understanding that it's important, much less have the infrastructure to support projects and resources to compete with someone like Google. The tech community got on board early because it's their business. The social sector didn't. But things are changing. We're at the beginning of a data revolution. Data is going to touch everyone and everything. Thanks to mobile phones, web platforms and more open data, there's data out there for everyone. Now more than ever, the social sector has an opportunity to use data. They just don't have the experience and infrastructure of the tech community.
Why is it important that someone look at data from social change organizations?
Let me start from the business side. Businesses are just beginning to realize that data allows you to be nimble and drive your business. You get instant feedback about how your programs are working. You get instant feedback about changes in the market. It gives you ideas about how you can change. Companies that can make use of that kind of data are the ones that can succeed.
The social sector has exactly the same opportunities. Examples are everywhere, from their back office workings to their projects. There's a big question in the social sector of whether you're having an impact. It's great that you built these schools, but did that improve education? How do you measure that? It's a much easier question to answer if you can make use of the data.
Can you give me an example of a Data Without Borders project?
The Grameen Foundation has a program for community knowledge workers. These workers are like local Google. They work with subsistence farmers in Africa who have no access to technology. They have no way of knowing what the weather is going to be like tomorrow. They don't know the price for beans in the village unless they go there. Grameen employs knowledge workers with mobile phones to visit farmers. They answer questions for them. They text the question in, get the answer and bring this knowledge to farmers. Grameen wanted to know: How is our program working? By using a cell phone program, they have the coolest data set ever. They have data about every request that went out, the location of where it happened, what was asked, what came back.
We worked with them on one of our weekend events. They were able to get an overhead view of how many farmers were being served in each region and how that changed over time. The other big question they had was: Which of our knowledge workers are good? When we looked at the data, we saw that some people logged a lot of questions, but only talked to one farmer. Other people didn't log as many questions, but talked to a much broader array of farmers. That got them thinking about how to measure this. By doing this program with us, they realized they needed to change their impact measures. They also changed an intervention in one of their programs because the data didn't support that it was helping. After working with us, they decided to set aside resources for a data science team.
We don't want to just solve someone's problem. We want them to think about data fundamentally differently. That was rewarding for us. We want every social organization to have the same data capacity as Google. I want it to be crazy for them to propose a project and not propose how they'll think about data in the same way it'd be crazy if they didn't propose a budget.
How does Data Without Borders work? Do the data scientists you use volunteer their time?
We have a number of levels of engagement. The first are weekend events that we call Data Dives. Those are purely volunteer. Social organizations bring specific data problems. We let the data scientists choose which one to work with for the weekend. By the end of the weekend, we try to focus on real results.
Above that is our Data Corps. It connects data scientists for one- to six-month engagements with organizations in their free time. These are volunteer or contract positions. We do have paid positions. I'd like to move to a point in the organization where all the positions from Data Corps and up pay something, even if it's nominal. It puts skin in the game. We don't want to just solve small problems. We want these organizations to take these lessons and grow from them.
We're also planning a fellowship program that would allow people to be embedded as data scientists for about a year. They'd be free to do whatever projects come up. And we have on-staff data scientists who we reserve for longer-term consulting or data projects that need strong capacity. Currently, that's just myself and our executive director. Ideally, I'd love to offer market-competitive jobs to data scientists in the social sector.
When did Data Without Borders get started?
It was a blog post I wrote last June. It went viral. I put this up for my friends and before I knew it the Guardian and the White House and the UN were calling.
We spent the fall piloting the weekend events. It worked better than we imagined. A group who worked with the UN showed their work at the UN General Assembly. We've seen governments getting involved. Now, we're ramping up with the Data Corps and the fellowship program to create long-lasting change.
What's next for the organization?
We're excited about bringing more groups into the fold, the government in particular. After we saw the data scientists and the social organizations working together, it dawned on me that a voice was missing. The social sector could recognize a problem, but they didn't necessarily know whether data would be useful. Given the data, data scientists know what to do. But they might not have access to all the data. The group that does is the government.
The government wants people to use their data, but no one's using it. The reason is that open data is like crude oil. It has potential, but if you don't know how to refine it, it's worthless. If you get these groups together, the social sector can identify goals the government shares, such as reducing infant mortality. Data scientists can consult. And the government has the data, but they don't know what makes it usable or how to use it. I'm excited about this type of collaboration. It bridges these communities, so they can make transformative changes in society.
Photo: Porway delivering his PopTech speech