This blog post was originally featured on the World Bank Group’s Data Blog.
“Every company is a technology company”. This idea, popularized by Gartner, can be seen unfolding in every sector of the economy as firms and governments adopt increasingly sophisticated technologies to achieve their goals. The development sector is no exception, and like others, we’re learning a lot about what it takes to apply new technologies to our work at scale.
Last week we published a blog about our experience in using Machine Learning (ML) to reduce the cost of survey data collection. This exercise highlighted some challenges that teams working on innovative projects might face in bringing their innovative ideas to useful implementations. In this post, we argue that:
- Disruptive technologies can make things look easy. The cost of experimentation, especially in the software domain, is often low. But quickly developed prototypes belie the complexity of creating robust systems that work at scale. Getting a prototype into production takes far more investment than you’d think.
- Organizations should monitor and invest in many proofs of concept because they can relatively inexpensively learn about their potential, quickly kill the ones that aren’t going anywhere, and identify the narrower pool of promising approaches to continue monitoring and investing resources in.
- But organizations should also recognize that the skills needed to make a proof of concept are very different to the skills needed to scale an idea to production. Without a structure or environment to support promising initiatives, even the best projects will die. And without an appetite for long-term investment, applications of disruptive technologies in international development will not reach any meaningful level of scale or usefulness.
Making it look easy: identifying someone’s age and gender from a picture
Take, for example, our prototype of age and gender recognition based on Microsoft Azure Cognitive Services. We want to use pictures of survey respondents to validate their answers about age and gender. This capability is useful when conducting surveys in low- and middle-income countries, where errors in these variables are common even though the information is critical for further analysis, and where people often don’t have an accurate estimate of their own age. An additional data point like this can help improve the quality of the data collected.
Our Proof of Concept (PoC) seemed to work to a certain degree. The algorithm correctly identified the gender of the two males in the picture; however, their predicted ages were significantly lower than their actual ages (Figure 1).
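The validation idea described above can be sketched in a few lines of Python. This is a minimal illustration and not the prototype's actual code: the `Prediction` class, the `validate_response` function, and the 10-year tolerance are all hypothetical names and values chosen for the example, and the predicted age and gender are assumed to come from whatever recognition service is in use.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical output of an age/gender recognition service for one photo."""
    age: float
    gender: str


def validate_response(reported_age, reported_gender, prediction, age_tolerance=10.0):
    """Return flags where the survey answers disagree with the photo-based estimate.

    age_tolerance is an assumed threshold in years; a real value would have to
    come from field testing of the recognition algorithm.
    """
    flags = []
    if reported_gender.strip().lower() != prediction.gender.strip().lower():
        flags.append("gender_mismatch")
    if abs(reported_age - prediction.age) > age_tolerance:
        flags.append("age_mismatch")
    return flags


# Illustrative values only, not results from the actual prototype.
print(validate_response(45, "male", Prediction(age=31.0, gender="male")))
# prints ['age_mismatch']
```

A flagged record would not be treated as wrong automatically. In a sketch like this, a flag simply marks an interview for review, because the photo-based estimate can itself be off.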
More research and testing are needed to decide whether this technology would be practical for data validation. Does the precision of age estimates depend on the algorithm or do we need better cameras? How well would the technique work in the field under various lighting conditions? Are there any variations in the precision due to the race or age of respondents? Would respondents be willing to pose for a picture? Will we run into any issues with local personal privacy laws?
To answer these questions, we need to dig deeper into the ML algorithms; conduct extensive and expensive field pilots; and investigate and test a wide range of hardware specifications and platforms. It might also be wise to consult external experts to fine-tune the model. While we managed to come up with the prototype within 2-3 weeks of initial investigation, the next steps will take much longer and will require significant investments of money, time, and intellectual and emotional energy. But even that will not be enough to answer two main questions: Does it work? Who cares?
What kind of investment is needed before this application becomes useful?
Many new applications of technologies, especially those related to Machine Learning, Artificial Intelligence, and Big Data are now in the Research & Development (R&D) stage. Some of the projects will mature and reach the production stage where an even smaller subset would receive investments in marketing and dissemination.
By useful, we mean that a product adds value either through integration into the existing production processes or through direct consumption. For example, voice recognition, while rapidly approaching the “usefulness phase”, has not yet reached a level comparable with human recognition. Facial recognition made great progress towards being practical and is being applied in some industries, but it is still auxiliary to the traditional identification and authorization technologies. ML predictions of crop yields based on satellite imagery could potentially improve agricultural productivity and forecast commodity prices, but we have yet to see it in real, practical applications. Predictions of poverty rates or subnational GDPs based on high-resolution satellite images are equally exciting and uncertain. GPS navigation, on the other hand, has reached the maturity of a fully practical technology.
The lifecycle of a project typically starts with a low investment in R&D. During this initial R&D stage, the team working on the project might be limited to just a few people. In this stage, the work involves developing and refining a new idea through consultation and research. This stage is characterised by a lot of unknowns, uncertainties, and inherently unknowable risks.
The level of investment might increase when an idea moves to the Proof-of-Concept (PoC) stage, where the idea or concept becomes better defined, some uncertainties are resolved, and some risks are removed or mitigated. The product’s value proposition becomes clearer. More people with different profiles need to get involved to push the project forward. The PoC stage brings its own uncertainties and uncovers technological limitations and risks that can be understood only through rapid prototyping. At this stage the product is still far from being useful or practical, but at least it “proves the concept” by demonstrating that the idea is feasible.
What’s needed to get from proof of concept to production?
Very few project teams will manage to move past this stage and continue along the path of linearly increasing investment. This is because most projects will require vastly different resources to evolve from the PoC to production. Field testing, user focus groups, development of new commercial-grade applications, and integration of the new processes into existing technologies need skills that might be scarce or missing completely in the team that conceived the original idea and brought it to the PoC stage.
During a recent presentation about our experiments with ML, one of the authors of this blog stunned the audience by saying that he saw about a 10 percent chance of the age-from-photo technology crossing the “usefulness” threshold. The participants were expecting a much more optimistic prognosis. If the probability of success is that low, why would you even invest in such a project? In fact, this estimate is at least twice as high as the average rate of project success in the industry. A project, even if it fails, can generate knowledge and experience that can then be invested in other projects. However, since these benefits rarely outweigh the total cost of a project, there is a lot of financial risk in even starting these projects.
An organization involved in the application of disruptive technologies has to be ready to accept such risks. It should have an established structure and well-defined processes to first identify potential winners and then take them from the PoC stage into production. We strongly feel that the Silicon Valley adage, “Fail fast, fail often”, is very much applicable to the current landscape of disruptive technologies. The point of this approach is not to overinvest in failing ideas, but to invest in enough initial ideas to produce a few winners. Projects should start in small, agile teams that minimize transaction costs even in the most bureaucratic institutions. But once the validity and feasibility of the idea are confirmed, the institution should be fully committed to investing the required resources to see the project through to production.
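The arithmetic behind investing in "enough initial ideas" can be made concrete with a short Python sketch. It rests on a strong simplifying assumption that is ours and not a claim from the presentation: every idea succeeds independently with the same probability. The function names are hypothetical, and the 10 percent figure is the estimate quoted above for the age-from-photo project.

```python
import math


def prob_at_least_one_success(p, n):
    """Chance that at least one of n ideas succeeds, assuming independence."""
    return 1.0 - (1.0 - p) ** n


def min_projects_for_confidence(p, target):
    """Smallest number of independent ideas needed to reach the target chance."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Assumed success rate per idea: 10 percent.
for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_success(0.10, n), 3))

# Number of ideas needed for a 90 percent chance of at least one winner.
print(min_projects_for_confidence(0.10, 0.90))
```

Under these assumptions, ten ideas at 10 percent each give roughly a 65 percent chance of at least one winner, and about 22 ideas are needed to reach 90 percent. Real projects are unlikely to be independent or equally promising, so the numbers only illustrate why a portfolio of inexpensive experiments is needed to produce a few winners.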
By Michael M. Lokshin
Manager, DECSU, World Bank Group
Michael Lokshin is a Manager and Lead Economist in the Development Data Group (Survey unit). He received his Ph.D. in Economics from the University of North Carolina at Chapel Hill in 1999, after which he joined the research group at the World Bank. His research focuses on the areas of poverty and inequality measurement, labor economics, and applied econometrics. More recently, he has been involved in the Bank’s efforts in developing software tools and methods for applied economic analysis. Michael leads the group of researchers, survey experts and software engineers developing the Cloud for Development platform, the Survey Solutions CAPI system and the ADePT project (Software Platform for Automated Economic Analysis). He is also one of the people behind the creation of the Economic Research Computer Center in the World Bank.