
Building peace with impact evaluations

Since the 1990s, many multilateral and bilateral donors have expanded their peace-making and peace-keeping assistance to conflict-affected countries to include peace-building activities. The objective of these interventions is to prevent conflict from recurring and to return countries to a stable situation in which the economy can operate.

Some examples are programs to reintegrate ex-combatants into communities, mobilize individuals to work together as communities to reduce violence and increase security, address war crimes through justice mechanisms, and support media that inform citizens and encourage dialogue. Over the years, these interventions have shown mixed results, and learning has been limited in large part by the lack of rigorous impact evaluations. Policymakers and program managers often argue that impact evaluation is not possible in unstable or conflict-affected environments and that the need for rapid programming precludes a carefully planned study.

Cyrus Samii, Monika Kulma, and I are conducting a review study to demonstrate to policymakers and program managers that impact evaluation of peace-building, or stabilization, interventions is indeed possible. Our paper begins by reviewing the portfolio of stabilization interventions across different categories of programming. To frame our review, we focus on interventions funded by the U.S. government. While we have identified over 150 current or recent US-funded stabilization interventions, we have found only a few dozen evaluations of such programs, and only one rigorous impact evaluation.

But the US experience is not entirely representative. When we looked more broadly, we found more than 20 completed or ongoing impact evaluations of stabilization programs, with Cyrus serving on the research teams for a handful of those. These evaluations provide some useful insights and lessons for how to conduct impact evaluations of stabilization programs.

One initial finding is that quasi-experimental designs are quite useful in these settings, as often there is not the time or political will to randomize implementation. We also found some good examples of RCTs, so they are possible as well. Another lesson from the experience to date is that a key element of these evaluations is outcome measurement. Stabilization outcomes are not as easily defined, much less measured, as school attendance or diarrhea incidence. In fact, impact evaluation of stabilization interventions is very much a multi-disciplinary field, where both the theories of change and the outcome measures used in the studies come from fields including psychology, sociology, political science, and economics, among others.
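To make the quasi-experimental point concrete, here is a minimal sketch in Python of a difference-in-differences calculation, one of the most common quasi-experimental designs. The numbers, the perceived-security index, and the community groupings are all hypothetical, invented purely for illustration; they are not drawn from any of the studies in our review.

```python
import numpy as np

# Hypothetical survey data: a perceived-security index (0-100) for
# communities that received a stabilization program ("treated") and
# comparable communities that did not ("comparison"), measured at
# baseline and endline. All values are made up for illustration.
treated_before = np.array([42.0, 38.5, 45.0, 40.2, 39.8])
treated_after = np.array([55.1, 50.3, 58.2, 52.0, 51.5])
comparison_before = np.array([41.0, 39.0, 44.5, 40.8, 38.9])
comparison_after = np.array([46.2, 43.1, 49.0, 44.7, 42.6])

# Difference-in-differences: the change in the treated group minus the
# change in the comparison group. Under the (strong) assumption that
# both groups would have followed parallel trends without the program,
# this nets out common shocks and pre-existing level differences.
change_treated = treated_after.mean() - treated_before.mean()
change_comparison = comparison_after.mean() - comparison_before.mean()
did_estimate = change_treated - change_comparison

print(f"Change in treated communities:    {change_treated:+.1f}")
print(f"Change in comparison communities: {change_comparison:+.1f}")
print(f"Difference-in-differences impact: {did_estimate:+.1f}")
```

The appeal in conflict settings is visible in the arithmetic: no randomization is required, only comparable untreated communities and data from before and after the intervention.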

We hope that this paper, when complete, will help policymakers and program managers for stabilization interventions be more open to, and creative about, impact evaluation. As USAID Administrator Rajiv Shah said in his Stabilization Guidance, “Every activity is an opportunity to learn what works, what does not, and why.”

View Dr Brown’s presentation made at 3ie’s recent Delhi seminar.

Evidence-based development: lessons from evidence-based management

Evidence-based development is treading in the footsteps of evidence-based medicine: innovating, testing, and systematically pulling together the results of different studies to see what works, where, and why. Other disciplines as diverse as sports science and management have been going down the same route. Hard Facts, Dangerous Half-Truths, and Total Nonsense: Profiting from Evidence-Based Management by Jeffrey Pfeffer and Robert Sutton contains valuable insights for practitioners of evidence-based development.

A striking parallel is their emphasis on examining the underlying assumptions of any new idea, just as we do in theory-based impact evaluation. Their first example is from education, not business: paying teachers for pupil performance, a popular policy with a very weak evidence base. As they point out, a clear assumption here is that teachers are motivated by financial incentives. But someone who is mainly interested in money won’t pick teaching primary (elementary) school as their career of choice. This example also nicely illustrates the importance of context: in Accra, if not in Austin, and in Delhi, but not Detroit, financial motivation is in fact a strong factor in becoming a teacher. As high rates of teacher absenteeism show, the pleasure of bringing knowledge to young minds seems not to matter so much for a significant proportion of teachers.

A very common question is “how long does an impact evaluation take?” Or programme managers just complain that impact studies take too long. Well, they take as long as the intervention takes to have an impact. So many studies have to wait two or three years, or more, to carry out the endline survey. And post-endline surveys to check sustainability and longer-run benefits are also a good idea. But REAP’s study of iron fortification to reduce anemia in rural schools in China found positive impacts on learning outcomes in just two months. And, as described by Pfeffer and Sutton, Yahoo gets millions of hits an hour, so it directs a couple of hundred thousand of them to a version of the site with some design modification, such as a change in ad placement, and gets results on the impact of the change in an hour or less. So an impact evaluation can take just an hour… if your intervention works that quickly.
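The Yahoo example is, in effect, a randomized experiment run at web speed. Here is a minimal sketch of the arithmetic behind such an A/B test; the visitor and click counts are invented for illustration, not Yahoo’s actual figures.

```python
import math

# Hypothetical hour of traffic: most visitors see the current site (A),
# a randomly assigned subset sees the modified design (B), e.g. a new
# ad placement. All numbers are made up.
visitors_a, clicks_a = 1_000_000, 52_300
visitors_b, clicks_b = 200_000, 11_450

rate_a = clicks_a / visitors_a
rate_b = clicks_b / visitors_b
impact = rate_b - rate_a

# Two-proportion z-test: is the difference between the two click rates
# larger than what chance variation alone would produce?
pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
z = impact / se

print(f"Click rate A: {rate_a:.4f}, B: {rate_b:.4f}")
print(f"Estimated impact: {impact:+.4f} (z = {z:.1f})")
```

With sample sizes this large, even a small true difference shows up clearly within the hour, which is exactly why the evaluation can be so fast.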

Evidence-based management works, as examples from DaVita, which provides kidney dialysis, to Harrah’s casinos show. Like many other successful businesses, such as McDonald’s, 7-Eleven, Intel, and Amazon, Harrah’s ran field experiments (randomized controlled trials) of different business practices to see what worked, and then scaled up or dropped practices according to the evidence. A trend still resisted in many quarters of the development community has been wholeheartedly embraced by some of the world’s most successful managers.

One thing the evidence tells us is that most things in business don’t work. Nearly two-thirds of new firms fail in the first five years. A study of 700 firms found that 46 percent of the money spent on product development resulted in products that failed. Another thing that doesn’t work is mergers: around 70 percent or more of all mergers deliver no benefits and reduce the economic valuation of the firm. A review of 93 studies covering 200,000 mergers found that the negative impact on company value appears within a couple of months and persists. But the response to this evidence is not to say “let’s not merge”. In 30 percent of cases mergers do work. So we should ask: what do we learn from the successes and failures about how to do a successful merger? This was precisely the route taken by Cisco, which has built up profitability through nearly 60 mergers.

The same is true for development interventions. We shouldn’t say “behavior change communication doesn’t work”, which indeed is what a lot of evidence suggests, but rather ask, “when, where and how can we make it work?” And here is a lesson for donors rushing to fund systematic reviews: look at the evidence base the management researchers have, evidence from over 200,000 mergers. A typical development systematic review can draw on evidence from just a dozen interventions or even fewer. We need more primary studies, and lots of them, of the same intervention in different settings. This is of course why we have 3ie.
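To see why the size of the evidence base matters, here is a minimal sketch of the inverse-variance pooling that a standard fixed-effect meta-analysis performs in a systematic review. The effect sizes and standard errors below are invented purely for illustration; the point is that each additional comparable study tightens the pooled estimate.

```python
import math

# Hypothetical effect estimates (e.g. standardized mean differences)
# and standard errors from a handful of primary studies of the same
# intervention in different settings. All values are made up.
effects = [0.30, 0.12, -0.05, 0.22, 0.18]
std_errors = [0.10, 0.15, 0.12, 0.20, 0.08]

# Fixed-effect meta-analysis: weight each study by the inverse of its
# variance, so more precise studies count for more in the pooled result.
weights = [1 / se**2 for se in std_errors]
pooled_effect = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect: {pooled_effect:.3f} (SE {pooled_se:.3f})")
# Every extra study adds to sum(weights), shrinking the pooled standard
# error. With thousands of studies, as in the mergers literature, the
# estimate becomes very precise; with a dozen, it stays wide.
```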

The unhappy marriage of impact evaluation and the results agenda


Governments want results. Taxpayers want results. Beneficiaries want results.

The results agenda gained momentum in development circles during the 1990s, becoming firmly established with the widespread adoption of the Millennium Development Goals. This focus on results is welcome. Simply measuring success by the volume of spending, or even the number of teachers trained, kilometres of road built, and women’s groups formed, is not a satisfactory approach. Input monitoring does not ensure that development spending makes a difference to people’s lives. Spending that makes a difference: that is what we mean by a result. So we would expect this agenda to go hand in hand with impact evaluation. But that has not been the case.

The response of the development community to the results agenda has largely been outcome monitoring. So indicators like infant mortality, business profits, and female empowerment are tracked. USAID was the first to adopt this approach in the mid-90s. And the first to abandon it, when the Government Accountability Office (GAO) objected that such outcome monitoring did not tell us anything about whether observed changes in outcomes were the result of the interventions supported with US dollars. Yet the use of outcome monitoring remains widespread amongst those claiming to be interested in results. There remains a view that ‘attribution is difficult’. But attribution is precisely what impact evaluation is about.

This is not to suggest that impact evaluations be carried out for all development programmes. But they should be in place for pilot projects and other innovative interventions, for large-scale and flagship programmes, and for a sample of representative programmes of the sort the agency typically supports.

Only with the widespread adoption of impact evaluation across development agencies can we truly demonstrate results. And, at the same time, create the evidence base about what works and why to get even better results in the future.

What if the BRICS countries committed to evaluation?

Brazil, under the leadership of President Lula, has already made some headway: demanding evidence so as to stop spending taxpayers’ money on programs that don’t work, and committing to evaluation.

In the case of the flagship social safety net program Bolsa Familia, now reaching around 40 million poor Brazilians with a budget of over USD 6 billion, evaluation has been an integral part of the program since its inception. The establishment of a monitoring and evaluation system was one of the main pillars of the program. The evaluation effort succeeded in legitimizing the intervention, so that it was no longer seen as Lula’s program but was owned by Brazilians, and most of those originally opposed to its implementation began advocating instead for its continuation.

There have been encouraging signs of a growing focus on evaluation in other BRICS countries as well. India is establishing an Independent Evaluation Office to assess the impact of Indian government flagship programs. China is taking a more experimental learning approach by testing innovative policies in select districts before launching them at a national scale. South Africa has adopted a government-wide mandatory framework for monitoring and evaluation.

In every decade there is a breakthrough, something new that makes a difference to poor people’s lives. Used correctly, impact evaluation has proven that it can revolutionize the way we do development. Mexico was the first country to introduce mandatory impact evaluations for all its federal social programs, spurred on by its success in evaluating the conditional cash transfer program Progresa. The legitimacy the program gained through evaluation allowed it to survive a change in administration, although it was renamed Oportunidades. The program was scaled up and fine-tuned based on solid evidence. It now helps improve the lives of one in four Mexicans.

Between them, the BRICS are home to nearly half of the world’s poor. By championing evaluation, these countries could move to the forefront of evidence-based development. There is no unique solution for strengthening and institutionalizing a monitoring and evaluation system; it all depends on political will and the championing of evaluation. “Mind the Gap: from Evidence to Policy Impact” is an opportunity to work together to build on what we have learned so far. Let’s make evidence matter.

New UK watchdog to improve aid impact


Is aid money used effectively to improve lives? Public opinion surveys in developed countries show that people are very sceptical about the benefits of aid.

We need convincing evidence that examines the effectiveness of development spending. Such evidence can best come from independent bodies like the National Council for the Evaluation of Social Development Policy (Coneval) in Mexico, the Swedish Agency for Development Evaluation (SADEV) in Sweden, 3ie itself, and, most recently, the UK’s Independent Commission on Aid Impact (ICAI).

The role of this recently established commission is to provide greater independent scrutiny of UK aid spending to ensure value for money and sustainable impact. The commission has just released its workplan for delivering this mission.

The main activities will include: evaluations which assess the sustainable development impact of UK-supported activities; value-for-money reviews looking at both the long-term impact of the intervention and its cost-effectiveness; and investigations to check the appropriateness of the approach and procedures used in aid projects. The challenge ahead for the commission is to deliver high-quality studies which credibly identify the impact of UK aid.

Development interventions identified by the Commission for review, such as the climate change programme in Bangladesh, the health programme in Zimbabwe, the comparative study of health and education programmes in India, primary education in Nigeria, water supply and sanitation in Sudan and maternal health in Africa, should be assessed using rigorous quantitative impact evaluation methods embedded in a theory-based approach.

There is a question as to whether the commission will restrict itself to ex-post evaluations, carried out once the intervention is being implemented or has been completed, or whether it can engage in ex-ante designs, put in place before the intervention has started. Designing the evaluation prior to the launch of a programme, and collecting baseline data, generally delivers more robust findings.

Another question to be tackled is how best to design impact evaluations for interventions with a small sample, such as the review of DFID’s anti-corruption strategy, the management of conflict pools, and the study of World Bank evaluation and performance management.

The creation of the Independent Commission on Aid Impact presents a real opportunity to raise the bar on the quality of development evaluations. The UK government is to be applauded for opening its aid programme to independent scrutiny. It is now up to the Commission to deliver the evidence on whether UK aid is working.

Related blog: ODI Director Alison Evans’ ‘Top tips for the UK’s Independent Commission on Aid Impact’