Monthly Archives: December 2014

Proof-of-concept evaluations: Building evidence for effective scale-ups

I delivered a talk at 3ie’s Delhi Seminar Series on a recently published PLoS ONE paper and follow-up research. This project was a randomised experiment evaluating the potential for text messages to remind malaria patients to complete their course of antimalarial medication. Specifically, we looked at completion of the only class of drugs fully effective in curing malaria in Sub-Saharan Africa: Artemisinin-based Combination Therapies (ACTs). An individual’s failure to complete treatment can have both private and public harms – parasite resistance to these drugs is already emerging in Southeast Asia and there is no clear alternative treatment in the pipeline.

Several interesting questions arose during the seminar, including from discussant Simon Brooker. Some of these questions also came up in follow-up visits to vendors in Ghana. The overarching question behind all of them was: why did we design the intervention to be so hands-off?


  • Why didn’t we allow the vendors to play a stronger role in educating and enrolling patients into the text messaging system?
  • Why didn’t we provide financial support to those for whom phone credit was a barrier to enrolling in the system?
  • Why didn’t we use more interactive forms of texting or even voice-calling (including Interactive Voice Response, such as used here)?
  • Why didn’t we link our messages to a larger system of messaging the drug vendors themselves to remind them about protocol (as was done here)?

Why this way?

I believe we took this approach for three main reasons.

First, our funder CHAI (for which this was an operational research project under the Affordable Medicines Facility – malaria (AMFm)) wanted a proof-of-concept about the minimal supportive moving parts required to get patients enrolled into a text messaging system of reminders to complete their medication. In the context of the AMFm, as well as Ghana’s National Health Insurance Scheme, the availability and affordability of ACTs have been expanding rapidly. But support to encourage appropriate use of those ACTs has lagged behind.

So, we wanted to learn what could be scaled up cheaply and easily.

This study is the first randomised evaluation of a direct-to-patient (rather than to health workers) text messaging programme for malaria in Sub-Saharan Africa. We purposefully chose northern Ghana as the site of the study (specifically, in and around Tamale in Northern Region, which falls below the Ghanaian average on most welfare and development indicators). We worried that finding an effect from a text messaging programme in the capital, Accra, wouldn’t go very far in convincing people that a similar programme could work across Ghana. So, we deliberately chose a setting where an impact would be harder to find.

Second, we wanted to isolate the effect of the text message itself.  By having the vendors play a stronger role in educating their patients about the need to complete their antimalarial medication, we would find ourselves unable to identify the effect of the text messages alone (without proliferating to an octopus of treatment arms, which budget constraints would not allow).

In this context, we were looking for answers to questions such as: Would the vendors hand out the flyers with minimal encouragement? Would it work if the vendors didn’t tell patients that the point of the messages was to remind them to finish their meds (vendors themselves were kept in the dark about this point until the end of the study)? Would it work if surveyors did not assist patients in enrolling into the system (by either giving a missed call or sending a text)?

Third, the intervention was a somewhat narrow conception of mHealth-as-text-message, rather than text messages as social interactions embedded within larger social systems of communication and health care. This mHealth intervention, though run through a computer speaking Python and sending messages directly to mobile phones, was still very much embedded in social relationships, such as those between drug vendors and their patients (a point I bring out here).

Which way next?

From this study, we see that text messages can indeed have an effect on treatment completion. Precisely how to interpret the effect size is open to debate, but as a proof-of-concept, we now have evidence that even in a purposively tough context, text messages may be part of the arsenal that moves patients towards full completion of malaria medication. This has practical significance as well as statistical significance: it can work. Moreover, there is suggestive evidence that the programme could be scaled up, given the hands-off approach we took and the enthusiasm of the vendors with whom we followed up.

There is still, however, a long way to go, as this intervention only gets us to around a 70 per cent completion rate for antimalarial medication. A likely way forward is to think of text messages as one part of a larger, socially embedded intervention with multiple prongs to reach health providers, caregivers and patients through a variety of media and interaction mechanisms. This proof-of-concept evaluation should allow us to build on what works, making this more than a one-off study. It pushes us closer to the ultimate goal of 100 per cent completion of antimalarial medication.

Myths about microcredit and meta-analysis


It is widely claimed that microcredit lifts people out of poverty and empowers women. But evidence to support such claims is often anecdotal.

A typical microfinance organisation website paints a picture of very positive impact through stories: “Small loans enable them (women) to transform their lives, their children’s futures and their communities… The impact continues year after year.” Even where claims are based on rigorous evidence, as in a recent article on microfinance in the Guardian by the chief executive officer of CGAP, the evidence presented is usually from a small number of hand-picked single impact evaluations, rather than the full range of available evidence. On the other hand, leading academics such as Naila Kabeer have long questioned the empowerment benefits of microcredit.

So, how do we know if microcredit works? The currency in which policymakers and journalists trade to answer such questions should be systematic reviews and meta-analyses, not single studies. Meta-analysis, which is the appraisal and synthesis of statistical information on programme impacts from all relevant studies, can offer credible answers.

When meta-analysis was first proposed in the 1970s, psychologist Hans Eysenck called it ‘an exercise in mega-silliness’. It still seems to be a dirty word in some policy and research circles, including lately in international development. Some of the concerns about meta-analysis, such as those around pooling evidence from wildly different contexts, may be justified. But others are due to misconceptions about why meta-analysis should be undertaken and the essential components of a good meta-analysis.

3ie and the Campbell Collaboration have recently published a systematic review and meta-analysis by Jos Vaessen and colleagues on the impact of microcredit programmes on women’s empowerment. Vaessen’s meta-analysis paints a very different picture of the impact of microcredit. The research team systematically collected, appraised and synthesised evidence from all the available impact studies. A naïve assessment of that evidence would have indicated that the majority of studies (15 of the 25) found a positive and statistically significant relationship between microcredit and women’s empowerment. The remaining 10 studies found no significant relationship.

So, the weight of evidence based on this vote-count would have supported the positive claims about microcredit. In contrast, Vaessen’s meta-analysis concluded “there is no evidence for an effect of microcredit on women’s control over household spending… (and) it is therefore very unlikely that, overall, microcredit has a meaningful and substantial impact on empowerment processes in a broader sense.” So, what then explains these different conclusions, and in particular, the unequivocal findings from the meta-analysis?

The Vaessen study is a good example of why meta-analysis is highly policy relevant. The meta-analysis process has four distinct phases: conversion of the impacts reported in studies into policy-relevant quantities (effect sizes), quality assessment of studies, assessment of reporting biases, and synthesis, including the possible statistical pooling across studies to estimate an average impact. These methods overcome four serious problems that decision makers face in interpreting evidence from single impact evaluations.

First, the size of the impacts found in single studies may not be policy significant. That is, impacts are not sufficiently large in magnitude to justify the costs of delivery or participation. But this information is often not communicated transparently. Thus, many single impact evaluations – and unfortunately a large number of systematic reviews in international development – focus their reporting on whether their impact findings are positive or negative, and not on how big the impact is. This is why an essential component of meta-analysis is to calculate study effect sizes, which measure the magnitude of the impacts in common units. Vaessen’s review concludes that the magnitude of the impacts found in all studies is too small to be of policy significance.
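To make the idea of a common unit concrete, here is a minimal sketch of the standardised mean difference (Cohen’s d), the most common effect size for continuous outcomes. The numbers are hypothetical and are not taken from the review:

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardised mean difference (Cohen's d): the treatment-control
    difference expressed in units of the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Hypothetical study: empowerment index scores (invented, not from the review)
d = cohens_d(mean_t=0.62, mean_c=0.58, sd_t=0.25, sd_c=0.24, n_t=300, n_c=300)
print(f"effect size d = {d:.2f}")
```

A d of around 0.2 is conventionally considered ‘small’; expressing impacts this way is what lets a review compare programmes whose outcomes were measured on entirely different scales.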

The second problem with single studies is that they are frequently biased. Biased studies usually overestimate impacts. Many microcredit evaluations illustrate this by naïvely comparing outcomes among beneficiaries and non-beneficiaries without accounting for innate personal characteristics such as entrepreneurial spirit and attitude to risk. These characteristics are very likely to be the reason why certain women get the loans and make successful investments. All good meta-analyses critically appraise evidence through systematic risk-of-bias assessment. The Vaessen review finds that 16 of the 25 included studies show ‘serious weaknesses’, and that these same studies also systematically over-estimate impacts. In contrast, the most trustworthy studies (the randomised controlled trials and credible quasi-experiments) do not find any evidence to suggest microcredit improved women’s position in the household in communities in Asia and Africa (see figure).

The third problem is that the sample size in many impact evaluations is too small to detect statistically significant changes in outcomes – that is, they are under-powered. As noted in recent 3ie blogs by Shagun Sabarwal and Howard White, the problem is so serious that perhaps half of all impact studies wrongly conclude that there is no significant impact, when in fact there is. Meta-analysis provides a powerful solution to this problem by taking advantage of the larger sample size from multiple evaluations and pooling that evidence. A good meta-analysis estimates the average impact across programmes and also illustrates how impacts in individual programmes vary, using what are called forest plots.

The forest plot for Vaessen’s study, presented in the figure, shows an average impact of zero (0.01), as indicated by the diamond, and also shows very little difference in impacts for the individual programmes, as indicated by the horizontal lines which measure the individual study confidence intervals.
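The pooling step behind such a forest plot is straightforward to sketch. Below is an illustrative fixed-effect (inverse-variance) pooled estimate with a 95 per cent confidence interval; the study effects are made-up numbers, not those of the Vaessen review:

```python
import math

def pool_fixed_effect(effects, ses):
    """Inverse-variance weighted (fixed-effect) pooled estimate and 95% CI.

    effects: per-study effect sizes (e.g. standardised mean differences)
    ses:     their standard errors
    """
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

# Invented effect sizes from five studies, scattered around zero
effects = [0.05, -0.02, 0.10, 0.00, -0.04]
ses = [0.08, 0.06, 0.12, 0.05, 0.07]
pooled, (lo, hi) = pool_fixed_effect(effects, ses)
print(f"pooled = {pooled:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Each study is weighted by the inverse of its variance, so larger, more precise studies dominate the average; a random-effects model would additionally allow the true impact to vary across programmes.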

There are of course legitimate concerns about how relevant and appropriate it is to pool evidence from different programmes across different contexts. Researchers have long expressed concerns about the misuse of meta-analysis to estimate a significant impact by pooling findings from incomparable contexts or biased results (“junk in, junk out”).

But where evaluations are not sufficiently comparable to pool statistically, for example because studies use different outcome measures, a good meta-analysis should use some other method to account for problems of statistical power in the individual evaluation studies. Edoardo Masset’s systematic review of the nutrition impacts of agriculture programmes assesses statistical power in individual studies, concluding that most studies simply lack the power to provide policy guidance.
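A rough sense of what ‘under-powered’ means can be had from the normal-approximation power formula for a two-arm trial. The sketch below, with illustrative numbers, shows how a small study can have well under 50 per cent power to detect a modest effect:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_sample_power(effect_size, n_per_arm):
    """Approximate power of a two-sided, 5%-level test comparing two means,
    for a standardised effect size (Cohen's d) and equal arms."""
    z_crit = 1.96  # two-sided 5% critical value
    ncp = abs(effect_size) * math.sqrt(n_per_arm / 2.0)
    return 1.0 - normal_cdf(z_crit - ncp)

# A modest effect (d = 0.2) with 100 people per arm vs 400 per arm
small_study = two_sample_power(0.2, 100)   # roughly 0.29
large_study = two_sample_power(0.2, 400)   # roughly 0.81
print(f"power: n=100 per arm {small_study:.2f}, n=400 per arm {large_study:.2f}")
```

The small study will miss a real effect more often than it finds one, which is exactly the problem pooling across studies addresses: several such small studies together contribute roughly the combined sample of one adequately powered trial.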

In the case of Vaessen’s meta-analysis, which estimates the impacts of micro-credit programmes on a specific indicator of empowerment – women’s control over household spending – the interventions and outcomes were considered sufficiently similar to pool. Subsequent analysis concluded that any differences across programmes were unlikely to be due to contextual factors and much more likely a consequence of reporting biases.

This brings us to the fourth and final problem with single studies: they are very unlikely to represent the full range of impacts that a programme might have. Publication bias, well known across research fields, occurs when journal editors are more likely to accept findings that confirm or refute a hypothesis, and less likely to publish studies with null or statistically insignificant findings. The Journal of Development Effectiveness explicitly encourages publication of null findings in an attempt to reduce this problem, but most journals still don’t. In what is possibly one of the most interesting advances in research science in recent years, meta-analysis can be used to test for publication bias. Vaessen’s analysis suggests that publication biases may well be present. But the problems of bias and ‘salami slicing’ in the individual evaluation studies, where multiple publications appeared on the same data and programmes, are also important.

Like all good meta-analyses, the Vaessen review incorporates quality appraisal, the calculation of impact magnitudes and assessment of reporting biases. The programmes and outcomes reported in the single impact evaluations were judged sufficiently similar to pool statistically. By doing this, the review reveals ‘reconcilable differences’ across single studies.

Microcredit and other small-scale financial services may have beneficial impacts for other outcomes, although other systematic reviews of impact evidence (here and here) suggest this is often not the case. But it doesn’t appear to stand up as a means of empowering women.


Demand creation for voluntary medical male circumcision: how can we influence emotional choices?

This year, in anticipation of World AIDS Day, UNAIDS is focusing more attention on reducing new infections as opposed to treatment expansion. As explained by the Center for Global Development’s Mead Over in his blog post, reducing new infections is crucial for easing the strain on government budgets for treatment as well as for eventually reaching “the AIDS transition”, when the total number of people living with HIV begins to decline.

Male circumcision is one of the few biomedical HIV prevention strategies with evidence of a large impact on reducing HIV acquisition among men, based on three trials conducted in South Africa, Kenya and Uganda. In 2007, the World Health Organization and UNAIDS recommended scaling up voluntary medical male circumcision (VMMC), particularly in priority countries in Eastern and Southern Africa. Although some progress has been made over the last few years, with close to 6 million circumcisions completed by the end of 2013 in the priority countries, we are still far from the goal of 20.2 million male circumcisions by 2015, the level necessary to avert 3.36 million new HIV infections. How can we design interventions to achieve the level of male circumcision necessary to help reach the AIDS transition?

Interventions to promote VMMC have been quite successful at increasing the supply of circumcisions. The slow progress is blamed on demand. Governments and others employ two main approaches, informed by acceptability studies, for increasing the demand for VMMC: behaviour change communication (BCC) and opportunity or transaction cost reduction. BCC uses a variety of channels to provide information on the benefits (primarily health benefits) of VMMC. The cost reduction approaches compensate men for financial costs incurred, such as travel expenses, and/or opportunity costs incurred, such as lost working days, when they are circumcised.

The results from these approaches have been disappointing. A recent randomised controlled trial evaluating the impact of comprehensive information about male circumcision and HIV risk in Lilongwe, Malawi shows no significant effect on adult or child demand for circumcisions after one year. In another 3ie-supported study in Malawi, information increased the likelihood of getting circumcised by only 1.4 percentage points. On the cost side, a randomised controlled trial conducted in Kenya of an intervention to reduce costs associated with VMMC finds that small, fixed economic incentives to compensate for lost wages, ranging from KES 700-1200 (USD 8.75-15), increased VMMC uptake within two months among men aged 25-49 years by 7.1 percentage points. Although this finding is statistically significant, the effect size is small, suggesting that the targeted level of male circumcision coverage might not be achieved by addressing cost-related barriers alone.

Both these approaches are based on rational choice theory, that is, they assume that men make the decision to get circumcised as a rational choice that maximises benefits to them and minimises costs to them. Given the low numbers overall and the impact evaluation findings of small effects, we have to wonder whether the decision to be circumcised is really a standard rational choice type decision for many men. Maybe—just maybe—some men simply don’t want to be circumcised. Perhaps the decision about male circumcision is an emotional choice decision more than a rational choice decision.

Where does that leave us? Is there anything we can do to influence the emotional choice decision?

3ie’s thematic window for increasing the demand for VMMC is designed to promote and test innovative approaches for increasing the demand for male circumcision. In our scoping paper for this window, we suggest that one approach to innovation may be to engage peers and female intimate partners as catalysers to generate demand. These influencers may appeal to both the emotional choice and rational choice aspects of the circumcision decision. For example, men may find information provided by their peers to be more credible, but they may also feel more comfortable with circumcision if they know someone else who has chosen to do it. Intimate partners may be in a better position to frame the information being given to uncircumcised men, but they may also be able to convince men on an emotional level in the way that mass media certainly cannot. One concern that has been raised about promoting circumcision among men in consistent relationships is that circumcision could indicate that the man intends to have sex with others. Only within the relationship can a barrier to demand like this one be addressed.

Two studies funded under 3ie’s thematic window are testing interventions based on peers and intimate partners. One, in Zambia, uses peer referral incentives to increase demand for voluntary medical male circumcision. The second, in Uganda, involves female intimate partners (pregnant women in their third trimester) delivering a customised behaviour change communication message to their partners in order to increase the uptake of VMMC.

There is one advantage to this desired behaviour change compared to many of the others we often seek to influence: we only need the behavioural response to occur once. A second approach to influencing the emotional choice takes advantage of this. We know from behavioural economics and many other fields that people often have present-biased preferences, that is, they are not good at doing things they don’t want to do in the present in order to gain benefits over the long term. So perhaps an intervention designed to give a reward in the present can be used to induce the behaviour in the present.

Two studies funded under the thematic window are testing interventions based on this idea, in Kenya and Tanzania. These interventions employ a lottery (or raffle) giving men who get circumcised a chance to win a prize: in both cases, some kind of phone. The material gain is received by only a subset of men, so the intent is not to compensate them for costs. Rather, the theory is that the prospect of winning the prize will induce the desired behaviour the single time it is needed. We hypothesise that these interventions will also benefit from the tendency of people to overestimate probabilities near zero.
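The overweighting of small probabilities can be made concrete with the probability weighting function from Tversky and Kahneman’s cumulative prospect theory. The sketch below uses their functional form with the commonly cited curvature parameter for gains (γ ≈ 0.61); the numbers are purely illustrative:

```python
def decision_weight(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting function:
    w(p) = p^g / (p^g + (1 - p)^g)^(1/g).
    For small p, w(p) > p: unlikely prizes loom larger than they should."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

# A 1-in-100 chance of winning a phone "feels" several times more likely
w = decision_weight(0.01)   # roughly 0.055, about 5x the true probability
print(f"w(0.01) = {w:.3f}")
```

On this account, a raffle converts a small fixed prize budget into a perceived benefit larger than its expected value, which is precisely the lever these lottery interventions pull.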

The final results from these and three other studies on demand creation for VMMC will be available in the first half of 2015. We hope to learn from them both what works for increasing the demand for VMMC and some insights into influencing emotional choices.