This article addresses research that evaluates communication programs designed to bring about change in individual behavior and social norms. These programs or campaigns may focus on health, agriculture, environment, water and sanitation, democracy and governance, gender equity, human rights, and related areas. They can be referred to by different labels: strategic communication, behavior change communication, information-education-communication, communication for social change, and development communication, among others. Communication evaluation research serves both to guide the design of such programs and to determine their effectiveness in achieving their objectives. The myriad communication forms used in such programs generally fall into three categories: mass media, community mobilization, and interpersonal communication/counseling (IPC/C).
Types of Communication Evaluation
A comprehensive evaluation of a communication program includes three primary types of evaluation: formative, process, and summative. Large-scale, well-funded programs will use all three types of evaluation, whereas those with limited budgets or evaluation know-how often stop at formative or process evaluation.
Formative evaluation refers to activities undertaken to furnish information that will guide program design. This information helps program planners to determine who is most affected by the problem; identify the needs of specific sub-groups; ascertain existing knowledge, beliefs, and attitudes; determine levels of access to services, information, social support, and other resources; understand barriers to action; and determine audience media habits and preferences (Bertrand 2005). Some relevant information may be available from existing data sources such as epidemiologic or demographic reports, situation assessments, surveys, secondary analyses of existing data, media ratings data, service delivery statistics, and other program records. However, the researcher often needs to collect new information regarding the opinions, aspirations, fears, beliefs, and other key psychological factors that influence a given behavior. Primary data collection may include quantitative or qualitative research methods. Surveys are by far the most commonly used quantitative method. Frequently used qualitative research techniques include focus groups, in-depth interviews, direct observation, and a range of different participatory methods. This information serves to guide the design of a communication strategy for a given setting. It captures existing levels of knowledge, attitudes, beliefs, and behaviors relevant to the topic. It also focuses on characteristics of the intended audience, the potentially most effective channels, the sources of motivation and barriers to change among the intended audience, and other elements.
Communication pre-testing before final production is one type of formative research specific to communication programs. This technique involves testing a communication product (e.g., radio spot, poster, TV storyboard) among a convenience sample of members of the intended audience to measure comprehension, attractiveness of the message, identification with the message, cultural acceptability, and related factors. Programs often conduct multiple pre-tests in an effort to “get the message right” before investing in expensive print runs or broadcast time.
Process evaluation involves tracking program implementation once the program is launched. One common form of process evaluation is to compare actual implementation to the proposed scope of work, to answer the question: to what extent is the project implemented according to plan? (Rossi et al. 2004). This information serves two useful purposes. First, it can alert program managers to delays or shortfalls in implementation that need to be addressed midcourse in program implementation. Second, it provides valuable information in later assessing the effectiveness of the program to achieve its objectives. If the program falls short in delivering the projected elements (e.g., training, print materials, community mobilization activities, counseling sessions, media broadcasts) it will likely not achieve the desired objective. Rossi et al. (2004) distinguish between implementation failure (the program not implemented as designed) and theory failure (the hypothesized effects not being realized despite complete and appropriate program implementation). The results of process evaluation also help managers to understand the program dynamics such that one can replicate successful components and eliminate ineffective elements in future efforts (Bertrand 2005).
Process evaluation may take several forms, one of which is monitoring outputs. At a minimum, programs can and should track the activities conducted, such as number of materials used, number of radio spots, number and frequency of broadcasts for radio or TV spots, number of community educational activities conducted, and so forth. These measures reflect program activity and program managers' compliance with a work plan, but they should not be confused with achieving desired results or outcomes (measured in terms of actual change).
Measuring the reach of a communication program is another important type of process evaluation, especially for large-scale interventions. Such studies – conducted among a random sample or a convenience sample of the intended audience – indicate the percentage of the audience exposed to the campaign through different channels. They may also measure recall of specific elements such as the content of a radio spot or the logo of a campaign. The data indicate overall coverage by the program as well as the relative reach of different channels used. Data on reach via different channels also provide the basis for evaluating “dose–response” (i.e., the greater the exposure, the greater the effects) as part of summative evaluation, discussed below.
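The reach and dose–response tabulations described above amount to simple cross-tabulation of survey data. The sketch below is illustrative only: the channel names, respondent records, and behavior flags are invented, and a real reach study would of course work from actual survey files rather than an in-memory list.

```python
# Hypothetical survey records: which channels each respondent recalls being
# exposed to, and whether the respondent reports the promoted behavior.
respondents = [
    {"channels": {"radio", "tv"}, "behavior": True},
    {"channels": {"radio"}, "behavior": True},
    {"channels": {"poster"}, "behavior": False},
    {"channels": set(), "behavior": False},
    {"channels": {"radio", "tv", "poster"}, "behavior": True},
    {"channels": {"tv"}, "behavior": False},
]

n = len(respondents)

# Reach: share of the sample exposed through each channel.
reach = {ch: sum(ch in r["channels"] for r in respondents) / n
         for ch in ("radio", "tv", "poster")}

# Dose-response tabulation: behavior prevalence by number of channels recalled.
by_dose = {}
for r in respondents:
    by_dose.setdefault(len(r["channels"]), []).append(r["behavior"])
dose_response = {d: sum(v) / len(v) for d, v in sorted(by_dose.items())}

print(reach)
print(dose_response)
```

A rising prevalence across increasing exposure levels is the "dose–response" pattern later drawn on in summative evaluation; a tabulation alone does not, of course, establish causation.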
Special studies that examine the functioning of different elements of the program constitute another form of process evaluation. For example, a mystery client study to measure the quality of counseling in a given program captures one aspect of program implementation. Reviewing broadcast logs can indicate whether stations are delivering the number of broadcasts purchased or promised. Participants in community meetings or health fairs may give feedback on what they liked and disliked about the event.
In short, process evaluation tracks whether a communication program is implemented according to plan. Alternatively, it may capture how much the program has done and how well it has done it, measured by audience reaction and reach. However, process evaluation stops short of answering the question: did the desired behavior change occur among members of the intended audience?
Summative evaluation measures the extent to which change occurs, consistent with objectives of the program. It addresses the issue: did the program make a difference? Did it have an impact?
The answer to the question of impact relates directly to the objectives of the program and the types of change anticipated. In some cases, especially where funds for evaluation are limited, program planners may rely primarily on measuring service utilization or sales (also labeled “service outputs”), which serve as proxies to behavior change. For example, in the United States a smoking cessation campaign might track calls to a hotline or sales of nicotine patches. In an international public health context, programs might monitor the volume of contraception sold, the number of bed nets distributed, the number of prenatal visits made, or the number of doses of polio vaccine administered.
Measuring outputs has several advantages. First, the necessary data are often collected for programmatic purposes and become available to evaluators at little or no cost. Second, the results are easy to grasp, especially when presented in graphic form. And third, such measures reflect a behavioral response on the part of the intended audience. However, they fall short in answering the question of whether the program changed behavior, for several reasons. First, it is difficult to conclude with certainty that the program – and not other activities or factors – prompted the actions reflected in these measures of service utilization or sales. Second, the supplies or devices might be purchased but never used. And third, these measures of output yield “numerator data” but no indication of the denominator (50,000 prenatal visits might be impressive in a small country, whereas it would be insignificant in China).
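The "numerator data" problem above is plain arithmetic: the same output count changes meaning once divided by a denominator. The pregnancy figures below are invented solely to illustrate the point.

```python
# Illustrative figures only: 50,000 prenatal visits (the numerator) divided by
# two hypothetical denominators of very different sizes.
visits = 50_000

small_country_pregnancies = 120_000   # hypothetical annual pregnancies
china_pregnancies = 16_000_000        # hypothetical annual pregnancies

rate_small = visits / small_country_pregnancies   # roughly 0.42: substantial coverage
rate_china = visits / china_pregnancies           # roughly 0.003: negligible coverage

print(f"{rate_small:.3f}  {rate_china:.4f}")
```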
For this reason many evaluators prefer to assess behavior change from a given program by conducting a population-based survey among members of the intended audience to measure outcomes consistent with the objectives of the program. Three categories of outcomes are initial, intermediate, and long-term. Initial outcomes relate to cognitive and psychosocial factors that are hypothesized to precede actual behavior change: knowledge, risk perception, attitudes, beliefs, and self-efficacy, among others. Even if a program does not achieve its behavioral objectives, program staff may want to know if the audience demonstrates any progress toward the behavior, as outlined by Piotrow et al. (1997).
Intermediate outcomes refer to actual behaviors that will lead to the desired end state (e.g., stopping smoking to reduce morbidity or mortality, conserving water to improve the environment, voting in an election to promote democracy). Most summative evaluations that measure outcomes focus on this type of intermediate outcome, because it captures the behavior(s) promoted by the program and is measurable within the lifespan of the project.
By contrast, long-term outcomes measure the desired end state or ultimate objective of the program, such as increased life-span or decreased morbidity, increased agricultural yields, improved environmental conditions, more democratic societies, and higher quality of life. Yet communication research evaluation rarely measures these long-term outcomes because (1) they often happen long after the program itself ends, and (2) factors other than the communication program contribute to them.
With respect to measuring impact, the strength of the evidence varies depending on the study design and/or statistical techniques used. Not surprisingly, the methods needed to produce the strongest evidence are often the most costly and/or require knowledge of advanced analytic techniques, making it difficult for small projects with limited budgets to use the most rigorous evaluation methods (Bertrand 2005). The levels of evidence from strongest to weakest are described below.
Change occurred that can be attributed to the intervention/program
Most donor agencies and program directors would ideally like to be able to make this claim. Purists would argue that the only means to definitively establish cause and effect is to conduct a randomized trial (experimental design). Specifically, the evaluator would randomly assign subjects to two groups, apply the communication intervention to one (the experimental or treatment group) and withhold it from the other (the control group), then compare the behavioral outcomes for the two to determine whether the program had any effect. However, this type of randomized trial or controlled field test is only possible where the program can feasibly manipulate (control) exposure to the communication (e.g., in the case of home visits, mass mailings, clinic counseling sessions). By contrast, a large-scale program with a mass media component such as radio or TV potentially reaches all segments of the population; those not reached are generally “atypical” and thus unsuitable for purposes of comparison. This situation requires evaluators to seek alternative methods to determine the impact or effectiveness of communication programs, often requiring advanced statistical analyses.
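Where random assignment is feasible, the comparison behind such a trial is a standard two-group test of proportions. The sketch below uses a conventional two-proportion z statistic; all counts are hypothetical and chosen only to show the mechanics.

```python
import math

def two_proportion_z(success_t, n_t, success_c, n_c):
    """Two-proportion z statistic: treatment vs. control behavioral outcomes."""
    p_t, p_c = success_t / n_t, success_c / n_c
    p_pool = (success_t + success_c) / (n_t + n_c)          # pooled prevalence
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    return (p_t - p_c) / se

# Hypothetical trial: 180/400 adopters in the treatment arm (exposed to the
# communication) vs. 120/400 in the control arm (communication withheld).
z = two_proportion_z(180, 400, 120, 400)
print(round(z, 2))  # |z| > 1.96 suggests the difference is unlikely due to chance
```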
Change occurred, which is associated with exposure to the intervention
Because randomized trials are not feasible for the evaluation of communication programs using broadcast media, evaluators must seek alternative methods that yield plausible evidence of impact (Victora et al. 2004). For example, they may use quasi-experimental designs (Fisher & Foreit 2002). The most frequently used are time series, pre-test/post-test nonequivalent control group designs, and separate sample pre-test/post-test designs. An alternative approach increasingly used to measure communication effectiveness involves post-test-only data analyzed with econometric techniques such as simultaneous equations models, propensity score matching, or longitudinal/panel data methods. However, these require advanced statistical expertise and are used primarily in well-funded, large-scale evaluations. Although still relatively rare, some evaluators have analyzed program costs in relation to effectiveness in an effort to measure "bang for the buck" (Bertrand & Hutchinson 2006).
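One common analysis for a pre-test/post-test nonequivalent control group design is a difference-in-differences estimate: the change observed in the comparison area proxies for what would have happened without the program, and the remainder is the change plausibly associated with it. The prevalence figures below are hypothetical, for illustration only.

```python
# Hypothetical behavior prevalence before and after the campaign,
# in the program (treated) area and a nonequivalent comparison area.
pre_treated, post_treated = 0.20, 0.35
pre_control, post_control = 0.22, 0.27

# Difference-in-differences: program-area change minus comparison-area change.
did = (post_treated - pre_treated) - (post_control - pre_control)
print(round(did, 2))  # about 0.10, i.e., a 10-percentage-point difference
```

The estimate is only plausible, not definitive: it assumes the two areas would have followed parallel trends absent the program, which is exactly the kind of assumption that separates this level of evidence from a randomized trial.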
Change occurred in the desired outcomes following the intervention
In the real world, many program managers, donor agencies and policymakers are happy if change in the desired outcome(s) occurs at all, and they are less concerned about the technical aspects of study designs (for example, that factors other than the communication program might have triggered the change). This type of trend data constitutes “adequate evidence” of change, to use the language of Victora et al. (2004). Programs with limited budgets or access to evaluation specialists can track changes in data from program statistics or surveys to demonstrate the extent of change in behavior(s), consistent with program objectives.
- Bertrand, J. T. (2005). Evaluating health communication programmes. At https://www.comminit.com/global/content/evaluating-health-communication-programmes.
- Bertrand, J. T., & Hutchinson, P. (eds.) (2006). Cost-effectiveness analysis [Special Issue]. Journal of Health Communication, 11(suppl. 2).
- Bertrand, J. T., & Kincaid, D. L. (1996). Evaluating information, education and communication (IEC) programs for family planning and reproductive health: Final report of the IEC Working Group. University of North Carolina, The EVALUATION Project.
- Figueroa, M. E., Bertrand, J. T., & Kincaid, D. L. (2002). Evaluating the impact of communication programs: Summary of an expert meeting, October 4–5, 2001 (workshop summary series). University of North Carolina, MEASURE Evaluation.
- Fisher, A., & Foreit, J. (2002). Designing HIV/AIDS intervention studies: An operations research handbook. Washington, DC: Population Council.
- Piotrow, P. T., Kincaid, D. L., Rimon, J. G., II, & Rinehart, W. E. (1997). Health communication: Lessons from family planning and reproductive health. Westport, CT: Praeger.
- Rossi, P., Lipsey, M., & Freeman, H. (2004). Evaluation: A systematic approach, 7th edn. Thousand Oaks, CA: Sage.
- Valente, T. W. (1995). Network models of the diffusion of innovations. Cresskill, NJ: Hampton Press.
- Victora, C. G., Habicht, J. P., & Bryce, J. (2004). Evidence-based public health: Moving beyond randomized trials. American Journal of Public Health, 94(3), 400–405.