23. Program evaluation

Content warning: discussions of BMI/weight/obesity, genocide, and residential schools for indigenous children.

Imagine you are working for a nonprofit focused on children’s health and wellness in school. One of the grants you received this year funds a full-time position at a local elementary school for a teacher who will be integrating kinesthetic learning into their lesson plans for math classes for third graders. Kinesthetic learning is learning that occurs when the students do something physical to help learn and reinforce information, instead of listening to a lecture or other verbal teaching activity. You have read research suggesting that students retain information better using kinesthetic teaching methods and that it can reduce student behavior issues. You want to know if it might benefit your community.


When you applied for the grant, you had to come up with some outcome measures that would tell the foundation if your program was worth continuing to fund – if it’s having an effect on your target population (the kids at the school). You told the foundation you would look at three outcomes:

  1. How did using kinesthetic learning affect student behavior in classes?
  2. How did using kinesthetic learning affect student scores on end-of-year standardized tests?
  3. How did the students feel about kinesthetic teaching methods?

But, you say, this sounds like research! It does, but to understand the difference, we have to look at the purpose, origins, effect, and execution of the project, which we do in section 23.1 of this chapter. Those domains are where we can find the similarities and differences between program evaluation and research.

Realistically, as a practitioner, you’re far more likely to engage in program evaluation than you are in research. So, you might ask why you are learning research methods and not program evaluation methods, and the answer is that you will use research methods in evaluating programs. Program evaluation tends to focus less on generalizability, experimental design, and replicability, and instead focuses on the practical application of research methods to a specific context in practice.

23.1 What is program evaluation?

Learning Objectives

Learners will be able to…

Program evaluation can be defined as the systematic process by which we determine if social programs are meeting their goals, how well the program runs, whether the program had the desired effect, and whether the program has merit according to stakeholders (including in terms of the monetary costs and benefits). It’s important to know what we mean when we say “evaluation.” Pruett (2000) [1] provides a useful definition: “Evaluation is the systematic application of scientific methods to assess the design, implementation, improvement or outcomes of a program” (para. 1). That nod to scientific methods is what ties program evaluation back to research, as we discussed above. Program evaluation is action-oriented, which makes it fit well into social work research (as we discussed in Chapter 1).

A quick side note: elsewhere in the text, we’ve talked about stakeholders and gatekeepers as distinct groups of people. While this distinction is accurate and important, in this chapter, I’m only going to use the word “stakeholders” because that’s what you’ll hear in the world of program evaluation and logic models.

Often, program evaluation will consist of mixed methods because its focus is so heavily on the effect of the program in your specific context. Not that research doesn’t care about the effects of programs – of course it does! But with program evaluation, we seek to ensure the way that we are applying our program works in our agency, with our communities and clients. Thinking back to the example at the beginning of the chapter, consider the following: Does kinesthetic learning make sense for your school? What if your classroom spaces are too small? Are the activities appropriate for children with differing physical abilities who attend your school? What if school administrators are on board, but some parents are skeptical?


The project we talked about in the introduction – a real project, by the way – was funded by a grant from a foundation. The reality of the grant funding environment is that funders want to see that their money is not only being used wisely, but is having a material effect on the target population. This is a good thing, because we want to know our programs have a positive effect on clients and communities. We don’t want to just keep running a program because it’s what we’ve always done. (Consider the ethical implications of continuing to run an ineffective program.) It also forces us as practitioners to plan grant-funded programs with an eye toward evaluation. It’s much easier to evaluate your program when you can gather data at the beginning of the program than when you have to work backwards at the middle or end of the program.

How do program evaluation and research relate to each other?

As we talked about above, program evaluation and research are similar, particularly in that they both rely on scientific methods. Both use quantitative and qualitative methods, like data analysis and interviews. Effective program evaluation necessarily involves the research methods we’ve talked about in this book. Without understanding research methods, your program evaluation won’t be very rigorous and probably won’t give you much useful information.

However, there are some key differences between the two that render them distinct activities that are appropriate in different circumstances. Research is often exploratory and not evaluative at all, and instead looks for relationships between variables to build knowledge on a subject. It’s important to note at the outset that what we’re discussing below is not universally true of all projects. Instead, the framework we’re providing is a broad way to think about the differences between program evaluation and research. Scholars and practitioners disagree on whether program evaluation is a subset of research or something else entirely (and everything in between). The important thing to know about that debate is that it’s not settled, and what we’re presenting below is just one way to think about the relationship between the two.

According to Mathison (2008) [2] , the differences between program evaluation and research have to do with the domains of purpose, origins, effect and execution.

| Domain | Program Evaluation | Research |
| --- | --- | --- |
| Purpose | Judges merit or worth of the program | Produces generalizable knowledge and evidence |
| Origins | Stems from policy and program priorities of stakeholders | Stems from scientific inquiry based on intellectual curiosity |
| Effect | Provides information for decision-making on specific program | Advances broad knowledge and theory |
| Execution | Conducted within a setting of changing actors, priorities, resources and timelines | Usually happens in a controlled setting |

Let’s think back to our example from the start of the chapter – kinesthetic teaching methods for 3rd grade math – to talk more about these four domains.

Purpose

To understand this domain, we have to ask a few questions: why do we want to research or evaluate this program? What do we hope to gain? This is the why of our project (Mathison). Another way to think about it is as the aim of your research, which is a concept you hopefully remember from Chapter 2.

Through the lens of program evaluation, we’re evaluating this program because we want to know its effects, but also because our funder probably only wants to give money to programs that do what they’re supposed to do. We want to gather information to determine if it’s worth it for our funder – or for us – to invest resources in the program.

If this were a research project instead, our purpose would be related, but different. We would be seeking to add to the body of knowledge and evidence about kinesthetic learning, most likely hoping to provide information that can be generalized beyond 3rd grade math students. We’re trying to inform further development of the body of knowledge around kinesthetic learning and children. We’d also like to know if and how we can apply this program in contexts other than one specific school’s 3rd grade math classes. These are not the only research considerations, but just a few examples.

Origins

Purpose and origins can feel very similar and be a little hard to distinguish. The main difference is that origins are about the who, whereas purpose is about the why (Mathison). So, to understand this domain, we have to ask about the source of our project – who wanted to get the project started? What do they hope this project will contribute?

For a program evaluation, the project usually arises from the priorities of funders, agencies, practitioners and (hopefully) consumers of our services. They are the ones who define the purpose we discussed above and the questions we will ask.

In research, the project arises from a researcher’s intellectual curiosity and desire to add to a body of knowledge around something they think is important and interesting. Researchers define the purpose and the questions asked in the project.

Effect

The effect of program evaluation and research is essentially what we’re going to use our results for. For program evaluation, we will use them to make a decision about whether a program is worth continuing, what changes we might make to the program in the future or how we might change the resources we devote going forward. The results are often also used by our funders to make decisions about whether they want to keep funding our program or not. (Outcome evaluations aren’t the only thing that funders will look at – they also sometimes want to know whether our processes in the program were faithful to what we described when we requested funding. We’ll discuss outcome and process evaluations in section 23.4.)

The effect of research – again, what we’re going to use our results for – is typically to add to the knowledge and evidence base surrounding our topic. Research can certainly be used for decision-making about programs, especially to decide which program to implement in the first place. But that’s not what results are primarily used for, especially by other researchers.

Execution

Execution is fundamentally the how of our project. What are the circumstances under which we’re running the project?

The program evaluation projects most of us will work on are frequently based in a nonprofit or government agency. Context is extremely important in program evaluation (and program implementation). As most of us will know, these are environments with lots of moving parts. As a result, running controlled experiments is usually not possible, and we sometimes have to be more flexible with our evaluations to work with the resources we actually have and the unique challenges and needs of our agencies. This doesn’t mean that program evaluations can’t be rigorous or use strong research methods. We just have to be realistic about our environments and plan for that when we’re planning our evaluation.

Research is typically a lot more controlled. We do everything we can to minimize outside influences on our variables of interest, which is expected of rigorous research. Of course, some research is extremely controlled, especially experimental research and randomized controlled trials. This all ties back to the purpose, origins, and effects of research versus those of program evaluation – we’re primarily building knowledge and evidence.

In the end, it’s important to remember that these are guidelines, and you will no doubt encounter program evaluation projects that cross the lines of research, and vice versa. Understanding how the two differ will help you decide how to move forward when you encounter the need to assess the effect of a program in practice.

Key Takeaways

Exercises

  1. If you were conducting a research project on the kinesthetic teaching methods that we talked about in this chapter, what is one research question you could study that aligns with the purpose, origins, and effects of research?
  2. Consider the research project you’ve been building throughout this book. What is one program evaluation question you could study that aligns with the purpose, origins, and effects of program evaluation? How might its execution look different than what you’ve envisioned so far?

23.2 Planning your program evaluation

Learning Objectives

Learners will be able to…

Planning a program evaluation project requires just as much care and thought as planning a research project. But as we discussed in section 23.1, there are some significant differences between program evaluation and research that mean your planning process is also going to look a little different. You have to involve the program stakeholders at a greater level than that found with most types of research, which will sometimes focus your program evaluation project on areas you wouldn’t have necessarily chosen (for better or worse). Your program evaluation questions are far less likely to be exploratory; they are typically evaluative and sometimes explanatory.

For instance, I worked on a project designed to increase physical activity for elementary school students at recess. The school had noticed a lot of kids would just sit around at recess instead of playing. As an intervention, the organization I was working with hired recess coaches to engage the kids with new games and activities to get them moving. Our plan to measure the effect of recess coaching was to give the kids pedometers at a couple of different points during the year, and see if there was any change in their activity level as measured by the number of steps they took during recess. However, the school was also concerned with the rate of obesity among students, and asked us to also measure the height and weight of the students to calculate BMI at the beginning and end of the year. I balked at this for several reasons: kids are still growing, BMI isn’t a great measure to use for kids, and some kids were uncomfortable with us weighing them, even though we had parental consent and no other kids would be in the room. However, the school was insistent that we take those measurements, and so we did that for all kids whose parents consented and who themselves assented to have their weight measured. We didn’t think BMI was an important measure, but the school did, so this changed an element of our evaluation.
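To make the pedometer measure a little more concrete, here is a minimal Python sketch of how step counts from two points in the year might be compared. Everything in it is an assumption for illustration: the student IDs and step counts are invented, and the helper name `mean_step_change` is hypothetical. It is not the analysis the project actually ran.

```python
from statistics import mean

# Hypothetical recess step counts keyed by an anonymous student ID.
# All numbers are made up for illustration only.
fall_steps = {"s01": 410, "s02": 650, "s03": 380, "s04": 720}
spring_steps = {"s01": 560, "s02": 640, "s03": 530}  # s04 was not measured in spring


def mean_step_change(before, after):
    """Average change in steps for students measured at both time points."""
    # Only students with a reading at both points can show a change.
    measured_twice = sorted(before.keys() & after.keys())
    if not measured_twice:
        raise ValueError("no students were measured at both time points")
    return mean(after[s] - before[s] for s in measured_twice)


change = mean_step_change(fall_steps, spring_steps)
print(f"Average change in recess steps: {change:+.1f}")
```

A before-and-after comparison like this only describes a change. On its own, it cannot show that recess coaching caused it, which is part of why stakeholder context matters so much when interpreting evaluation results.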

In an ideal world, your program evaluation is going to be part of your overall program plan. This very often doesn’t happen in practice, but for the purposes of this section, we’re going to assume you’re starting from scratch with a program and really internalized the first sentence of this paragraph. (It’s important to note that no one intentionally leaves evaluation out of their program planning; instead, it’s just not something many people running programs think about. They’re too busy… well, running programs. That’s why this chapter is so important!)

In this section, we’re going to learn about how to plan your program evaluation, including the importance of logic models. You may have heard people groan about logic models (or you may have groaned when you read those words), and the truth is, they’re a lot of work and a little complicated. Teaching you how to make one from start to finish is a little bit outside the scope of this section, but what I am going to try to do is teach you how to interpret them and build some evaluation questions from them. (Pro-tip: logic models are a heck of a lot easier to make in Excel than Word.)

The Centers for Disease Control has a great, simple framework for planning your program evaluation project that I’m going to walk through with you below.

It has three primary steps: engaging stakeholders, describing the program and focusing the evaluation.

Step 1: Engaging stakeholders

Stakeholders are the people and organizations that have some interest in or will be impacted by our program. Including as many stakeholders as possible when you plan your evaluation will help to make it as useful as possible for as many people as possible. The key to this step is to listen. However, a note of caution: sometimes stakeholders have competing priorities, and as the program evaluator, you’re going to have to help navigate that. For example, in our kinesthetic learning program, the teachers at your school might be interested in decreasing classroom disruptions or enhancing subject matter learning, while the administration is solely focused on test scores. Here is where it’s a great idea to use your social work ethics and research knowledge to guide conversations and planning. Improved test scores are great, but how much does that actually benefit the students?


Step 2: Describe the program

Once you’ve got stakeholder input on evaluation priorities, it’s time to describe what’s going into the program and what you hope your participants and stakeholders will get out of it. Here is where a logic model becomes an essential piece of program evaluation. A logic model “is a graphic depiction (road map) that presents the shared relationships among the resources, activities, outputs, outcomes, and impact for your program” (Centers for Disease Control, 2018, para. 1). Basically, it’s a way to show how what you’re doing is going to lead to an intended outcome and/or impact. (We’ll discuss the difference between outcomes and impacts in section 23.4.)

Logic models have several key components, which I describe in the list below (CDC, 2018). The components are numbered because of where they come in the “logic” of your program – basically, where they come in time order.

  1. Inputs: resources (e.g. people and material resources) that you have to execute your program.
  2. Activities: what you’re actually doing with your program resources.
  3. Outputs: the direct products and results of your program.
  4. Outcomes: the changes that happen because of your program inputs and activities.
  5. Impacts: the long-term effects of your program.

This video gives you a good introduction to logic models and the ways they can look.
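If it helps to see the five components laid out as a structure, here is a minimal Python sketch of a logic model, filled in for the recess coaching project from earlier in this section. The class name `LogicModel`, the method `as_rows`, and every entry in the example are illustrative assumptions, not the project's actual logic model.

```python
from dataclasses import dataclass, field, fields


@dataclass
class LogicModel:
    """The five components, declared in the order they occur in the program's logic."""
    inputs: list[str] = field(default_factory=list)
    activities: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)
    impacts: list[str] = field(default_factory=list)

    def as_rows(self) -> list[tuple[str, str]]:
        """Return (component, entries) pairs in logic-model order."""
        return [
            (f.name.capitalize(), "; ".join(getattr(self, f.name)))
            for f in fields(self)
        ]


# Hypothetical entries, written only to illustrate each component.
recess_model = LogicModel(
    inputs=["recess coaches", "playground equipment"],
    activities=["coaches lead new games and activities at recess"],
    outputs=["number of coached recess sessions held"],
    outcomes=["more steps taken during recess"],
    impacts=["students stay physically active over the long term"],
)

for component, entries in recess_model.as_rows():
    print(f"{component}: {entries}")
```

Reading the rows from top to bottom follows the same time order as the numbered list: what you have, what you do with it, what that directly produces, what changes as a result, and what lasts.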

The CDC also talks about moderators – what they call “contextual factors” – that affect the execution of your program evaluation. This is an important component of the execution of your project, which we talked about in 23.1. Context will also become important when we talk about implementation science in section 23.3.

Let’s think about our kinesthetic learning project. While you obviously don’t have full information about what the project looks like, you’ve got a good enough idea for a little exercise below.

Step 3: Focus the evaluation

So now you know what your stakeholder priorities are and you have described your program. It’s time to figure out what questions you want to ask that will reflect stakeholder priorities and are actually possible given your program inputs, activities and outputs.

Why do inputs, activities and outputs matter for your question?

Key Takeaways

Exercises