
The Case for Rigorous Comparative Research and Population Impacts in a New Era of Evidence-Based Interventions for Juvenile Offenders


Published on Oct 13, 2020


Whatever conclusions are drawn from, or critiques leveled at, the review by Elliott and colleagues (2020), it is important to acknowledge the larger context in which this work has been carried out. That is to say, it is quite an achievement that we have reached a state of intervention science and implementation science where we can begin to evaluate the merits of alternative models for delivering effective, economically efficient, and humane interventions to improve the life chances of young people who have come in conflict with the law. It was not so long ago that punitiveness in all its varieties was the dominant and default approach to responding to juvenile offenders in the United States (Skeem, Scott, & Mulvey, 2014; Sullivan, 2019). To be sure, we have a long way to go toward ending punitive practices in our juvenile justice systems, let alone reaching some type of system-wide application of the best available research evidence.

In Elliott et al. (2020) we learn that there are two main evidence-based models for guiding juvenile justice interventions. One involves the use of brand-name or commercially developed evidence-based programs. These are characterized by a “coherent package of activities with defined delivery protocols, implementation manuals, training, and technical assistance…” (Elliott et al., 2020, p. 1). Examples of some of these programs include multisystemic therapy (MST) and functional family therapy (FFT). The other model, known as the practice approach, draws upon the results of large-scale meta-analyses that offer a number of generalized strategies (or generics) for improving existing intervention practices. Lipsey’s Standardized Program Evaluation Protocol (SPEP) is a well-known and increasingly widely used example of this model (Lipsey, 2018; Lipsey & Howell, 2012; Lipsey, Howell, Kelly, Chapman, & Carver, 2010).

These models are the subject of the authors’ critical review, which offers insights on how the models compare and contrast with one another on several important dimensions. Importantly, a key takeaway from the review is the need for a program of more rigorous comparative research to assess the different evidence-based intervention models. A sound measure of this research should be whether the different models are able to achieve (and sustain) population impacts. Elliott et al. (2020) give some attention to this matter, with a particular focus on the groundbreaking work being carried out in Washington State (Washington State Institute for Public Policy, 2019). It is argued that population impacts should be the primary goal of evidence-based interventions (Dodge, 2020; Gottfredson et al., 2015).

These are two key issues facing the evidence-based movement in juvenile justice today, and this essay sets out to make the case for greater attention to both. It begins with an overview of prior research on intervention models and discusses opportunities for new research. The next part focuses on the importance of achieving population impacts and how this should guide the scaling-up of evidence-based interventions. The essay ends with some concluding remarks.

Assessing Evidence-Based Intervention Models

Prior to Elliott et al.’s (2020) review, there had been an almost complete absence of research investigating the comparative utility of the two main evidence-based models (or other models; see below) for guiding juvenile justice interventions. In their review of policy reforms for high-risk juvenile offenders, Skeem and colleagues (2014) identify only one study, and a recent search of the literature by this author did not turn up any additional studies. This one known study uses a decision-tree model to investigate the expected value of three comparable programs as they move through different conditions under the two different evidence-based models (Welsh, Rocque, & Greenwood, 2014). Data on the three programs (two brand-name programs and one generic) are drawn from the Washington State Institute for Public Policy’s (WSIPP) inventory of benefit-cost analyses of juvenile justice programs. Results are found to vary according to the different conditions. Under one condition, where brand-name programs have a large advantage in implementation success over generic programs, the brand-name programs have the highest expected values. Under the other condition, which considered the role of the SPEP, all three programs produce highly favorable expected values. The authors conclude that state governments wanting to expand evidence-based juvenile justice interventions should consider the merits of both models. Relatedly, Skeem et al. (2014: 724) note that the two models are “not mutually exclusive—in fact, they can and do inform one another.” This is evident, for example, in the design of the SPEP being informed by brand-name programs.

These findings run counter to those by Elliott et al. (2020). Several factors may account for this. One is the additional data that have become available on brand-name evidence-based programs and the SPEP in the intervening years. Another factor has to do with differences in the operationalization of effectiveness. Elliott and colleagues use what they refer to as a “basic set of standards,” which amounts to four main criteria, to evaluate the effectiveness of the two models. Informed by prospect theory, Welsh et al. (2014) use “expected value,” a function of feasibility (i.e., probabilities of implementation success), monetary costs and benefits, and effects on recidivism.[1] A third factor has to do with the different methodological approaches used in the two studies. To take nothing away from the work by Elliott et al., it needs to be asked whether a more rigorous method for comparing the two models would have altered their findings. The critical (narrative) review is a useful method, but it suffers from serious limitations, including the potential for selective coverage of studies and the absence of standardized criteria for weighing evidence.

The decision-tree method—used by Welsh et al. (2014) and used under more complex scenarios supported by statistical software (e.g., Ortega-Campos, García-García, Gill-Fenoy, & Zaldívar-Basurto, 2016)—offers a more rigorous approach for comparing and contrasting brand-name programs and generic practices. Part of the reason for this is that it allows for the determination of optimal choices under a variety of circumstances, based on a range of assumptions and proclivities. Also, decision-makers have the ability to visualize—often in the form of a flow chart—realistic options confronting the two models (e.g., implementation success) and the possible outputs and outcomes of the options. Future research in this area should consider using the decision-tree method. Different brand-name programs (e.g., MST, FFT) can be compared to various generic strategies operating under a range of conditions. One might even consider replicating Elliott et al.’s review with the decision-tree method.
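To make the logic of the decision-tree method concrete, the sketch below computes the expected value of competing programs as the probability-weighted sum of outcomes across branches (implementation success versus failure). This is only an illustration of the general technique described by Welsh et al. (2014); the program names, probabilities, and dollar figures are entirely hypothetical and are not drawn from the WSIPP inventory.

```python
# Illustrative decision-tree expected-value calculation, in the spirit of
# Welsh et al. (2014). All programs and figures below are HYPOTHETICAL.

def expected_value(p_success, net_benefit_success, net_benefit_failure):
    """Expected value = probability-weighted net benefit across the two
    branches of a simple decision tree (implemented well vs. poorly)."""
    return p_success * net_benefit_success + (1 - p_success) * net_benefit_failure

# Hypothetical per-youth net benefits (benefits minus costs, in dollars)
# under successful and failed implementation, with assumed probabilities
# of implementation success for each program.
programs = {
    "brand_name_A": {"p": 0.80, "success": 25_000, "failure": -5_000},
    "brand_name_B": {"p": 0.75, "success": 20_000, "failure": -4_000},
    "generic_C":    {"p": 0.55, "success": 22_000, "failure": -3_000},
}

for name, x in programs.items():
    ev = expected_value(x["p"], x["success"], x["failure"])
    print(f"{name}: expected value = ${ev:,.0f} per youth")
```

In a fuller analysis, each branch would itself split further (e.g., by treatment fidelity or target population), which is where statistical software becomes useful; the arithmetic at every leaf, however, remains this simple weighted sum.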

Future research in this area should also consider the use of computer simulation modeling. This approach can be useful in providing an instructive understanding of the potential impacts of interventions in situations where the ability to manipulate real-world conditions is otherwise limited (Groff & Mazerolle, 2008). Simulation modeling is a key component of WSIPP’s benefit-cost analysis model of prevention and intervention programs (Drake, Aos, & Miller, 2009).

In addition to the use of more rigorous methods, it will also be important to consider other existing models as part of a program of future research. In their review, Skeem et al. (2014: 726) identify the risk-need-responsivity model (Andrews et al., 1990) as another leading model to “distill generic principles of effective correctional treatment.” Relying on the Correctional Program Assessment Inventory, this model bears many similarities to the SPEP. Another model that merits consideration is Communities That Care (CTC). Described as an operating system to guide community-wide prevention efforts, CTC aims to reduce delinquency and later offending by implementing particular programs—from a menu of strategies—that have demonstrated effectiveness in reducing risk factors and enhancing protective factors (Fagan, Hawkins, Catalano, & Hawkins, 2019). It is most closely associated with the brand-name programs model.

Achieving Population Impacts

It is a longstanding view that achieving population impacts should be the primary goal of evidence-based interventions.[2] This is true whether applied in the context of juvenile justice or other public systems (Fagan et al., 2019). The scope of the population can be wide ranging, including a city, county, state, or country. Importantly, there is a widely agreed-upon scientific process to facilitate evidence-based interventions in trying to attain population impacts. This is known as scaling up, a progression of steps for interventions that involves moving from efficacy trials to community effectiveness trials to broad-scale dissemination (Flay et al., 2005). Now viewed as the “traditional preventive intervention research cycle” (Gottfredson et al., 2015: 895), there is greater recognition today of the utility of other processes to facilitate scaling up evidence-based interventions (see Haskins & Joo, 2017; Schindler, Fisher, & Shonkoff, 2017).

A key criticism of evidence-based interventions, whether they are brand-name programs or generic practices, is the limited research demonstrating their ability to achieve population impacts (see Fagan et al., 2019; Gottfredson et al., 2015; Spoth et al., 2013). Several views are at the root of this criticism. One is that too much attention has been given to usage rates or penetration rates instead of population impacts. For example, Dodge and Mandel (2012) drew attention to this in the context of a study examining how state governments support and promote the use of brand-name programs (Greenwood & Welsh, 2012; see also Welsh & Greenwood, 2015). Noting that rates of use are often low and questioning the utility of the outcome even if usage rates are higher, the authors ask, “What evidence supports the contention that increasing the EBP [evidence-based intervention program] penetration rate will bring improved population-level impact on youth outcomes?” (Dodge & Mandel, 2012: 526).

Another view focuses on the scaling up process itself. A central issue here is the widely established finding that effects will attenuate—sometimes rather substantially—as interventions progress through the stages and are disseminated for wider public use. Loss of treatment fidelity, heterogeneity in target populations, heterogeneity of service providers, and changes in implementation context are among the main factors that contribute to attenuation of effects (Yohros & Welsh, 2019). In a series of works centered on a universal nurse home visitation program for newborns in North Carolina, Dodge (2018; 2020; Goodman, O’Donnell, Murphy, & Dodge, 2019) calls for a move away from the traditional process of scaling and its focus on individual programs and practices. In blunt terms, Dodge (2018: 1118) refers to this as a “scale-up failure.”

It is here that some pushback is warranted. Advancements in the science of implementation over the last two decades are the backbone of facilitating the scaling up (whatever form this takes) of evidence-based interventions to achieve population impacts. Indeed, Fixsen and colleagues (2013: 214) report that the field of implementation science “is on the verge of having evidence-based implementation methods to reliably realize the promise of evidence-based programs in practice.” This is by no means cause for self-congratulation, let alone complacency. Instead, as argued by Fixsen, Fagan, and other leaders of implementation science, what is needed is sustained policy attention—at the state and local levels—and research on the complexities of the implementation of evidence-based interventions (e.g., Fagan, 2017; Fixsen, Blase, & Fixsen, 2017). This includes some of the finer points raised by Dodge and others, including various system-context issues (e.g., Dodge, 2018; Gottfredson et al., 2015). But none of this should take away from the present state and future of what the science of implementation holds for scaling up evidence-based juvenile justice interventions and attaining impacts at the population level.

It also needs to be acknowledged that efforts in several states to support and promote evidence-based juvenile justice programs and practices are in fact beginning to realize population impacts. This is a particularly important feature of the work by Elliott and colleagues (2020). While the authors acknowledge that this is not yet happening at the national level in the U.S., they are on firm ground in profiling the efforts in Washington State—and this goes beyond the impressive results that the state has been able to achieve so far.

The Washington State success story started with a transparent and rigorous research-informed process for the selection and field-testing of several brand-name evidence-based programs for juvenile offenders (e.g., MST, FFT), as well as the development of some of the state’s own programs that have since been certified as evidence-based. Guided by state legislation and working with a wide range of practitioners and other stakeholders, the state put in place several initiatives to facilitate the expansion of these programs across the state. For instance, as a way to help maintain treatment fidelity, several quality assurance procedures and mechanisms were established, including a state-level quality assurance committee that works closely with similar committees at the county level (Welsh & Greenwood, 2015). Impressively, in recent years, the state has made similar efforts to scale up evidence-based interventions in other systems that affect children and youth, including child welfare and mental health.

Concluding Remarks
Major advancements in intervention science and in implementation science, not to mention some highly successful and promising developments in scaling evidence-based interventions at the state level, have helped to usher in what can be viewed as a new era of evidence-based juvenile justice in this country. The review by Elliott and colleagues (2020) captures some of this work, and assesses whether a brand-name program or practice approach should be guiding evidence-based interventions for juvenile offenders in the years ahead. Conducting rigorous comparative research on the different models of evidence-based interventions and continually striving to achieve population impacts will go a long way toward better understanding whether one approach or both is what we need. Importantly, this is not just a research exercise. As the experiences in Washington State demonstrate, real and meaningful change is taking place. It may not be happening as quickly or on as large a scale as many concerned parties would like and as the needs demand. But this can improve. It begins by building on the totality of the advancements that have been made so far and a renewed political will.

References
Andrews, Donald A., Ivan Zinger, Robert D. Hoge, James Bonta, Paul Gendreau, and Francis T. Cullen. 1990. Does correctional treatment work? A clinically relevant and psychologically informed meta-analysis. Criminology, 28, 369-404.

Dodge, Kenneth A. 2018. Toward population impact from early childhood psychological interventions. American Psychologist, 73, 1117-1129.

Dodge, Kenneth A. 2020. Annual research review: Universal and targeted strategies for assigning interventions to achieve population impact. Journal of Child Psychology and Psychiatry, 61, 255-267.

Dodge, Kenneth A., and Adam D. Mandel. 2012. Building evidence for evidence-based policy making. Criminology & Public Policy, 11, 525-534.

Drake, Elizabeth K., Steve Aos, and Marna G. Miller. 2009. Evidence-based public policy options to reduce crime and criminal justice costs: Implications in Washington State. Victims & Offenders, 4, 170-196.

Elliott, Delbert S., et al. XXXX. 2020. Evidence-based juvenile justice programs and practices: A critical review. Criminology & Public Policy, 19, XXX-XXX.

Fagan, Abigail A. 2017. Illuminating the black box of implementation in crime prevention. Criminology & Public Policy, 16, 451-455.

Fagan, Abigail A., Brian K. Bumbarger, Richard P. Barth, Catherine P. Bradshaw, Brittany Rhoades Cooper, Lauren H. Supplee, et al. 2019. Scaling up evidence-based interventions in US public health systems to prevent behavioral health problems: Challenges and opportunities. Prevention Science, 20, 1147-1168.

Fagan, Abigail A., J. David Hawkins, Richard F. Catalano, and David P. Farrington. 2019. Communities That Care: Building community engagement and capacity to prevent youth behavior problems. New York: Oxford University Press.

Fixsen, Dean L., Karen A. Blase, Allison Metz, and Melissa van Dyke. 2013. Statewide implementation of evidence-based programs. Exceptional Children, 79, 213-230.

Fixsen, Dean L., Karen A. Blase, and Amanda A. M. Fixsen. 2017. Scaling effective innovations. Criminology & Public Policy, 16, 487-499.

Flay, Brian R., Anthony Biglan, Robert F. Boruch, Felipe González-Castro, Denise C. Gottfredson, Sheppard Kellam, et al. 2005. Standards of evidence: Criteria for efficacy, effectiveness and dissemination. Prevention Science, 6, 151-175.

Goodman, W. Benjamin, Karen O’Donnell, Robert A. Murphy, and Kenneth A. Dodge. 2019. Moving beyond program to population impact: Toward a universal early childhood system of care. Journal of Family Theory and Review, 11, 112-126.

Gottfredson, Denise C., Thomas D. Cook, Frances E. M. Gardner, Deborah Gorman-Smith, George W. Howe, Irwin N. Sandler, et al. 2015. Standards of evidence for efficacy, effectiveness, and scale-up research in prevention science: Next generation. Prevention Science, 16, 893-926.

Greenwood, Peter W., and Brandon C. Welsh. 2012. Promoting evidence-based practice in delinquency prevention at the state level: Principles, progress, and policy directions. Criminology & Public Policy, 11, 493-513.

Groff, Elizabeth R., and Lorraine Mazerolle. 2008. Simulated experiments and their potential role in criminology and criminal justice. Journal of Experimental Criminology, 4, 187-193.

Haskins, Ron, and Nathan Joo. 2017. Tiered evidence: What happens when evidence-based teen pregnancy programs are scaled-up to new sites? Working paper. Washington, DC: Brookings Institution.

Homel, Ross, and Paul Homel. 2014. Implementing crime prevention: Good governance and a science of implementation. In Brandon C. Welsh and David P. Farrington (Eds.), The Oxford handbook of crime prevention. New York: Oxford University Press.

Kahneman, Daniel, and Amos Tversky. 1979. Prospect theory: An analysis of decision under risk. Econometrica, 47, 263-292.

Lipsey, Mark W. 2018. Effective use of the large body of research on the effectiveness of programs for juvenile offenders and the failure of the model programs approach. Criminology & Public Policy, 17, 189-198.

Lipsey, Mark W., and James C. Howell. 2012. A broader view of evidence-based programs reveals more options for state juvenile justice systems. Criminology & Public Policy, 11, 515-523.

Lipsey, Mark W., James C. Howell, Marion R. Kelly, Gabrielle Chapman, and Darin Carver. 2010. Improving the effectiveness of juvenile justice programs: A new perspective on evidence-based practice. Washington, DC: Center for Juvenile Justice Reform, Georgetown University.

Ortega-Campos, Elena, Juan García-García, Maria José Gill-Fenoy, and Flor Zaldívar-Basurto. 2016. Identifying risk and protective factors in recidivist juvenile offenders: A decision tree approach. PLoS ONE, 11(9), 1-16 (e0160423).

Schindler, Holly S., Philip A. Fisher, and Jack P. Shonkoff. 2017. From innovation to impact at scale: Lessons learned from a cluster of research-community partnerships. Child Development, 88, 1435-1446.

Skeem, Jennifer L., Elizabeth Scott, and Edward P. Mulvey. 2014. Justice policy reform for high-risk juveniles: Using science to achieve large-scale crime reduction. Annual Review of Clinical Psychology, 10, 709-739.

Spoth, Richard, Louise A. Rohrbach, Mark Greenberg, Philip Leaf, C. Hendricks Brown, Abigail A. Fagan, et al. 2013. Addressing core challenges for the next generation of type 2 translational research and systems: The translation science to population impact (TSci Impact) framework. Prevention Science, 14, 319-351.

Sullivan, Christopher J. 2019. Taking juvenile justice seriously: Developmental insights and system challenges. Philadelphia, PA: Temple University Press.

Tversky, Amos, and Daniel Kahneman. 1986. Rational choice and the framing of decisions. Journal of Business, 59(4), S251-S278.

Washington State Institute for Public Policy. 2019. Updated inventory of evidence-based, research-based, and promising practices for prevention and intervention services for children and juveniles in the child welfare, juvenile justice, and mental health systems. Olympia, WA: Washington State Institute for Public Policy.

Welsh, Brandon C., and Peter W. Greenwood. 2015. Making it happen: State progress in implementing evidence-based programs for delinquent youth. Youth Violence and Juvenile Justice, 13, 243-257.

Welsh, Brandon C., Michael Rocque, and Peter W. Greenwood. 2014. Translating research into evidence-based practice in juvenile justice: Brand-name programs, meta-analysis, and key issues. Journal of Experimental Criminology, 10, 207-225.

Yohros, Alexis, and Brandon C. Welsh. 2019. Understanding and quantifying the scale-up penalty: A systematic review of early developmental preventive interventions with criminological outcomes. Journal of Developmental and Life-Course Criminology, 5, 481-497.

[1] It is beyond the scope of this essay to provide a full description of prospect theory, as well as how it informs the expected value of choices. For prospect theory, interested readers should consult Kahneman and Tversky (1979) and Tversky and Kahneman (1986). For more details on the components of and formula for calculating expected value, see Welsh et al. (2014: 214-216).

[2] Generally speaking, achieving population impacts also implies some effort directed at sustaining or maintaining those impacts for an indeterminate period of time. However, the case can be made that these represent two distinct goals, and the process of achieving population impacts in no way guarantees the ability to sustain them. As important as this distinction may be, this essay adopts the approach taken in Elliott et al. (2020) of focusing solely on the process of achieving population impacts. For sustaining population impacts, interested readers should consult Homel and Homel (2014).
