
If the Face Fits: Predicting Future Promotions from Police Cadets’ Facial Traits

Published on Jul 04, 2023

Abstract

Objective: To evaluate the relationship between police cadets’ facial traits and their subsequent promotional success.

Methods: Using archival police academy photographs, we conduct a two-phase experiment to evaluate the impact of facial traits on future promotional success. First, respondents (n = 507) view randomly selected photographs of cadets (observations = 15,669) and evaluate them for facial traits and perceived leadership ability. Second, respondents are presented with random dyads of differentially promoted recruits and choose the one they perceive to have the greater leadership ability. We compare those leadership evaluations to the subsequent promotional success of the cadets featured in the photographs (observations = 5,739). We employ Bayesian multilevel modeling throughout both phases.

Results: Facial traits are the primary driver of respondents’ perceptions of leadership ability, and those perceptions successfully predict promotional success later in the cadets’ careers. When selecting for leadership potential based on police cadet photographs, respondents predict correct promotional choices at levels well above chance, as measured by an AUC score of .70. Further, respondents’ evaluations successfully discriminate both between no promotion and lieutenant promotion, and between sergeant and lieutenant promotions.

Conclusions: Promoting the most capable police officers is a critical feature of public service. Our findings cast a degree of doubt on the purportedly meritocratic foundations of police promotion and selection. Extra-legal information, such as facial features, predicts later promotional success.

“We look at a person and immediately a certain impression of his character forms in us. A glance…[is] sufficient to tell us a story about a highly complex matter” (Asch, 1946, p. 520).

Introduction

Good policing requires good police officers, who should, in turn, be led by the best-qualified leaders. The promotion systems of policing agencies are critical when providing professionalized policing services. The wide variety of promotional systems find foundation in the ideal—whether implicit or explicit—of selecting the most qualified and successful officers to lead. We entrust promoted officers with powers that include decisions about agency missions, personnel, reasonable risk, and cost/benefit tradeoffs in policy formulation. Promotion to any rank is relatively rare, which makes this step all the more important; most officers never attain any form of promotion, and promotions remain a contested area of police service that significantly impacts law enforcement personnel and agency operations (Archbold & Schulz, 2008; Whetstone, 2001). Consequently, agencies dedicate considerable resources to design meritocratic promotion systems and to identify the best-qualified candidates (Drew & Saunders, 2020).

Successful promotion processes, which identify and select the most qualified candidates, result in better policing; moreover, police agencies better attain their goals (Drew & Saunders, 2020; Savery, 1994; Shjarback & Todak, 2019). Promotion to sergeant, lieutenant, and command positions is ostensibly rooted in meritocratic principles, such as the evaluation of law enforcement capability. Indeed, Bittner (1990, p. 239) contends that ability should override all other considerations when selecting leaders in policing: “Career advancement in departments is heavily determined by an officer’s show of initiative and ability in law enforcement…[these attributes] weigh more heavily in his favor when it comes to assessing his overall performance than any other factor.”

Promotional systems hinge on the notion that tests are an effective way to identify leadership capability. However, there is reason to believe that assessed merit is not entirely responsible for how officers rise through the ranks. For example, it is well documented that people are often quick to ascribe qualities of mind and character to individuals based solely on their physical appearance (Berggren et al., 2010; Chiao et al., 2008; Lawson et al., 2010; Sandrin et al., 2022), and facial attributes were the central focus of a handful of impactful studies of military promotion. These foundational experimental studies demonstrated that respondents could accurately predict military academy recruits' later promotional success based on photographs alone, thereby affirming facial characteristics as an extra-professional factor in military promotion (Mazur et al., 1984; Mueller & Mazur, 1996a). Given the quasi-militaristic structure of policing, we investigate whether similar extra-professional factors—specifically, facial traits—affect police promotions. We do this by replicating the above military-oriented experiments in a police setting.

As with similar experiments, we present non-police respondents with a random selection of photographs of police recruits and ask them to rate facial attributes, leadership potential, and future promotional success across two separate study phases. Despite the high stakes and rigorous assessment efforts commonly found in police leadership selection, neutral survey respondents can predict future promotional success based only on a static image of a police cadet and at rates not explainable by chance alone. For example, in a test of promotional success between unpromoted officers and those eventually promoted to lieutenant (a command-level position in most agencies), respondents correctly distinguished between the two groups approximately 70% of the time.

While there have been parallel studies in various fields, this is the first study focusing on the effect of facial traits on police promotion. We assess the impact of perceived dominance, trustworthiness, attractiveness, and masculinity of police officers to arrive at facial trait scores. These scores represent the perceived leadership ability of law enforcement officers who are currently working in a large, capital city police department. Our results demonstrate that respondents infer leadership ability from their perceptions of specific facial traits. A model of facial characteristics has a higher predictive capability than competing models utilizing only respondent demographics, personality traits, and other physical non-facial trait observations of the officers being rated (e.g., wearing glasses, balding, sex, minority status).

In the second analysis, we find that respondents can accurately impute leadership ability from facial traits displayed in police academy graduation headshots. We show that respondents can predict with approximately 70% accuracy which officers would be promoted to leadership positions in later decades. These predictions are based solely on viewing police academy photographs taken decades earlier, with no additional information on the professional identity of the recruits. Our results therefore call into question the presumed meritocratic foundations of the police promotional process.

Literature Review

Prospect theory provides a theoretical scaffold to explain the role of ascribed attribution in decision making. Building on earlier work (Kahneman & Tversky, 1982, 1984; Tversky & Kahneman, 1974), Kahneman identifies “System 1” (fast) thinking and “System 2” (slow) thinking as the two main ways in which people tend to form thoughts, which are the foundation of many types of social and private behaviors. System 1 thinking operates automatically and quickly, with little effort and a scant sense of conscious control. System 2 thinking is much more cognitively demanding and is associated with the subjective experience of sustained concentration and ultimate calculative choice. In Kahneman’s words, “When we think of ourselves, we identify with System 2, the conscious, reasoning self that has beliefs, makes choices, and decides what to think” (p. 21).

Mazur (1985) offers an auxiliary explanation founded in nature. Virtually all social species engage in leadership designation. Typically, as leader, a single individual has ultimate control over group actions and their followers proffer this leader their due deference. The followers have a sense, based on appearance and demeanor, that the leader is better suited to lead and make decisions (Lovrich et al., 2018).

Though Kahneman identifies many System 1 heuristic types, we focus our study on the phenomenon of leadership selection based on an individual’s facial traits. Humans use facial traits to make judgments about competence (Antonakis & Dalgas, 2009; Todorov et al., 2005), trustworthiness (Linke et al., 2016), intelligence, honesty (Bull et al., 1983), and dominance (Berinsky et al., 2019; Bull et al., 1983; Chiao et al., 2008; Ferguson et al., 2019; Fruhen et al., 2015). These judgments correlate with perceived leadership ability and with assessments of candidate leadership potential (Van Vugt & Grabo, 2015) such that human leadership choices can often be predicted based on particular facial image cues (Mazur et al., 1984; Mazur & Mueller, 1996; Mueller & Mazur, 1996).

Facial Traits, Leadership Perceptions, and Promotion

Mazur and Mueller completed foundational studies involving the impact of facial characteristics on administrative leadership selection (Mazur et al., 1984; Mazur & Mueller, 1996; Mueller & Mazur, 1996b). They collected data for the West Point class of 1950 and, in 1989, sent a questionnaire to 1950 West Point class members. The questionnaire asked about their parental background, family tradition of military service, pre-academy formative experiences and adolescent character, academy-period performance, post-academy service postings and advanced educational opportunities, career placements, and also about their record of promotion up the general officer ranks. Following an analysis of all the plausible factors that might explain promotion beyond field grade ranks, “facial dominance” – an ascribed trait measured from cadet portraits taken 20+ years earlier – significantly predicted advancements among the West Point Academy graduates of 1950 (Mazur et al., 1984; Mazur & Mueller, 1996; Mueller & Mazur, 1996a). This seminal set of studies spurred additional inquiries linking facial dominance to a perceived ability to lead, with subsequent studies indicating that facially dominant individuals were more likely to be judged as leaders in various settings (Berinsky et al., 2019; Ferguson et al., 2019; Fruhen et al., 2015; Van Vugt & Grabo, 2015). Later studies suggest that other facial cues, such as perceived competence, are crucial in evaluating prospective leaders (Todorov et al., 2005).

Our approach is similar to the studies of Mazur and Mueller (1984; 1996; 1996b), as well as two other sets of experimental research (Antonakis & Dalgas, 2009; Chiao et al., 2008). Antonakis and Dalgas (2009) reported two related experiments using facial photographs of pairs of competing candidates in the 2002 French parliamentary elections. In the first experiment, the researchers presented facial images of the two finalists in these elections to Swiss undergraduate students. They found that the candidate who was judged “most competent in appearance” (as selected by respondents) was the actual winner in 72% of parliamentary elections. In a second experiment, Swiss children aged 5 to 13 were shown the same French election photos and were asked, “Which of these two persons would be the better captain for a voyage from Troy to Ithaca?” In this case, their candidate preferences coincided with those of French parliamentary voters 71% of the time. Chiao et al. (2008) similarly found that voters were more likely to vote for candidates judged to have facial traits of perceived competence. In their study, university students were presented with pairs of headshots of political candidates with whom they were unfamiliar, and proved significantly more likely to vote for the candidates (male or female) whom they perceived to be more competent.

Various other studies support the notion that perceived competence impacts perceptions of leadership. Todorov et al. (2005) found that, of three qualities of character ascribed to people based solely on facial traits analyzed in their study (competence, trust, and likability), competence (conceptualized as innate intelligence and leadership potential for inspiring followership) most highly predicted election outcomes, with an accuracy rate of approximately 70 percent (see also Banducci et al., 2008; Lawson et al., 2010). Other facial traits that have been found to impact perceptions of leadership ability include attractiveness (Bull et al., 1983; Chiao et al., 2008; Fruhen et al., 2015; Rosar et al., 2008; Van Vugt & Grabo, 2015), approachability (Chiao et al., 2008), intelligence (Bull et al., 1983; Mazur et al., 1984; Todorov et al., 2005), masculinity as exemplified through facial hair (Bauer & Carpinella, 2018; Wehner et al., 2015), facial shape (Re et al., 2013), criminality (Rockey et al., 2022), and honesty (Bull et al., 1983).

Facial Traits in Policing Promotions

The above discussion raises an interesting question for the current study, which examines how facial traits influence administrative leadership selection within a police agency. Using facial traits as a heuristic to assess leadership ability likely occurs in numerous diverse settings, and an anchoring effect is likely to occur, even in information-rich leadership selection processes. A long line of evidence shows that use of facial cues as a heuristic does occur. This effect is most strongly evident in the case of voting in elections (Antonakis & Dalgas, 2009; Berggren et al., 2010; Berinsky et al., 2019; Bull et al., 1983; Chiao et al., 2008; Ferguson et al., 2019; Lawson et al., 2010; Little et al., 2007; Rosar et al., 2008; Todorov et al., 2005, 2015; Van Vugt & Grabo, 2015). There is also evidence that this effect manifests, albeit to a lesser degree, when selecting individuals for administrative leadership positions (Fruhen et al., 2015; Linke et al., 2016; Mazur et al., 1984; Mazur & Mueller, 1996; Mueller & Mazur, 1996a). There is, however, a key difference between voters and the promotion process: Voters make election decisions with low to moderate amounts of relevant information, whereas elevation to administrative leadership positions within police agencies is typically determined through a codified examination process, and should therefore be less subject to preconceptions, especially since this kind of promotion process can be found in most state, county, and municipal police agency settings throughout the U.S.

Despite evidence that facial traits can impact promotion in the military and, equally, can affect election outcomes, there have been no studies to date examining the effect of facial traits on police promotions. Generally speaking, promotion in policing relies on prescribed systems wherein agencies assess candidates based on a variety of criteria: previous job evaluations, academic and professional education, performance on written examinations testing knowledge of laws, policies, and procedures, performance before interview panels, and practical problem-solving abilities as demonstrated in management exercises and assessment centers (Arthur Jr. et al., 2003; Police Executive Research Forum, 2018; Savery, 1994). Moreover, applying for police leadership positions is normally not permitted until a certain level of seniority has been obtained. The requirement for time-in-service before a promotion indicates a procedural intent to ‘anchor’ past job performance and establish a professional reputation based on mastery of the street-level demands of the job of law enforcement. Given the breadth and rigor of these job-relevant, merit-oriented selection processes, facial traits should not be highly predictive of promotional success within police agencies. If they are predictive, then it would raise serious questions about the equity of promotion and, necessarily, about the ability of those chosen to lead. Given that past research on military promotions suggests that facial traits predict promotional success for administrative leadership positions (Mazur et al., 1984; Mazur & Mueller, 1996; Mueller & Mazur, 1996a), it certainly seems worthwhile to test whether this finding holds true in the context of contemporary policing.

Hypotheses

The research literature reviewed above guides two main hypotheses, which we test across two phases of this study.

H1: Facial traits of police cadets will predict perceived leadership ability better than other observable traits of the pictured cadet, or self-assessed and observable traits of the respondent.

H2: Respondents’ perceptions of cadets’ leadership potential will predict later promotion to sergeant and lieutenant ranks during the cadets’ careers, at a rate greater than chance alone would obtain.

Study Design and Procedure

We use an online Qualtrics survey format to present archival photographs of police recruits to respondents. In both study phases, we rely on experimental designs to evaluate the causal effects. In the first task, respondents were presented with a random selection drawn from 98 photographs and asked to rate the facial traits (attractiveness, trustworthiness, distinctiveness, dominance, masculinity, and secureness) and the leadership ability of the pictured recruit. In the second task, participants were presented with dyads drawn at random from 88 pairs of photographs and asked to identify which of the two people pictured would make a better leader. In both tasks, the respondents were not aware they were rating police recruits; moreover, the headshots of the recruits did not include any police markings or paraphernalia.

We first model competing explanations of perceived leadership. In this phase we demonstrate that perceptions of leadership are primarily driven by observations of attractiveness, trustworthiness, distinctiveness, dominance, masculinity, and secureness (i.e., confidence). In the second phase of analysis (detailed later), we link the respondent’s choice of “better leader” to actual promotional records. This phase demonstrates that respondents can accurately discriminate between 1) promoted versus non-promoted officers, and 2) higher levels of promotion (lieutenant) versus lower levels (sergeant), at a rate far greater than chance alone would yield.

Participants

Data were collected between March 8 and April 11, 2019, using a Qualtrics survey panel. Previous research has established Qualtrics panels as the most demographically and politically representative peer service offering large online samples (Boas et al., 2020). The authors specified the following parameters for the recruited panel: respondents must be 18+ years old, reside in the United States, and currently be enrolled in a 4-year college/university. The Qualtrics panel design allows researchers to set demographic parameters. Our race/ethnicity targets were based on population parameters provided by the American Council on Education (2022): White 52%; Black 15.2%; Hispanic 19.8%; Asian 5.7%; Pacific Islander 0.4%; and Native American 0.8%. A total of 507 respondents comprised the final sample, and we report the descriptive results of the sample in Table 1.

We controlled for whether respondent demographics played a role in perceptions of leadership qualities. We examined a variety of respondent demographics including the survey participants’ gender, their self-reported race/ethnicity, and their political party affiliation. We also examined the extent to which survey respondents’ personality traits affected their leadership potential evaluations, as research has demonstrated respondent and target traits are correlated in complicated ways (Sacco & Brown, 2018). On a series of 7-point Likert-type scales, respondents were asked to self-assess the following core [“big five”] personality traits: openness, conscientiousness (i.e., self-discipline), extraversion, agreeableness (i.e., warmth), and anxiousness.

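| Characteristic | Category | Percent |
|---|---|---|
| Gender | Male | 41% |
| | Female | 58% |
| | Other | 1% |
| Race/Ethnicity | White | 60% |
| | Black | 16% |
| | Native American | 1% |
| | Asian | 8% |
| | Hispanic | 14% |
| | Pacific Islander | 1% |
| | Other | 1% |
| Political Party | Dem | 50% |
| | Rep | 22% |
| | Ind | 26% |
| | Other | 2% |

| Self-Assessed Personality Traits | Min | Max | Mean | SD |
|---|---|---|---|---|
| Open | 1 | 7 | 5.28 | 1.28 |
| Conscientiousness/Self-Discipline | 1 | 7 | 5.30 | 1.38 |
| Extrovert | 1 | 7 | 3.96 | 1.90 |
| Agreeableness/Warm | 1 | 7 | 5.33 | 1.35 |
| Anxiousness | 1 | 7 | 4.41 | 1.72 |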

Table 1: Respondent Descriptive Statistics (n=507)

Procedure Overview

We first asked each respondent to rate a random selection of police recruit photographs. While viewing the photographs, survey respondents judged the facial images presented to them and rated the relative attractiveness, trustworthiness, distinctiveness, dominance, masculinity, and secureness of the pictured recruit. Following those ratings, the respondents rated the leadership ability of the pictured person. A total of 15,905 observations were collected during this survey phase. Respondents varied in the number of photographs they saw, rating between 2 and 63 photographs (mean=30.91, sd= 3.13). Missing values equated to approximately one percent of the total data, which meant that 236 cases were removed from the analysis. This exclusion reduced the total number of observations to 15,669.

We presented respondents with randomly selected dyads of archival cadet photographs in the second task. Respondents were asked, “Which of these two individuals would make a better leader?” Respondents were provided with a random selection drawn from 88 sets of randomly-paired facial images. Respondents varied in the number of dyads they saw, rating between 2 and 24 (mean=11.36, sd=3.83). All paired images met one of two criteria: (1) an officer who was not promoted was compared to an officer who had been promoted to the rank of lieutenant; or (2) someone who had been promoted to the rank of sergeant was compared to someone who had been promoted to the rank of lieutenant. Ranks above lieutenant were not included due to data sparsity concerns. We relied on promotional status information provided by the participating agency.

Recruit Photographs

We collected approximately 500 graduation photos of police officers from a capital city police agency’s recruit academy. These images came from academy graduating classes between 1988 and 2008 and had no names or records attached. We took a stratified random sample of 100 of the images, overweighting to ensure that enough photographs of officers who were eventually promoted to sergeant or lieutenant were included. Two of the images from the stratified sample were discarded for unusable image quality, leaving us with a deployed photograph sample of n=98.
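The stratified draw described above can be sketched in a few lines. This is only an illustration: the pool composition, the per-stratum quotas, and the names `stratified_sample`, `pool`, and `quotas` are hypothetical, since the text does not report how many images fell into each promotion stratum or how heavily the promoted strata were overweighted.

```python
import random


def stratified_sample(pool, quotas, seed=0):
    """Draw a fixed number of photo IDs, without replacement, from each stratum.

    pool   -- dict mapping photo ID -> promotion stratum
    quotas -- dict mapping stratum -> number of photos to draw
    """
    rng = random.Random(seed)
    sample = []
    for stratum, k in quotas.items():
        ids = [photo_id for photo_id, rank in pool.items() if rank == stratum]
        sample.extend(rng.sample(ids, k))
    return sample


# Hypothetical pool of 500 photo IDs; the stratum sizes are invented.
pool = {i: "none" for i in range(440)}
pool.update({i: "sergeant" for i in range(440, 475)})
pool.update({i: "lieutenant" for i in range(475, 500)})

# Hypothetical quotas that overweight the two promoted strata.
quotas = {"none": 64, "sergeant": 18, "lieutenant": 18}

sample = stratified_sample(pool, quotas)
```

Fixing the quota per stratum, rather than sampling proportionally, is what guarantees that rare outcomes such as lieutenant promotions appear often enough to form comparison pairs.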

In the Qualtrics online survey, each of the 507 respondents was presented with randomly selected headshots from the collection of 98 images. In the second task, respondents were presented with randomly selected dyads drawn from 88 pairings of the same images of police recruits. These photographs are headshots that do not depict any equipment or accoutrements that would identify the person pictured as a police officer. Examples of the photographs can be found in the appendix materials.

The dimensions we sought to capture through coding are dimensions discussed in the literature on the role of image in shaping job promotions (for example, see Mueller & Mazur, 1996b). Each image was independently coded for key characteristics: perceived gender (male or female); hair (full head of hair or bald); wearing glasses (yes or no); facial hair (yes or no); smile (full smile, half smile, no smile); and perceived race (Asian/Pacific Islander, Black, Native American, or White). Most police academy graduates are within a generation of each other, so we did not code for the perceived age of police academy graduates in our image analysis, though we report the age of cadets (at time of photograph) in the interest of full reporting. Table 2 gives the descriptive statistics for the images used in the study.

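| Characteristic | Mean | Std. Dev. |
|---|---|---|
| Age | 27.8 | 4.0 |

| Characteristic | Category | N | Pct. |
|---|---|---|---|
| Female | No | 87 | 88.8 |
| | Yes | 11 | 11.2 |
| Minority | No | 81 | 82.7 |
| | Yes | 17 | 17.3 |
| Facial Hair | No | 85 | 86.7 |
| | Yes | 13 | 13.3 |
| Glasses | No | 91 | 92.9 |
| | Yes | 7 | 7.1 |
| Smile | No | 45 | 45.9 |
| | Half | 8 | 8.2 |
| | Yes | 45 | 45.9 |
| Bald | No | 95 | 96.9 |
| | Yes | 3 | 3.1 |
| Promotion | None | 62 | 63.3 |
| | Sergeant | 18 | 18.4 |
| | Lieutenant | 18 | 18.4 |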

Table 2: Recruit Photograph Descriptives (n=98)

Given changes in photographic technology between 1988 and 2008, the quality of the police academy graduation photos was not uniform. Earlier photos were captured on film and later digitized; more recent academy graduate photos relied on high-quality digital cameras, which produced clearer, crisper images. Since image quality could impact respondent scores, two data analysts additionally rated image quality in terms of image darkness, sharpness of the image (i.e., the level of pixelation between the facial image and the background, known as the “edge”), and the clarity of the face shown in the image. These analysts focused on quality differences between two comparison images, scoring the pairing as either “0” (image #1 not better than image #2) or “1” (image #1 better than image #2).

Two coders were chosen to independently evaluate every image used in our analysis and to code a dataset for the characteristics noted above (see Chacón Moscoso et al., 2019). To reduce or eliminate coding bias, the coders were not informed of the study’s objective (namely, to determine whether there was a relationship between candidate image and likelihood of promotion). The coders were instructed on what and how to code the image characteristics noted above and were given a post-instructional coding exercise of 50 images to determine whether they had mastered our coding methodology. At this point, we felt comfortable having the coders independently code all images for the characteristics noted above. The coders conducted their tasks in a controlled office environment, observing images (with image identification numbers) on computer screens and independently coding datasets for the listed characteristics.

With the coded data in hand, we calculated the percent agreement between the two coders for each dimension. In no instance did the percent agreement between coders fall below .90. In other words, the coders were in agreement at least 90% of the time for each dimension coded. Given this level of agreement, we did not take the further step of calculating kappa statistics, which would have been in the range of .90-.95 had we taken this interesting, yet, in this instance, unnecessary step (see Belur et al., 2021; McHugh, 2012).
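For readers unfamiliar with the two reliability measures mentioned above, the following minimal sketch shows how percent agreement and Cohen's kappa are computed for two coders on one dimension. The label vectors are hypothetical and do not come from the study's coding sheets; the function names are chosen for illustration.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which the two coders assigned the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each coder's marginal label proportions.
    p_expected = sum(
        counts_a[label] * counts_b[label] for label in set(a) | set(b)
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codes for one dimension (e.g., facial hair) across 98 images.
coder_1 = ["yes"] * 12 + ["no"] * 86
coder_2 = ["yes"] * 10 + ["no"] * 88

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
```

Because kappa discounts the agreement expected by chance, it can fall below raw percent agreement when one category dominates, as it does for most of the coded image characteristics.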

Measures

Dependent variables

In the first task, we conducted a single-frame analysis. We presented respondents with photographs randomly selected from the pool of 98 photographs described earlier. The dependent variable of central interest is an overall assessment of perceived leadership ability. This leadership rating is based on asking survey respondents, “What is the likelihood that this person would be a good leader?” Respondents recorded their judgment on a 7-point Likert-type scale, ranging from “extremely likely” (1) to “extremely unlikely” (7). Appendix Photograph 2 shows an example of this first task.

In the second task, we presented respondents with a dyad containing two cadet photographs and asked them to select which of the two “would make a better leader.” We subsequently compared the selections in this task to the real-world promotional success and level (i.e., officer versus sergeant versus lieutenant) of the pictured cadets. Appendix Photograph 3 shows an example of this task.
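The comparison between dyad selections and promotion records can be illustrated with a simple tally of how often the chosen cadet is the one who attained the higher rank. This is a sketch under stated assumptions, not the Bayesian multilevel analysis the study reports: the tuple layout, the rank labels, and the example choices are all hypothetical.

```python
# Ordering of promotion outcomes used to decide which cadet ranks higher.
RANK_ORDER = {"none": 0, "sergeant": 1, "lieutenant": 2}


def share_correct(choices):
    """Proportion of dyads in which the respondent picked the higher-ranked cadet.

    choices -- list of (rank_left, rank_right, picked), picked in {"left", "right"}
    """
    correct = 0
    for rank_left, rank_right, picked in choices:
        if RANK_ORDER[rank_left] == RANK_ORDER[rank_right]:
            raise ValueError("each dyad must pair differently promoted cadets")
        higher = "left" if RANK_ORDER[rank_left] > RANK_ORDER[rank_right] else "right"
        correct += picked == higher
    return correct / len(choices)


# Hypothetical responses to four dyads.
example_choices = [
    ("none", "lieutenant", "right"),
    ("lieutenant", "sergeant", "left"),
    ("sergeant", "lieutenant", "left"),
    ("lieutenant", "none", "left"),
]

accuracy = share_correct(example_choices)
```

A value of 0.5 corresponds to guessing, so any sustained departure above that level is what the second hypothesis tests for.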

Independent Variables

The primary independent variables in this analysis are the perceived facial characteristics of the photographed police cadets. Respondents were asked to rate each cadet’s facial image with respect to attractiveness, trustworthiness, distinctiveness, dominance, masculinity, and secureness (i.e., confidence). All these ratings were structured as a uniform 7-point Likert-type scale, with an anchoring description at each scalar pole (i.e., “unattractive” to “attractive”, rated as “extremely” at each pole). An example of this task is pictured in Appendix Photograph 1. A correlation matrix was used to confirm the distinctiveness of these five personality traits (Appendix Figure 2). As we expected, there is little correlation between any of the five commonly assessed personality traits. We conducted a principal components analysis with direct oblimin rotation of the five personality traits, and our results indicate different factors for each attribute within our subject population. Descriptive statistics for respondent ratings of individual photographs for leadership and facial traits are reported in Table 3.

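| Dimension | Min | Max | Mean | SD |
|---|---|---|---|---|
| Leadership | 1 | 7 | 4.37 | 1.44 |
| Facial Traits | | | | |
| Attractiveness | 1 | 7 | 3.74 | 1.57 |
| Trustworthiness | 1 | 7 | 3.97 | 1.45 |
| Distinctive | 1 | 7 | 3.75 | 1.49 |
| Dominant | 1 | 7 | 4.08 | 1.36 |
| Masculine | 1 | 7 | 4.45 | 1.60 |
| Secureness | 1 | 7 | 4.07 | 1.42 |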

Table 3: Respondent Ratings of Photographs (n=15,669)

Results

Facial Traits Influence on Perceptions of Leadership

We tested four competing models of perceived leadership and found that a collection of facial traits are the best predictors. Multilevel general linear models were fit and compared using the `rstanarm`, `stan`, and `brms` packages available in R (Bürkner, 2021; Goodrich et al., 2020). Multilevel modeling is advantageous because it allows researchers to model the variation among respondents within the data explicitly. Variation is likely to exist between respondents as to how they respond to the survey prompts due to some unmeasured aspects of the individuals. Multilevel modeling allows us to account for this likely variation (Hox, 1994; McElreath, 2020).¹
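One way to write a varying-intercept model of the kind described above is sketched below. The Gaussian likelihood, the single respondent-level intercept, and all symbols are assumptions made for illustration; the text does not give the exact specification, and the priors are left unspecified here for the same reason.

```latex
\begin{aligned}
y_{ij} &\sim \mathrm{Normal}(\mu_{ij}, \sigma) \\
\mu_{ij} &= \alpha + u_j + \sum_{k=1}^{K} \beta_k \, x_{kij} \\
u_j &\sim \mathrm{Normal}(0, \tau)
\end{aligned}
```

Here $y_{ij}$ is the leadership rating that respondent $j$ gives photograph $i$, $x_{kij}$ is that respondent's rating of the $k$-th predictor for the same photograph, $u_j$ is the respondent-specific deviation from the overall intercept $\alpha$, and $\tau$ captures how much respondents differ from one another.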

The first model estimates facial trait parameters; the second model estimates parameters for the other physically observable traits; the third model estimates parameters for respondent demographics; the fourth model estimates parameters for respondent personality traits. We generated a correlation matrix (Appendix Figure 1) to explore the bivariate relationships among these perceptions of facial traits. Results in this matrix show minimal overlap in facial trait ratings (strongest between attractiveness and trustworthiness, and dominance and secureness), where each facial trait measure captures a distinct dimension of perceived qualities of character and mind. The analysis also shows that masculinity was not correlated with the other facial features assessed. Further, scree plots and exploratory factor analyses confirmed that each facial trait measures a distinct aspect of imputed qualities of character and mind.

Weakly informative priors were used to fit all four models.² Weakly informative priors have little influence on the final parameter estimates with as large a sample size as we use in our analysis. These priors help constrain parameters to reasonable ranges in Bayesian modeling (McElreath, 2020; Mourtgos et al., 2021; Van de Schoot & Depaoli, 2014). All models were fit via Hamiltonian Monte Carlo (HMC) sampling. We computed ten thousand iterations to estimate the posterior distribution for each model.

Table 4 reports the mean, standard deviation, and 95% credible intervals for each parameter’s distribution in each model.³ The analytic focus of this table is the Widely Applicable Information Criterion (WAIC) score,⁴ a statistic that estimates the predictive capability of each model, rather than the statistical significance of individual parameters within each model. Models with lower WAIC scores are interpreted as having higher predictive capability.
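To make the WAIC score concrete, the sketch below computes it from a matrix of pointwise log-likelihoods, using the standard definition (log pointwise predictive density minus the summed posterior variance of the log-likelihood, on the deviance scale). It is an illustration only: the function name and the toy inputs are hypothetical, and the study's values come from its R packages rather than from this code.

```python
import math
from statistics import fmean, variance


def waic(log_lik):
    """WAIC on the deviance scale (lower values indicate better expected prediction).

    log_lik[s][i] -- log-likelihood of observation i under posterior draw s;
                     at least two draws are required.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        peak = max(column)
        # Log of the mean likelihood across draws, via a stable log-sum-exp.
        lppd += peak + math.log(fmean([math.exp(v - peak) for v in column]))
        # Sample variance of the log-likelihood across draws.
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)


# Toy inputs: two posterior draws, two observations (values are invented).
weaker_model = [[-1.0, -2.0], [-1.0, -2.0]]
stronger_model = [[-0.5, -1.0], [-0.5, -1.0]]
```

Comparing `waic(weaker_model)` with `waic(stronger_model)` shows the direction of the scale: the model that assigns higher likelihood to the observed data receives the lower score.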

When comparing WAIC scores for each of the four models, we find that Model 1 (the facial trait model) features a substantially higher predictive capability than any of the other models, thereby establishing that facial traits better predict perceptions of leadership than other observed physical traits or the survey respondents’ characteristics5. Model 2 has higher predictive capability than Models 3 and 4, which are similar in their predictive capabilities. We also estimated a combined ("kitchen-sink") model that includes the parameters from all four individual models (not reported in the table). Even in this combined model, the posterior means for the facial trait parameters do not change substantially. Further, the WAIC score for the kitchen-sink model provides only a small relative improvement in predictive capability (difference of 251.7). Taken in combination, these results indicate that little additional information is being added to leadership evaluations beyond facial trait assessments.
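The model comparison described above can be reproduced arithmetically from the WAIC scores reported in Table 4. The short Python sketch below ranks the four models and reports each model's gap to the best-scoring one; the dictionary keys and the helper name `rank_by_waic` are labels chosen here for illustration.

```python
# WAIC scores as reported in Table 4 (lower indicates higher predictive capability).
waic_scores = {
    "Model 1: Imputed Traits": 48548.7,
    "Model 2: Observable Traits": 50578.3,
    "Model 3: Subject Demographics": 51474.6,
    "Model 4: Subject Self-Assessed Personality": 51475.5,
}


def rank_by_waic(scores):
    """Return (name, waic, gap_to_best) tuples sorted from best to worst."""
    best = min(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, value, round(value - best, 1)) for name, value in ranked]


for name, value, gap in rank_by_waic(waic_scores):
    print(f"{name}: WAIC = {value:,.1f}, gap to best = {gap:,.1f}")
```

The gaps make the ordering in the text visible: Model 2 trails Model 1 by roughly two thousand WAIC points, while Models 3 and 4 are separated from each other by less than one point.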

Accordingly, Model 1 is retained, with Figure 1 plotting posterior means, 50% credible intervals, and 95% credible intervals6. Except for perceived dominance, no facial trait parameter has a 95% credible interval that includes zero, indicating that there is a greater than 95% probability that the perceived facial traits of attractiveness, trustworthiness, distinctiveness, masculinity, and secureness predict inferred leadership ability. It is noteworthy that the distinctiveness and dominance traits have negative coefficients, while the attractiveness, trustworthiness, masculinity, and secureness traits are all positive.

Model 1: Imputed Traits from Facial Images (WAIC = 48,548.7)

| Parameter | M | SD | L95% | U95% |
| --- | --- | --- | --- | --- |
| Attractive | .11 | .01 | .09 | .13 |
| Trustworthy | .26 | .01 | .24 | .28 |
| Distinctive | -.04 | .01 | -.06 | -.03 |
| Dominant | -.02 | .01 | -.03 | .00 |
| Masculine | .07 | .01 | .05 | .08 |
| Secure | .18 | .01 | .16 | .20 |
| Sigma | 1.12 | .01 | 1.11 | 1.13 |
| Intercept | 2.13 | .07 | 2.00 | 2.27 |

Model 2: Observable Traits from Facial Images (WAIC = 50,578.3)

| Parameter | M | SD | L95% | U95% |
| --- | --- | --- | --- | --- |
| Female Yes | .39 | .03 | .33 | .46 |
| Facial Hair | -.10 | .03 | -.16 | -.04 |
| Glasses | -.23 | .04 | -.30 | -.15 |
| Smile ½ | .35 | .04 | .28 | .42 |
| Smile Full | .43 | .02 | .38 | .47 |
| Non-white | .16 | .03 | .11 | .21 |
| Bald | .01 | .06 | -.11 | .12 |
| Sigma | 1.20 | .01 | 1.18 | 1.21 |
| Intercept | 4.11 | .04 | 4.04 | 4.18 |

Model 3: Subject Demographics (WAIC = 51,474.6)

| Parameter | M | SD | L95% | U95% |
| --- | --- | --- | --- | --- |
| Gender Female | -.19 | .07 | -.33 | -.04 |
| Gender Other | -.88 | .29 | -1.45 | -.30 |
| Race Black | .10 | .11 | -.11 | .30 |
| Race Native Amer. | -.33 | .41 | -1.15 | .45 |
| Race Asian | -.18 | .13 | -.44 | .08 |
| Race Hispanic | .03 | .11 | -.18 | .23 |
| Race Pacific Is. | .12 | .40 | -.68 | .89 |
| Race Other | .17 | .29 | -.39 | .73 |
| Republican | -.05 | .10 | -.23 | .15 |
| Independent | -.26 | .09 | -.44 | -.09 |
| Other Party | .09 | .24 | -.38 | .57 |
| Sigma | 1.23 | .01 | 1.22 | 1.25 |
| Intercept | 4.57 | .08 | 4.40 | 4.73 |

Model 4: Subject Self-Assessed Personality (WAIC = 51,475.5)

| Parameter | M | SD | L95% | U95% |
| --- | --- | --- | --- | --- |
| Open | .02 | .03 | -.04 | .08 |
| Self-Discipline | .06 | .03 | .01 | .11 |
| Extrovert | .03 | .02 | -.01 | .07 |
| Warm | .07 | .03 | .02 | .13 |
| Anxious | .01 | .02 | -.03 | .04 |
| Sigma | 1.23 | .01 | 1.22 | 1.25 |
| Intercept | 3.43 | .21 | 3.02 | 3.85 |

Table 4: Competing Models of Perceived Leadership

Figure 1

Predicting Promotions in the Blind

Can perceptions of leadership ability, which the first study phase shows are substantially driven by facial traits, lead to the identification of promotions simply from assessing police academy archival photographs? Simply put, is there evidence that police promotions are influenced by how candidates look? The prior analysis established that facial traits drive perceptions of leadership ability above and beyond any other measured characteristics. Here, we take the next important step and investigate whether perceptions of leadership ability have practical import.

To proceed with the second phase, we used a dual-frame experiment to test the relationship between leadership evaluations and real-world promotional outcomes. At the time of recruit photograph collection, 70 (13.8%) of the pictured police recruits had been promoted to sergeant, while 44 (8.7%) had been promoted to the rank of lieutenant (144 total promotions out of 507 original police recruits, or 28.4%). Our stratified sample (n=98) slightly overrepresents lieutenant and sergeant photographs (18.4% each) to ensure enough variation in the resulting dyads. Each dyad contained two photographs of cadets who differed in their eventual promotional success (i.e., a lieutenant and a sergeant, or a lieutenant and an unpromoted officer).

We estimated three separate multilevel Bayesian logistic regression models in phase 2. As in phase 1 of our analysis, weakly informative priors were used, and 10,000 MCMC iterations were run for each model7. The first model assessed whether leadership assessments can accurately predict promotions across all promotional queries; the second covered queries pairing unpromoted officers with officers promoted to the rank of lieutenant; the third covered queries pairing officers promoted to the rank of sergeant with officers promoted to the rank of lieutenant. We controlled for image quality in all models because of the timespan (1988-2008) and varied quality of the images (see data section). All three models are reported in Table 5, which is focused on recovering the area under the curve (AUC) for each model. We discuss AUC and its interpretation at length below.

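| Model / Parameter | M | S.E. | 95% CI (lower) | 95% CI (upper) | IRR | AUC |
| --- | --- | --- | --- | --- | --- | --- |
| All Promotion Choices (n=5739) | | | | | | .70 |
| Intercept | -.82 | .10 | -1.02 | -.62 | .43 | |
| Leader | .24 | .02 | .20 | .28 | 1.27 | |
| Image Quality | .15 | .06 | .03 | .26 | 1.16 | |
| No Promotion vs. Lt. (n=2884) | | | | | | .67 |
| Intercept | -.35 | .13 | -.60 | -.10 | .70 | |
| Leader | .18 | .03 | .12 | .23 | 1.19 | |
| Image Quality | -.04 | .09 | -.22 | .13 | .96 | |
| Sgt. vs. Lt. (n=2855) | | | | | | .70 |
| Intercept | -1.32 | .15 | -1.61 | -1.04 | .27 | |
| Leader | .29 | .03 | .24 | .35 | 1.34 | |
| Image Quality | .42 | .08 | .26 | .59 | 1.53 | |

Table 5: Predicting Promotional Choice Models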

All Promotional Choices. When assessing all promotional choices (n=5739), both good leader perceptions (M = .24 [.20, .28]) and image quality (M = .15 [.03, .26]) have positive relationships with correct promotional choices. We exponentiated the coefficients for ease of interpretation: for every one-unit increase on the good leader assessment Likert scale, the odds of a correct promotional choice increase by 27%, holding image quality constant. This finding suggests that as a respondent’s perception of a pictured officer’s leadership ability increases, it becomes easier for that respondent to correctly guess whether the officer was promoted or not. However, we are more interested in evaluating the selection accuracy of the respondents’ predictions. To do so, we use the Area Under the Curve (AUC) as a metric. AUC estimates the probability that respondents correctly distinguish those who were promoted from those who were not. As described in more detail below, estimating the AUC lets us determine how well respondents differentiate between the two groups and thus infer the selection accuracy of their responses.

To observe the kind of predictions the multilevel model is making at the level of individual respondents, we plotted 100 random samples from the data set against the fitted marginal effect line. There is a clear linear relationship where higher leadership ratings lead to greater mean predictions of promotion, and vice versa. In Appendix Figure 3 we visualize the predicted means of promotional choice. As is evident, there is rather little heterogeneity around the mean marginal effect line.

Finally, we determined how well the model classifies correct promotional choices using an Area Under the Curve (AUC) measure. AUC is a commonly used statistical measure for evaluating the performance of binary classification models. AUC is not a measure of the proportion of correct guesses, but rather a measure of discrimination: it tests the ability to correctly distinguish between the two classes. AUC has an advantage over the more basic “correct classification rate” method because it does not depend on the balance of classes in the outcome variable. In other words, AUC is valuable because it considers all possible classification thresholds and is not affected by an imbalanced class distribution, which can skew the results of the “correct classification rate” measure.

AUC ranges between 0 and 1, where an AUC of 1 represents perfect classification. A value of 0.50 means that the model does not classify better than chance. A good model should have an AUC score higher than 0.50. Often, a score higher than .80 is desired, but this threshold is arbitrary and relative, and describing model quality depends on what one is measuring. In the current data, the target response is a correct promotional guess. As per the model’s predictions, if we randomly pick one promotional guess from the “promoted” group and one from the “not promoted” group, the guess with the higher predicted probability of being an actual promotion should be the one from the promoted group. The AUC is the probability that this ordering holds, and this is the principle on which the measure is based.
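The pairwise principle described above can be sketched directly in Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the reported AUC values: the function name `pairwise_auc` and the predicted probabilities are hypothetical, and ties are counted as half a correct ordering, which is the usual convention.

```python
def pairwise_auc(scores_positive, scores_negative):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Each score is a model-predicted probability. A pair counts as 1 when the
    positive case has the higher score, 0.5 when the scores tie, 0 otherwise.
    """
    correct = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities for the two groups.
promoted = [0.9, 0.7, 0.6, 0.4]
not_promoted = [0.8, 0.5, 0.3, 0.2]
print(pairwise_auc(promoted, not_promoted))
```

Because the calculation only compares the ordering of scores within pairs, the result does not depend on how many cases fall in each group, which is the property that makes AUC robust to class imbalance.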

The AUC for the current model is 0.70, indicating that respondents predict correct promotional choices at levels well above chance when making assessments based on leadership perceptions derived from observed facial characteristics. This finding is important given that promotions in policing are ostensibly based on time-in-service, skills, reputations, and evaluations in assessment centers – all factors that were unknowable at the time recruits were photographed, and completely unknown to respondents evaluating those photographs.

No Promotion and Lieutenant Promotion. When assessing officers who were never promoted versus officers promoted to lieutenant (n=2884), good leader perceptions (M = .18 [.12, .23]) have a positive relationship with correct promotional choices. Image quality (M = -.04 [-.22, .13]) does not influence correct promotional choices: its 95% credible interval includes zero. Exponentiating the results, for every one-unit increase on the good leader assessment Likert scale, the odds of a correct promotional choice in the no promotion versus lieutenant promotion cases increase by 19%, holding image quality constant. The AUC score for this model is 0.67. Appendix Figure 4 visualizes this model's predicted means of the promotional response, including 100 random samples from the data set against the fitted marginal effect line.

Sergeant Promotion and Lieutenant Promotion. When assessing officers who were promoted to the rank of sergeant versus officers promoted to the rank of lieutenant (n=2855), both perceptions of good leadership (M = .29 [.24, .35]) and image quality (M = .42 [.26, .59]) have positive relationships with correct promotional choices. When we exponentiate the results, for every one-unit increase on the good leader assessment Likert scale, the odds of a correct promotional choice in the sergeant promotion versus lieutenant promotion cases increase by 34%. The AUC score for this model is 0.70. Appendix Figure 5 visualizes the predicted means of the promotional response for this model, including 100 random samples from the data set against the fitted marginal effect line.
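The odds interpretations reported for the three models follow from exponentiating the logistic regression coefficients. The Python sketch below applies that step to the "Leader" coefficients from Table 5; the helper names are assumptions introduced here. Because the tabled coefficients are rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.

```python
import math

# "Leader" coefficients (log-odds scale) as reported in Table 5.
leader_coefficients = {
    "All promotion choices": 0.24,
    "No promotion vs. Lt.": 0.18,
    "Sgt. vs. Lt.": 0.29,
}


def odds_ratio(log_odds):
    """Exponentiate a logistic regression coefficient into an odds ratio."""
    return math.exp(log_odds)


def percent_change_in_odds(log_odds):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(log_odds) - 1.0) * 100.0


for model, coefficient in leader_coefficients.items():
    print(
        f"{model}: odds ratio = {odds_ratio(coefficient):.2f}, "
        f"change in odds = {percent_change_in_odds(coefficient):.0f}%"
    )
```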

Discussion

Clearly, it matters whether your face fits. The same facial traits that generate positive perceptions about leadership also predict professional promotion in a municipal police setting. When presented with static images of police cadets, respondents predict correct promotional choices at levels well above chance as measured by an AUC score of .70. Our findings support previous research in other contexts, and they have noteworthy implications for theories of meritocratic public service and the practice of policing. However, the findings are limited by a lack of professional performance data and would require extensive testing in other policing contexts, particularly in non-US and non-Western contexts, to better understand generalizability of the results.

The two phases of our study extend the facial traits literature into the critical context of policing promotions. In the first study phase we establish that, when presented with a photograph of entry-level police officers, survey subjects show little difficulty in cueing in on facial traits to impute qualities of mind and character, or in assessing who is and who is not likely to be a good leader. In the second phase, we test whether those leadership assessments have a better-than-random chance of predicting which officers would be promoted some 20-odd years later. Respondents correctly guess promotional success based on static facial traits alone to a degree that is not explainable by chance. This said, there is some noteworthy variance among the dimensions of the facial traits model. For example, the trait of trustworthiness appears to have the most impact on assessing leadership potential, whereas the trait of distinctiveness has a negative effect; dominance does not contribute to those assessments in any meaningful way.

The relationship between leadership perceptions and promotion predictions is strong and linear. This relationship is striking, given the effort that policing institutions undertake to ensure they can make merit-based decisions when choosing between promotion-eligible candidates (Savery, 1994). Indeed, agencies commonly use long batteries of promotional tests, including varieties of written assessments, long-form interviews with internal and external stakeholders, reviews of candidates’ work quality, and even role-playing scenarios to assess reactions “in the moment.” Agencies put forth this effort because the consequences of bad choices in police leadership roles are relatively high (Drew & Saunders, 2020; Savery, 1994; Shjarback & Todak, 2019).

Our findings have significant implications for both the police and public service as a whole. Facial features, which are largely outside police officers' and public employees' control, exert a demonstrable influence on future professional prospects. Though the role of performance is not quantified in this study, and enough unexplained variation exists to confidently assert performance has a role, our results should give rise to concern. This preference for inherent physical characteristics runs counter to the presumed equity that undergirds public service recruitment: merit-based recruitment, impartial evaluation, and advancement based on job-related criteria and temperament are all fundamental to fairness in personnel administration. However, the above finding, that facial features are a primary sorting device, undermines the notion that the public sector workplace serves as a model for equitable human resource practices.

High prediction rates of promotion based on facial traits alone raise questions about the fairness of police promotional systems. However, unexplained variation in our models leaves room for police officer performance to impact promotions, and there may be plausible causal links between looking good and performing well, a possibility we return to in later discussion. It is not enough that meritocratic systems achieve fairness – they must be perceived as fair to achieve the goals of a well-functioning workplace (Adams, 1965; Colquitt, 2001). Perceptions of fairness in the police workplace are critical (Wolfe & Lawson, 2020). Undercutting the perceptions of a meritocratic promotional system could well result in further negative outcomes. First, a lack of meritocratic balance in promotions could hinder evidence-based policing (EBP): a core assumption of the evidence-based policing movement is that implementing scientifically based policy will result in better policing (Lum & Koper, 2017). Indeed, scholars broadly share the belief that having highly skilled practitioners inside a police agency is a critical step in advancing the evidence-based policing paradigm (Sherman, 2015). However, this presumption of maximal competence relies on the assumption that those most capable of applying EBP principles will be in a position to do so. If advancement in an agency is, in fact, more reliant on facial traits than on genuine merit, then achieving the goals of EBP may well be more problematic.

Second, failure to promote on merit (as opposed to “if a face fits”) is likely to generate detrimental organizational issues: prior research finds that when talented officers sense that they will not be promoted in rank, they are more likely to leave police service in search of more rewarding work (Boag-Munroe et al., 2017). Moreover, failure to achieve promotion can lead to rifts among officers, estrangement from managers, and disengagement with their agencies (Savery, 1994). Perceived fairness is also part of this equation: fairness in promotion outcomes is a major component of organizational justice (Adams, 1965). Indeed, a recently published meta-analysis demonstrates that, across a number of studies, organizational justice perceptions “have a sizeable effect on criminal justice employees’ work outcomes” (Wolfe & Lawson, 2020, p. 630). For example, low perceptions of organizational justice are linked to lower levels of procedural justice being observed when dealing with citizens (Tankebe, 2014), while higher levels of procedural justice are associated with fewer civilian complaints, fewer initial internal affairs investigations, and fewer disciplinary charges (Wolfe & Piquero, 2011).

Police executives have warned of a retention crisis in policing for at least the last decade now (Linos et al., 2017). However, in the immediate aftermath of the protests against policing in 2020, scholars have found that voluntary resignations have risen approximately 280% over expected levels (Mourtgos et al., 2022). Given the widely accepted evidence that having more uniformed police officers available for duty is associated with reductions in violent crime, law enforcement agencies and the communities they serve should be motivated to craft policy that ensures promotional systems are as meritocratic as possible, and that advancement in the profession is not just a matter of “looking” like a leader.

There is nonetheless some promising news. Our finding that facial characteristics are the most explanatory collection of variables in evaluating leadership potential also yielded a positive result in relation to race and gender. A long line of sound research has demonstrated the structural biases that have long plagued policing, with particularly pernicious effects on the advancement of women and racial minorities in the law enforcement profession (Drew & Saunders, 2020; Martin, 1982; Rabe-Hemp, 2008; Rief & Clinkinbeard, 2020; Shjarback & Todak, 2019). The challenges of career achievement, stability, and recruitment are inextricably linked, and inequities in these areas of police employment persist for women and racialized minorities (Linos, 2018; Schuck, 2014, 2021).

Our findings do not suggest that women and racial minorities no longer suffer inequities in police promotional practices. Scholars point to convincing evidence that women in policing are delaying or forgoing promotional opportunities (Todak et al., 2021), and these challenges are more pronounced for non-white women (Todak & Brown, 2019). However, our findings contribute to this research by demonstrating that facial trait characteristics are not the primary drivers of well-established inequities in policing careers. In Table 4 earlier, the second model shows that survey participants rated facial images of women and non-white cadet officers as having higher leadership potential than both men and white cadets. These findings help narrow down the potential mechanisms for implicit bias that might contribute to unequal promotional opportunities during officers’ careers. As such, they are noteworthy and are deserving of a fuller discussion than can be offered here; accordingly, these results will be the subject of a detailed follow-up study.

Limitations and Future Research

Our findings are not without limitations. First, while the method used here has high internal validity, there are clear limits to generalizing the findings. For example, while survey respondents in our study were presented with static photographs of officers, that is not how those same officers would be perceived outside of laboratory conditions. Simply put, officers act and are perceived in all three dimensions. There are likely differences between the photographic representation and in-person presentation. It is unknown at this time exactly how those contextual differences might interact with perceptions of facial traits and, thereafter, leadership potential. Future work in this area might consider short video vignettes designed to capture a more “real” picture of the officer.

Another limitation derives from differences in data collection in our study compared to some earlier studies. Earlier studies had access to personal and background characteristics of the people featured in pictures. For example, Mazur et al. (1984) collected and analyzed background information on pictured West Point cadets, including their family history in the military, their religious affiliation, and the types of postings they undertook during their military service. We have no such background information on the pictured police cadets, so we could not explore correlations between any of those background characteristics and the outcomes studied here.

We are also limited by a lack of individual police officer performance data, and only observe career outcomes. A lack of performance data means that while our study can establish the causal link between facial traits and police promotions, questions about how exactly that mechanism works must be left unanswered at this time (see Rockey et al., 2022 for discussion on linking performance data and facial traits studies). One possible factor relating to police performance is the possibility that facial appearance, particularly the appearance of being highly trustworthy or attractive, may be functionally related to higher professional performance. This is not to suggest that appearance is the sole determinant of competence or that it is impossible for someone who is unattractive to be competent. Rather, appearance may be correlated with certain behaviors or abilities that contribute to professional success. For example, an officer who is perceived as trustworthy may be more effective at generating victim trust and/or suspect confessions, work outcomes which could lead to their advancement within the organization.

Linking performance and facial traits to promotional outcomes might be particularly relevant in the policing profession, where interpersonal interactions play a significant role and appearance may influence trust-building. Scholars have long noted that across various contexts facial attractiveness is associated with more favorable outcomes (Rhodes, 2006). This relationship exists across professional contexts (Sala et al., 2013), though to our knowledge research in the police context remains scant, presenting an opportunity for much further research involving a range of police settings. Despite this possibility, it must be noted that research has clearly shown the human instinct to quickly judge faces is a heuristic that too often produces inaccurate judgments (Todorov et al., 2015).

As we noted above, our findings cannot be interpreted as showing that the institution of policing provides minority race and gender promotional candidates an equitable playing field. Indeed, while attractiveness benefits both men and women in terms of occupational prestige and socio-economic outcomes over their lifetimes (Sala et al., 2013), in the specific context of policing women and non-white candidates have historically fared poorly in comparison to their white, male colleagues (Todak & Brown, 2019). Put simply, we can only remark that in this specific experiment, we found that respondents perceived women and non-white officers as having higher leadership potential than white male officers, and that those perceptions were unusually accurate in predicting future promotional success. In other words, our finding implies that facial characteristics are not the locus of well-established professional biases against minoritized police officers in the workplace. A related limitation is that our study was conducted in the US, with US officers rated by US respondents. There is strong empirical reason to suspect that these results would not generalize across countries and cultures, as facial emotive characteristics, such as emotional labor, have been shown to lack measurement invariance across cultural contexts (Mastracci & Adams, 2019). Our findings should compel further questions about how policing in non-US contexts, and particularly non-Western cultural contexts, experiences this phenomenon.

A final limitation concerns an implicit assumption of the method presented here. We assume that the 500+ sampled survey respondents taking part in the Qualtrics panel are subject to the same heuristic biases as those responsible for making promotional decisions within police departments. In a more-perfect experiment, the survey would be administered to a random selection of high-level police executives who hold actual power over law enforcement promotional opportunities. We agree that such a study would produce valuable insight. However, this is not a strong objection to the current study, as we rely on a core argument of Kahneman’s theory (2011): that these identifiable heuristics are common to humankind across widely varying cultural and professional contexts. The broad portability of these heuristics is what makes them so fascinating and valuable: they help uncover all too common errors in judgment that overwhelm even the most highly trained and skilled professionals, let alone novices (Sanchez & Dunning, 2018). Further cushioning against this possible weakness is that the very naivete of our sample – a cross-section of adults without substantial professional experience – should theoretically result in basically random predictions of future promotional success in professional policing capacities they have never experienced themselves. Instead, as was observed in a long line of research in other contexts, we find that quickly formed decisions about a police cadet's face strongly predict later career promotion. We believe that the much better-than-random predictions from this non-professional population are a strong argument for generalizing our findings outside of the police-specific context.

Conclusions

We have reproduced, in an important new setting, the findings of a classic set of experiments first published over thirty years ago. We have extended earlier insights into the non-meritocratic foundations of military promotion to the related field of policing. Despite the theoretically strong incentives to promote the “right” candidates in policing, we have demonstrated that some facial characteristics outside the control of officers are read as signs of leadership capability. Further, those same inferences strongly predict later promotion to sergeant and lieutenant. In fact, our respondents – people who were not police officers and were not aware they were viewing police cadet photographs – made leadership distinctions based on photographs alone, and those distinctions lined up with actual promotions of the pictured recruits at a rate greater than chance alone could explain.

Over 2400 years ago, Plato (2000) warned against selecting leaders based on their appearance alone. Much later, having carried out early experiments (Peirce & Jastrow, 1884) on the human propensity for guessing correctly more often than chance alone could explain, researchers (Peirce, 1929, pp. 281–282) noted, “…we often derive from observation strong intimations of truth, without being able to specify what were the circumstances we had observed which conveyed those intimations.” Today, people in all walks of life show little difficulty believing that they can read faces and tell the difference between trustworthy and deceptive individuals—be they military officers, candidates for public office, captains able to navigate rough waters, or leaders of their law enforcement agencies. Our collective faith in what Kahneman (2011) calls “fast thinking” is such that we are prone to mistakenly rely on our intuition about others’ trustworthiness. Our findings offer fair warning: Police leadership selection processes must persist in making use of multiple forms of testing, as well as be open to new testing procedures helping them become more capable of identifying the best leaders available to the agency. Doing so can guard against the promotion of candidates who have a face that fits, but lack the aptitude, strength of character, and dedication to task required to shoulder their leadership burdens.

References

Adams, J. S. (1965). Inequity In Social Exchange. In L. Berkowitz (Ed.), Advances in Experimental Social Psychology (Vol. 2, pp. 267–299). Academic Press. https://doi.org/10.1016/S0065-2601(08)60108-2

Antonakis, J., & Dalgas, O. (2009). Predicting elections: Child’s play! Science, 323(5918), 1183–1183. https://doi.org/10.1126/science.1167748

Archbold, C. A., & Schulz, D. M. (2008). Making Rank: The Lingering Effects of Tokenism on Female Police Officers’ Promotion Aspirations. Police Quarterly, 11(1), 50–73. https://doi.org/10.1177/1098611107309628

Arthur Jr., W., Day, E. A., McNelly, T. L., & Edens, P. S. (2003). A Meta-Analysis of the Criterion-Related Validity of Assessment Center Dimensions. Personnel Psychology, 56(1), 125–153. https://doi.org/10.1111/j.1744-6570.2003.tb00146.x

Asch, S. E. (1946). Forming impressions of personality. Journal of Abnormal and Social Psychology, 41, 258–290.

Banducci, S. A., Karp, J. A., Thrasher, M., & Rallings, C. (2008). Ballot Photographs as Cues in Low-Information Elections. Political Psychology, 29(6), 903–917. https://doi.org/10.1111/j.1467-9221.2008.00672.x

Bauer, N. M., & Carpinella, C. (2018). Visual Information and Candidate Evaluations: The Influence of Feminine and Masculine Images on Support for Female Candidates. Political Research Quarterly, 71(2), 395–407. https://doi.org/10.1177/1065912917738579

Belur, J., Tompson, L., Thornton, A., & Simon, M. (2021). Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making. Sociological Methods & Research, 50(2), 837–865. https://doi.org/10.1177/0049124118799372

Berggren, N., Jordahl, H., & Poutvaara, P. (2010). The looks of a winner: Beauty and electoral success. Journal of Public Economics, 94, 8–15. https://doi.org/10.1016/j.jpubeco.2009.11.002

Berinsky, A. J., Chatfield, S., & Lenz, G. (2019). Facial dominance and electoral success in times of war and peace. The Journal of Politics, 81(3), 1096–1100. https://doi.org/10.1086/703384

Berry, W. D., & Feldman, S. (1985). Multiple regression in practice. Sage.

Bittner, E. (1990). Florence Nightingale in Pursuit of Willie Sutton. In Aspects of Police Work. Northeastern University Press.

Boag-Munroe, F., Donnelly, J., van Mechelen, D., & Elliott-Davies, M. (2017). Police Officers’ Promotion Prospects and Intention to Leave the Police. Policing: A Journal of Policy and Practice, 11(2), 132–145. https://doi.org/10.1093/police/paw033

Boas, T. C., Christenson, D. P., & Glick, D. M. (2020). Recruiting large online samples in the United States and India: Facebook, Mechanical Turk, and Qualtrics. Political Science Research and Methods, 8(2), 232–250. https://doi.org/10.1017/psrm.2018.28

Bull, R., Jenkins, M., & Stevens, J. (1983). Evaluations of politicians’ faces. Political Psychology, 4(4), 713–716.

Bürkner, P.-C. (2021). Bayesian item response modeling in R with brms and Stan. Journal of Statistical Software, 100(5), 1–54. https://doi.org/10.18637/jss.v100.i05

Chacón Moscoso, S., Anguera Argilaga, M. T., Sanduvete Chaves, S., Losada López, J. L., & Portell Vidal, M. (2019). Methodological quality checklist for studies based on observational methodology (MQCOM). Psicothema, 31(4), 458–464.

Chiao, J. Y., Bowman, N. E., & Gill, H. (2008). The political gender gap: Gender bias in facial inferences that predict voting behavior. PLOS ONE, 3(10). https://doi.org/10.1371/journal.pone.0003666

Colquitt, J. A. (2001). On the dimensionality of organizational justice: A construct validation of a measure. Journal of Applied Psychology, 86(3), 386–400. https://doi.org/10.1037/0021-9010.86.3.386

Drew, J. M., & Saunders, J. (2020). Navigating the police promotion system: A comparison by gender of moving up the ranks. Police Practice and Research, 21(5), 476–490. https://doi.org/10.1080/15614263.2019.1672290

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

Ferguson, H. S., Owen, A., Hahn, A. C., Torrance, J., DeBruine, L. M., & Jones, B. C. (2019). Context-specific effects of facial dominance and trustworthiness on hypothetical leadership decisions. PLOS ONE, 14(7), e0214261. https://doi.org/10.1371/journal.pone.0214261

Fruhen, L., Watkins, C. D., & Jones, B. C. (2015). Perceptions of facial attractiveness, dominance and trustworthiness predict managerial pay awards in experimental tasks. The Leadership Quarterly, 26(6), 1005–1016. https://doi.org/10.1016/j.leaqua.2015.07.001

Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

Goodrich, B., Gabry, J., Ali, I., & Brilleman, S. (2020). rstanarm: Bayesian applied regression modeling via Stan. https://mc-stan.org/rstanarm

Hox, J. J. (1994). Hierarchical regression models for interviewer and respondent effects. Sociological Methods & Research, 22(3), 300–318. https://doi.org/10.1177/0049124194022003002

Kahneman, D. (2011). Thinking, fast and slow. FSG Books.

Kahneman, D., & Tversky, A. (1982). The Psychology of Preferences. Scientific American, 246(1), 160–173.

Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341–350.

Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.

Lawson, C., Lenz, G. S., Baker, A., & Myers, M. (2010). Looking like a winner: Candidate appearance and electoral success in new democracies. World Politics, 62(4), 561–593. https://doi.org/10.1017/S00438871

Linke, L., Saribay, S. A., & Kleisner, K. (2016). Perceived trustworthiness is associated with position in a corporate hierarchy. Personality and Individual Differences, 99, 22–27. https://doi.org/10.1016/j.paid.2016.04.076

Linos, E. (2018). More Than Public Service: A Field Experiment on Job Advertisements and Diversity in the Police. Journal of Public Administration Research and Theory, 28(1), 67–85. https://doi.org/10.1093/jopart/mux032

Linos, E., Reinhard, J., & Ruda, S. (2017). Levelling the playing field in police recruitment: Evidence from a field experiment on test performance. Public Administration, 95(4), 943–956.

Little, A. C., Burriss, R. P., Jones, B. C., & Roberts, S. C. (2007). Facial appearance affects voting decisions. Evolution and Human Behavior, 28(1), 18–27. https://doi.org/10.1016/j.evolhumbehav.2006.09.002

Lovrich, N. P., Benjamin, F., & Simon, C. A. (2018). Facial image dominance and election outcomes: An analysis of Washington State nonpartisan judicial candidates. Western Political Science Association Annual Meeting, San Francisco, March 29-31, 2018.

Lum, C., & Koper, C. S. (2017). Evidence-based policing. Oxford University Press.

Martin, S. E. (1982). Breaking and Entering: Policewomen on Patrol. University of California Press.

Mastracci, S. H., & Adams, I. T. (2019). Is Emotional Labor Easier in Collectivist or Individualist Cultures? An East–West Comparison. Public Personnel Management, 48(3), 325–344. https://doi.org/10.1177/0091026018814569

Mazur, A. (1985). A biosocial model of status in face-to-face primate groups. Social Forces, 64(2), 377–402.

Mazur, A., Mazur, J., & Keating, C. (1984). Military rank attainment of a West Point class: Effects of cadets’ physical features. American Journal of Sociology, 90(1), 125–150.

Mazur, A., & Mueller, U. (1996). Channel modeling: From West Point cadet to general. Public Administration Review, 56(2), 191–198.

McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (2nd ed.). CRC Press.

McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282.

Mourtgos, S. M., Adams, I. T., & Mastracci, S. H. (2021). Improving victim engagement and officer response in rape investigations: A longitudinal assessment of a brief training. Journal of Criminal Justice, 10.

Mourtgos, S. M., Adams, I. T., & Nix, J. (2022). Elevated police turnover following the summer of George Floyd protests: A synthetic control study. Criminology & Public Policy, 21(1), 9–33. https://doi.org/10.1111/1745-9133.12556

Mueller, U., & Mazur, A. (1996a). Facial dominance of West Point cadets as a predictor of later military rank. Social Forces, 74(3), 823–850. https://doi.org/10.1093/sf/74.3.823

Mueller, U., & Mazur, A. (1996b). Facial dominance of West Point cadets as a predictor of later military rank. Social Forces, 74(3), 823–850. https://doi.org/10.1093/sf/74.3.823

Peirce, C. S. (1929). Guessing. The Hound & Horn, 2(3), 267–285.

Peirce, C. S., & Jastrow, J. (1884). On small differences in sensation. Memoirs of the National Academy of Sciences, 3.

Plato. (2000). Plato: “The Republic.” Cambridge University Press.

Police Executive Research Forum. (2018). Promoting excellence in first-line supervision: New approaches to selection, training, and leadership development.

Rabe-Hemp, C. E. (2008). Survival in an “all boys club”: Policewomen and their fight for acceptance. Policing: An International Journal of Police Strategies & Management, 31(2), 251–270.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). SAGE Publications.

Re, D. E., Hunter, D. W., Coetzee, V., Tiddeman, B. P., Xiao, D., DeBruine, L. M., Jones, B. C., & Perrett, D. I. (2013). Looking Like a Leader–Facial Shape Predicts Perceived Height and Leadership Ability. PLOS ONE, 8(12), e80957. https://doi.org/10.1371/journal.pone.0080957

Rhodes, G. (2006). The evolutionary psychology of facial beauty. Annual Review of Psychology, 57, 199.

Rief, R. M., & Clinkinbeard, S. S. (2020). Exploring Gendered Environments in Policing: Workplace Incivilities and Fit Perceptions in Men and Women Officers. Police Quarterly, 1098611120917942. https://doi.org/10.1177/1098611120917942

Rockey, J. C., Smith, H. M. J., & Flowe, H. D. (2022). Dirty looks: Politicians’ appearance and unethical behaviour. The Leadership Quarterly, 33(2), 101561. https://doi.org/10.1016/j.leaqua.2021.101561

Rosar, U., Klein, M., & Beckers, T. (2008). The frog pond beauty contest: Physical attractiveness and electoral success of the constituency candidates at the North Rhine-Westphalia state election of 2005. European Journal of Political Research, 47(1), 64–79. https://doi.org/10.1111/j.1475-6765.2007.00720.x

Sacco, D. F., & Brown, M. (2018). Preferences for facially communicated big five personality traits and their relation to self-reported big five personality. Personality and Individual Differences, 134, 195–200. https://doi.org/10.1016/j.paid.2018.06.024

Sala, E., Terraneo, M., Lucchini, M., & Knies, G. (2013). Exploring the impact of male and female facial attractiveness on occupational prestige. Research in Social Stratification and Mobility, 31, 69–81. https://doi.org/10.1016/j.rssm.2012.10.003

Sanchez, C., & Dunning, D. (2018). Overconfidence among beginners: Is a little learning a dangerous thing? Journal of Personality and Social Psychology, 114(1), 10–28. https://doi.org/10.1037/pspa0000102

Sandrin, R., Simpson, R., & Gaub, J. E. (2022). An experimental examination of the perceptual paradox surrounding police canine units. Journal of Experimental Criminology. https://doi.org/10.1007/s11292-022-09516-y

Savery, L. K. (1994). Merit-Based Promotion System: Police Officers’ Views. Police Journal, 67(4), 309–322.

Schuck, A. M. (2014). Female Representation in Law Enforcement: The Influence of Screening, Unions, Incentives, Community Policing, CALEA, and Size. Police Quarterly, 17(1), 54–78. https://doi.org/10.1177/1098611114522467

Schuck, A. M. (2021). Motivations for a career in policing: Social group differences and occupational satisfaction. Police Practice and Research, 22(5), 1507–1523. https://doi.org/10.1080/15614263.2020.1830772

Sherman, L. W. (2015). A tipping point for “totally evidenced policing”: Ten ideas for building an evidence-based police agency. International Criminal Justice Review, 25(1), 11–29.

Shjarback, J. A., & Todak, N. (2019). The Prevalence of Female Representation in Supervisory and Management Positions in American Law Enforcement: An Examination of Organizational Correlates. Women & Criminal Justice, 29(3), 129–147. https://doi.org/10.1080/08974454.2018.1520674

Tankebe, J. (2014). The making of ‘democracy’s champions’: Understanding police support for democracy in Ghana. Criminology & Criminal Justice, 14(1), 25–43. https://doi.org/10.1177/1748895812469380

Todak, N., & Brown, K. (2019). Policewomen of color: A state-of-the-art review. Policing: An International Journal, 42(6), 1052–1062. https://doi.org/10.1108/PIJPSM-07-2019-0111

Todak, N., Leban, L., & Hixon, B. (2021). Are Women Opting Out? A Mixed Methods Study of Women Patrol Officers’ Promotional Aspirations. Feminist Criminology, 16(5), 658–679. https://doi.org/10.1177/15570851211004749

Todorov, A., Mandisodza, A. N., Goren, A., & Hall, C. C. (2005). Inferences of competence from faces predict election outcomes. Science, 308(5728), 1623–1626. https://doi.org/10.1126/science.1110589

Todorov, A., Olivola, C. Y., Dotsch, R., & Mende-Siedlecki, P. (2015). Social attributions from faces: Determinants, consequences, accuracy, and functional significance. Annual Review of Psychology, 66, 519–545. https://doi.org/10.1146/annurev-psych-113011-143831

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.

Van de Schoot, R., & Depaoli, S. (2014). Bayesian analyses: Where to start and what to report. The European Health Psychologist, 16(2), 75–84.

Van Vugt, M., & Grabo, A. E. (2015). The many faces of leadership: An evolutionary-psychology approach. Current Directions in Psychological Science, 24(6), 484–489. https://doi.org/10.1177/0963721415601971

Wehner, M. R., Nead, K. T., Linos, K., & Linos, E. (2015). Plenty of moustaches but not enough women: Cross sectional study of medical leaders. BMJ, 351, h6311. https://doi.org/10.1136/bmj.h6311

Whetstone, T. S. (2001). Copping out: Why police officers decline to participate in the sergeant’s promotional process. American Journal of Criminal Justice, 25(2), 147. https://doi.org/10.1007/BF02886842

Wolfe, S. E., & Lawson, S. G. (2020). The organizational justice effect among criminal justice employees: A meta-analysis. Criminology, 58(4), 619–644. https://doi.org/10.1111/1745-9125.12251

Wolfe, S. E., & Piquero, A. R. (2011). Organizational Justice and Police Misconduct. Criminal Justice and Behavior, 38(4), 332–353. https://doi.org/10.1177/0093854810397739

Acknowledgments

We extend our thanks to Dr. Daniel Schiff (Purdue) for his help improving an earlier version of this paper, and Dr. Carl Jenkinson (University of South Carolina) for his quaint friendship and helpful editing of a later draft.

Appendix

Appendix Figure 1

Appendix Figure 2

Appendix Figure 3

Appendix Figure 4

Appendix Figure 5

Appendix Photograph 1 (Task 1, Question 1)

Appendix Photograph 2 (Task 1, Question 2)

Appendix Photograph 3 (Task 2)

Appendix Figure 6
