Spatial heterogeneity in crime analysis

Andresen, M.A., & Malleson, N. (2013). Spatial heterogeneity in crime analysis. In M. Leitner (Ed.), Crime modeling and mapping using geospatial technologies (pp. 3 – 23). New York, NY: Springer.

Published on Jan 01, 2013

Abstract Issues related to the modifiable areal unit problem are well-understood within geography. Though these issues are acknowledged in the spatial crime analysis literature, there is little research that assesses their impact. In fact, much of the cited spatial crime analysis literature that investigates the impact of modified areal units suggests that there is no problem; there is, however, an alternative literature. In this paper, we employ a new area-based spatial point pattern test to investigate the impact of modified areal units on crime patterns. We are able to show that despite the appearance of similarity in a (spatial) regression context, smaller units of analysis do show a high degree of variation within the larger units in which they are nested. Though this result in and of itself is not new, we also quantify how much spatial heterogeneity is present. This quantification is undertaken using multiple crime classifications and in a cross-national comparison.

Keywords Spatial crime analysis, Spatial heterogeneity, Modifiable areal unit problem (MAUP), Point pattern analysis

1. Introduction

Over the past 180 years, the geography of crime literature has moved to ever finer spatial scales of resolution. Beginning with the work of Quetelet (1831, 1842) and Guerry (1833), this literature has moved from French Departments, to counties, towns, neighborhoods and now the street segment (Glyde 1856; Burgess 1916; Shaw and McKay 1931, 1942; Sherman et al. 1989; Weisburd et al. 2004, 2009). The drive for analyses to be undertaken at these ever finer spatial scales is the discovery of significant heterogeneity within smaller spatial units of analysis: there are safe places within bad neighborhoods and dangerous places within good neighborhoods (Sherman et al. 1989).

An obvious question to emerge within this geography of crime literature because of this finding is: what is the appropriate spatial scale of analysis? Indeed, those that advocate for smaller spatial units of analysis state that micro-places are now deemed appropriate whereas larger spatial units of analysis are not (Andresen and Malleson 2010). But how much does this issue really matter? Yes, there may be significant spatial heterogeneity, but does this impact the analysis?1

A small branch of literature has investigated this question. The results most frequently cited show that the choice of the spatial unit of analysis is irrelevant (Land et al. 1990; Wooldredge 2002). Because of this finding, much of the literature that follows has used this as a justification for only analyzing one type of spatial unit (Schulenberg 2003; Bernasco and Block 2009; Matthews et al. 2010; Osgood and Anderson 2004). But is this a reasonable assumption to be made in all contexts? We argue that it is not.

In this paper, we use calls for service and recorded crime data from police forces in two municipalities (one in Canada and another in England) and a similarity-based spatial point pattern test. We are able to show that despite similarities in the results of global analyses, the results are significantly different at alternative spatial scales of analysis. Because of the nature of this spatial point pattern test we are able to show how results change when the spatial unit of analysis is changed. Previous research has investigated this phenomenon, but we explicitly show the results using two different spatial units of analysis: census tracts and dissemination areas in Canada and middle layer super output areas and output areas in England. Specifically, we are able to quantify the spatial heterogeneity within larger units of analysis for multiple crime classifications and in a cross-national comparison: Vancouver, Canada and Leeds, England.

2. Scale and Spatial Crime Analysis

In geography, scale matters: changing the size or shape of the spatial unit under analysis may lead to unexpected and substantial changes in results (Blalock 1964; Clark and Avery 1976; Gehlke and Biehl 1934; Fotheringham and Wong 1991; Openshaw 1984a, 1984b). This is referred to as the modifiable areal unit problem (MAUP). Faced with the MAUP, there are three possible scenarios that may emerge when modifying the spatial units of analysis under study. First, there may be no impact. In other words, the results are identical (or differences are statistically insignificant) at all spatial scales of analysis. This is clearly the ideal situation. Second, there may be a quantitative impact on the results, but the qualitative results are the same. In this situation, the estimated parameters for the variables in an analysis may change (with statistical significance, so bias is present) but those estimated parameters do not change signs (positive to negative, become statistically insignificant, or negative to positive); as such, variables may be thought to have a stronger or weaker relationship with the dependent variable than is actually the case, but the qualitative interpretations are the same. Third, there may be a qualitative impact in the results. If this occurs, the results may lead the researcher to make substantively incorrect statements: rejecting or accepting a theory when they should not, and/or making incorrect statements regarding a policy initiative in an evaluation. This is the worst-case scenario and is the possibility outlined by Fotheringham and Wong (1991).

A related issue emerges when one makes an inference based on an analysis at one spatial scale and applies it to another spatial scale. When the inference is based on a larger spatial unit and applied to a smaller spatial unit, it is referred to as the ecological fallacy (what is true of the whole is not necessarily true of its parts); when the inference is based on a smaller spatial unit of analysis and applied to a larger spatial unit, it is referred to as the atomistic fallacy (what is true of the parts is not necessarily true of the whole). Such problems in inference have been known for a long time and are most often discussed in the context of assigning neighborhood characteristics/relationships to individuals, the ecological fallacy (Robinson 1950). Because of the ecological fallacy, change that occurs at a larger spatial scale may be driven by a small number of the smaller spatial scale units within the larger spatial scale unit. Consequently, there may be variations in the spatial patterns at different scales.
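A small numerical sketch may make this point concrete. The counts below are invented purely for illustration (they are not taken from the Vancouver or Leeds data), and the unit labels are hypothetical. The sketch shows a larger unit whose overall increase is driven by a single nested smaller unit.

```python
# Hypothetical crime counts for five small units nested in one larger unit.
# These numbers are invented for illustration only.
base_counts = {"a": 10, "b": 12, "c": 9, "d": 11, "e": 8}
test_counts = {"a": 9, "b": 11, "c": 9, "d": 10, "e": 31}

# Change observed for the larger unit as a whole.
large_change = sum(test_counts.values()) - sum(base_counts.values())

# Change observed within each of the nested smaller units.
small_changes = {unit: test_counts[unit] - base_counts[unit] for unit in base_counts}

# Smaller units that share the direction of change of the larger unit.
increasing_units = [unit for unit, change in small_changes.items() if change > 0]

print(large_change)
print(small_changes)
print(increasing_units)
```

In this invented case the larger unit increases, yet only one of its five smaller units does; inferring an increase for every smaller unit from the larger unit would be an instance of the ecological fallacy.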

Of course, there may be limitations in the geography of crime when it comes to the choice of spatial scale that are beyond the control of the researcher. For example, when using census data, issues of confidentiality may arise that lead to missing data values and preclude the analysis at a particular geography: Andresen (2006) was unable to undertake an analysis at a smaller spatial scale because almost 25 percent of the census boundary units were missing data because of confidentiality issues.2 Additionally, there may be a number of factors in a decision-making process for research that lead to the use of only one spatial scale. First, data availability may prevent the use of multiple scales of analysis; because of confidentiality concerns, a police department may only provide counts of crime based on one spatial unit. Second, there may be specific spatial scales of interest for those performing the analysis; in such a research context, other spatial scales are simply not of interest or relevant. Third, the researchers may be interested in replicating (being consistent with) previous research that is only concerned with one spatial scale. Though not an exhaustive list, these examples do show that spatial scale is not necessarily being ignored by researchers. Barring situations such as these, we did expect to find the use of multiple scales of analysis in the geography of crime literature in order to investigate the role of the MAUP. However, we found that this is not the case.

For example, Wooldredge (2002), comparing census tracts to administrative neighborhoods, found that the substantive results for different spatial units of analysis are the same. This led Wooldredge (2002, 681) to refer to the “(ir)relevance of aggregation bias” in the context of the MAUP, for the geography of crime.3 Despite the increasing availability of crime data as points (addresses, street intersections, and x-y coordinates), aggregation will still be a concern as long as those using these data aggregate points in order to analyze crime relative to other data that are only available as area polygons, such as census data. More importantly for this issue, it is not that a small number of studies have found such a relationship, but that these studies, particularly Wooldredge (2002), are used as justification for only using one spatial unit of analysis. The use of Wooldredge (2002) for this purpose was picked up almost immediately (Schulenberg 2003), and continues in a variety of contexts (see, for example, Bernasco and Block 2009; Matthews et al. 2010; and Osgood and Anderson 2004).

Despite this rapid adoption of Wooldredge's (2002) conclusion, there is another side to this literature. Ouimet (2000) showed that using census tracts versus neighborhoods does impact the results; specifically, the choice of spatial aggregation impacts the theory that is supported by the data. More recently, and similar to Ouimet (2000), Hipp (2007) showed that explanatory variables exhibit different effects on crime and disorder based on the level of aggregation. Consequently, it is curious that Wooldredge (2002) is almost always cited to support the use of only one level of spatial aggregation.

We are in no way being critical of the work done by Wooldredge (2002) and others. In fact, for the case of Vancouver using the crime data described below, we find that the choice of spatial unit of analysis matters little for the substantive results of a spatial regression. Rather, we are asking whether we can simply dispense with multiple spatial units of analysis when studying the geography of crime. In other words, is there any spatial heterogeneity and does it matter? We are unable to find any research that quantifies the degree of spatial heterogeneity, so this is our task in this paper.

3. Data and Methods

Data for Vancouver, Canada and Leeds, England are used in the analysis below. We use data from these two cities for three reasons. First, these are the police data available to us for analysis. Second, we know these cities and are, therefore, able to make interpretations using local knowledge. Third, and most significantly, the inclusion of data from two different countries aids in our ability to make generalizations rather than relying on one set of data that may produce spurious results.

3.1. Vancouver and its Data

The Vancouver data used in the analysis below are for the years 1991 and 2001. The Vancouver Census Metropolitan Area (CMA) is the third largest metropolitan area in Canada, based on population (currently approximately 2 million people), and the largest metropolitan area in western Canada. In 2001, the City of Vancouver had a population of 546 000. In recent years, Vancouver has experienced substantial growth in its resident population: 431 000 in 1986, 472 000 in 1991, and 514 000 in 1996. This high rate of growth is often attributed to the 1986 World Exposition on Transportation and Communication that led to Vancouver receiving worldwide attention and is expected to continue because of the most recent 2010 Winter Olympics held in the Vancouver CMA. With an area of approximately 115 square kilometres, the City of Vancouver has 110 census tracts (CTs) and 990 dissemination areas (DAs), defined by Statistics Canada. Census tracts are relatively small and stable geographic areas that tend to have a population ranging from 2500 to 8000—the average is 4000 persons. Dissemination areas are smaller than census tracts, equivalent in size to a census block group in the U.S. census—approximately 400 to 700 persons, composed of one or more blocks.4

Though Vancouver has had a decreasing crime rate from 1991 – 2001, its crime rate remains substantially higher than the national average. In fact, the Vancouver CMA had the highest crime rates among the three largest metropolitan areas in Canada at 11 367 criminal code offences per 100 000 persons in 2001, more than doubling the rate found in Toronto (5381 per 100 000 persons) and almost doubling that in Montreal (6979 per 100 000 persons). The same relative standing held for the 2001 violent crime rate in the Vancouver CMA (1058 per 100 000 persons) in comparison to the Toronto CMA (882 per 100 000 persons) and the Montreal CMA (886 per 100 000 persons), but to a lesser degree. These differences in crime rates between these three cities have been decreasing in recent years (Kong 1997; Savoie 2002; Wallace 2003).

All crime data used below come from the Vancouver Police Department’s Calls for Service Database (VPD-CFS Database) generated by its Computer Aided Dispatch system. The VPD-CFS Database is the set of requests for police service made directly to the VPD, through the 911 Emergency Service and allocated to the VPD, and calls for service made by the VPD members while on patrol. The VPD-CFS Database contains information on both the location and the complaint code/description for each call. For each call, there are two codes: the initial complaint code and a complaint code filed by the officer on the scene. The code provided by the officer is always taken to be correct. Though the VPD-CFS Database is actually a proxy for actual crime data because not all calls for service represent actual crimes, the primary advantage of the VPD-CFS Database is this raw form—these data are not dependent on a criminal charge. It should be noted, however, that few calls for service are subsequently unfounded by the VPD. The crime classifications of assault, burglary, robbery, sexual assault, theft, theft of vehicle, and theft from vehicle are all analyzed below.

<Insert Table 1 About Here>

The counts and percentages of these crime classifications are presented in Table 1. In Vancouver, there has been a notable decrease in the counts of crime, consistent with the international crime drop phenomenon (Tseloni et al. 2010; Farrell et al. 2011). Despite this significant decrease in crime (31 percent drop), the distribution of the different crime classifications has remained rather constant; assault has experienced a decrease, with corresponding increases in theft of vehicle and theft from vehicle. Leeds has experienced an increase in crime from 2001 to 2004, just under 4 percent. However, it should be noted that there was a change in recording practices in Leeds between these time periods. Also, the two data sets for Leeds are only 3 years apart. As such, this increase in crime counts could simply be a result of year-to-year fluctuations. Regardless, aside from the assault classification and an increase in theft, the distribution of the different crime classifications has remained relatively constant.

3.2. Leeds and its Data

The Leeds data used in the analysis are for the years 2001 and 2004. Ideally the same time period would be used for both countries, but data constraints mean that the most reliable crime data in Leeds are only available for these years; here, reliability refers to the standardization of crime data, discussed below. Leeds is the third largest city in the UK, after London and Birmingham, with a population estimated to be approximately 812 000 in 2011 (Office for National Statistics 2010). Spatially, Leeds is the second largest city in the UK, covering an area of approximately 550 square kilometres. As a consequence of hosting two universities, Leeds has a very large student population that has had a strong influence on the development of the city in order to cater for a large number of student migrants. The student population is also highly concentrated in a relatively small area to the north of the universities, which has a substantial effect on crime patterns. Spatially, the Leeds area can be subdivided into 108 middle layer super output areas (MSOAs) and further into 2440 output areas (OAs). An output area is the smallest 2001 census geography available and contains a minimum of 40 households or 100 people, but the recommended size is approximately 125 households. MSOAs are a larger geography that contain 7 200 people on average and have been designed to fit the borders of OAs to allow for data aggregation.

In terms of crime, Leeds has generally followed the UK national trend and has seen consistent yearly reductions in most types of crime since 1997. Leeds has higher crime rates than the average for England and Wales, although this is not unexpected given its demographic and socioeconomic characteristics. The most unusual observation is that rates of residential burglary are particularly high: almost double the national average (18.4 crimes per 1 000 people compared to 9.6), and residential burglary has not exhibited the decline that the other types of crime have shown. The explanation for this is largely tied in with the effects of the student population, who generally suffer a disproportionate number of burglaries.

The crime data used in the analysis below consist of all crimes recorded by the police in the Leeds area. The data cover the time periods 1st April 2000 – 31st March 2001 (hereafter abbreviated to ‘2001’) and 1st April 2003 – 31st March 2004 (‘2004’). The data are coded by crime type and are stored with a location address that can be geocoded. There are numerous implications for using this type of data in research; namely, not all crime is reported to the police in the first place and, even if it is reported, the crime might not necessarily be recorded by the police. In fact, recording practices varied substantially across police forces, so to standardize them the National Crime Recording Standard was phased in through 2001 to 2004. The new standard followed a more “victim centred” approach so that a crime should be recorded even if there is no evidence that it has taken place. This led to an apparent increase in some types of crime, particularly violent crimes: assault and, to a lesser extent, sexual assault. The counts and percentages of these crime classifications are presented in Table 1.

3.3. Geocoding

Geocoding has the potential to introduce error into any analysis. Previous research has noted that geocoding algorithms are not only inaccurate at times, but are also at risk of not locating all street addresses or street intersections for (criminal) incidents (Ratcliffe 2001; Cayo and Talbot 2003; Zandbergen 2008). Consequently, potential for spatial bias is present. Ratcliffe (2004) addresses this issue through the identification of a minimum acceptable hit/success rate of 85 percent. The geocoding procedure used for the current data generated 93 and 94 percent success rates for 1991 and 2001 in Vancouver. In Leeds, the data were already geocoded by the police; although actual hit rates are not known, the data were put through an extensive cleaning process and the hit rate can be assumed to be sufficiently high. It should be noted that error may still be present: an incident may be geocoded to the wrong address or placed at a centroid, or a correct match may be aggregated to the incorrect spatial unit. With our success rates exceeding the minimum acceptable success rate generated by Ratcliffe (2004) and the indication that improper address records are random, the analysis is undertaken with little concern for spatial bias.

In addition to acceptable hit/success rates in geocoding, there are a number of potential problems that are country specific. In Vancouver – where the ‘block’ street system means that building location can be estimated from its number on a street – long streets may be arbitrarily broken into segments that are not based on intersections; events are placed on the street segment using an interpolation process that may place the event in the wrong place on the street segment; a geocoding match may be made on an areal unit and subsequently misplaced on the wrong street segment; and there is variation in street segment length that may skew the analysis.

In the UK, the street system is not regular so it is not possible to estimate a location based on a building number. Instead, a lookup table is used to match an address directly to some spatial coordinates. The Leeds data were then matched directly to the coordinates of the building at which the crime occurred, or they were assigned manually in places where no building was available to link to. The data were cleaned considerably (both manually and using computer software) before use so we are confident that geocoding issues will not influence the analysis.

Therefore, the Vancouver data used in the current analysis are geocoded to the street network and the Leeds data are geocoded directly to points. Both data sets are then aggregated to their respective census boundary units using a spatial join function.5 We use the same (most recent) street network or address lookup table in each city for geocoding different years of data; this avoids the problem of not being able to find new streets or buildings but has the potential of old roads being closed and old buildings being torn down. No such street closures occurred in Vancouver and, as mentioned above, if no building was present (from a possible tear-down) points were manually assigned to the spatial units of analysis. Though shorter street segments may have a lower probability of having a criminal event, ceteris paribus, the randomization process in the spatial point pattern test minimizes the potential for having less scope for change than longer street segments. Lastly, the interpolation issue of geocoding algorithms, whereby a point’s position on a street segment might be inaccurate, is not a concern here because no inference is made at a finer scale than the dissemination area (Vancouver) or the output area (Leeds).
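The aggregation of geocoded points to areal units can be sketched as follows. This is only an illustrative point-in-polygon count using a standard ray-casting test, not the GIS spatial join function used for the analysis; the function names, the two square areas, and the four incident coordinates are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices. Points lying exactly on an
    edge are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross the edge (x1, y1)-(x2, y2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_area(points, areas):
    """Count the points falling in each area (a dict of id -> polygon)."""
    counts = {area_id: 0 for area_id in areas}
    for x, y in points:
        for area_id, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                counts[area_id] += 1
                break  # assumption: areas do not overlap
    return counts


# Two hypothetical adjacent square areas and four hypothetical incidents.
areas = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
points = [(0.2, 0.5), (0.8, 0.1), (1.5, 0.5), (3.0, 3.0)]
counts = count_points_by_area(points, areas)
print(counts)
```

The last point falls outside both areas and is simply not counted. Because the same points can be counted against any set of polygons, the same incident data can be aggregated to both the smaller and the larger census units.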

3.4. The Spatial Point Pattern Test

In order to investigate spatial heterogeneity within larger spatial units of analysis (census tracts in Vancouver and MSOAs in Leeds), a testing methodology that identifies changes in spatial crime patterns at multiple scales is necessary. The spatial point pattern test developed by Andresen (2009) serves this purpose well because it can be used to independently identify changes in the spatial patterns of crime at different spatial scales and the output may then be used to quantify spatial heterogeneity. The change for each smaller spatial unit of analysis (DAs and OAs) can be assigned to its respective larger spatial unit of analysis (CTs and MSOAs) and then spatial heterogeneity (or homogeneity) can be assessed by counting the number (percentage) of smaller spatial units within their larger units of analysis that have the same classification of change. This spatial point pattern test has been applied to investigate pattern changes in international trade (Andresen 2010) and for testing the stability in crime patterns (Andresen and Malleson 2011).

The Andresen (2009) spatial point pattern test is area-based6 and is concerned with the similarity between two different spatial point patterns at the local level. This particular spatial point pattern test is not concerned with null hypotheses of random, uniform, or clustered distributions, but may be used to compare a particular point pattern with these distributions. An advantage of the test, as we demonstrate here, is that it can be calculated for different area boundaries using the same original point datasets. In order to simplify the process of calculating the test we developed a computer program that is freely available from the authors. The test is computed as follows:

  1. Nominate a base dataset (1991 assaults, for example) and count, for each area, the number of points that fall within it.

  2. From the test dataset (1996 assaults, for example), randomly sample 85 percent of the points, with replacement.7 As with the previous step, count the number of points within each area using the sample. This is effectively a bootstrap created by sampling from the test dataset.

  3. Repeat (2) a number of times (in our analysis below we used 200 iterations).

  4. For each area in the test data set, calculate the percentage of crime that has occurred in the area. Use these percentages to generate a 95 percent nonparametric confidence interval by removing the top and bottom 2.5 percent of all counts (5 from the top and 5 from the bottom in this case). The minimum and maximum of the remaining percentages represent the confidence interval. It should be noted that the effect of the sampling procedure will be to reduce the number of observations in the test dataset but, by using percentages rather than the absolute counts, comparisons between data sets can be made even if the total number of observations is different.

  5. Calculate the percentage of points within each area for the base dataset and compare this to the confidence interval generated from the test dataset. If the base percentage falls within the confidence interval then the two datasets exhibit a similar proportion of points in the given area. Otherwise they are significantly different.8

The purpose of this spatial point pattern test is to create variability in one dataset so that it can be compared statistically to another dataset. Each of the 85 percent samples maintains the spatial pattern of the test dataset, and together they allow a “confidence interval” to be created for each spatial unit that may be compared to the base dataset. Therefore, statistically significant changes/differences are identified at the local level.
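The five steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program made available by the authors: each point is assumed to have already been assigned to an area, so a dataset is just a list of area identifiers, and the function names, the seed, and the example data are hypothetical. The sketch reports only whether each area is similar, not the direction of any change.

```python
import random
from collections import Counter


def area_percentages(area_ids, areas):
    """Percentage of points falling in each area."""
    counts = Counter(area_ids)
    total = len(area_ids)
    return {area: 100.0 * counts[area] / total for area in areas}


def point_pattern_test(base_ids, test_ids, areas, sample_frac=0.85,
                       iterations=200, seed=0):
    """Return {area: bool}; True where the base percentage lies in the interval."""
    rng = random.Random(seed)
    sample_size = round(sample_frac * len(test_ids))
    # Step 1 (and step 5): percentage of base points in each area.
    base_pct = area_percentages(base_ids, areas)

    # Steps 2-3: repeated 85 percent samples of the test data, with replacement.
    simulated = {area: [] for area in areas}
    for _ in range(iterations):
        sample = rng.choices(test_ids, k=sample_size)
        for area, pct in area_percentages(sample, areas).items():
            simulated[area].append(pct)

    # Step 4: drop the top and bottom 2.5 percent of the simulated percentages.
    trim = round(iterations * 0.025)
    similar = {}
    for area in areas:
        kept = sorted(simulated[area])[trim:iterations - trim]
        lower, upper = kept[0], kept[-1]
        # Step 5: similar if the base percentage falls inside the interval.
        similar[area] = lower <= base_pct[area] <= upper
    return similar


# Hypothetical data: 100 points per dataset spread over three areas.
areas = ["A", "B", "C"]
base_ids = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
shifted_ids = ["A"] * 20 + ["B"] * 30 + ["C"] * 50

unchanged = point_pattern_test(base_ids, base_ids, areas)
changed = point_pattern_test(base_ids, shifted_ids, areas)
print(unchanged)
print(changed)
```

Because the function takes area identifiers rather than coordinates, the same point data can be tested at two spatial scales simply by assigning the points to two different sets of areas.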

The output of the test consists of two parts. First, there is a global parameter that ranges from 0 (no similarity) to 1 (perfect similarity): the index of similarity, S, is calculated as:

$$S = \frac{\sum_{i=1}^{n} s_i}{n},$$

where $s_i$ is equal to one if the two point patterns are similar in spatial unit $i$ and zero otherwise, and $n$ is the total number of spatial units. As such, the S-Index represents the proportion of spatial units that have a similar spatial pattern within both data sets. Second, the test generates mappable output to show where statistically significant change occurs; i.e. which census tracts, dissemination areas, middle layer super output areas, and output areas have undergone a statistically significant change. Though this spatial point pattern test is not a local indicator of spatial association (LISA, see Anselin 1995) and there is much more to LISA than being able to produce maps of results, it is in the spirit of LISA because the output may be mapped.9
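Since the index is simply the proportion of spatial units flagged as similar, it can be computed directly from the per-unit results. The function name and the four example flags below are hypothetical.

```python
def similarity_index(similar_flags):
    """Proportion of spatial units flagged as similar (indicator equal to one)."""
    flags = list(similar_flags)
    if not flags:
        raise ValueError("at least one spatial unit is required")
    return sum(1 for flag in flags if flag) / len(flags)


# Four hypothetical spatial units, three of which are similar.
s_index = similarity_index([True, False, True, True])
print(s_index)  # 0.75
```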

A number of tests for similarity are performed. For each crime classification and each spatial unit of analysis, indices of similarity are calculated for 1991 – 2001 (Vancouver) and 2001 – 2004 (Leeds). These indices are then used to quantify the degree of spatial heterogeneity present with the changes of the spatial point patterns at the different scales of analysis.
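The quantification of spatial heterogeneity described in this section amounts to a count within each larger unit. The sketch below assumes that every unit has already been given one of three classification labels and that each smaller unit is nested in exactly one larger unit; the labels, unit identifiers, and function name are hypothetical and the example values are invented.

```python
def heterogeneity_summary(large_class, small_class, parent_of):
    """Percent of nested small units sharing each large unit's classification.

    Assumes every large unit contains at least one small unit.
    """
    shares = {}
    for large_id, label in large_class.items():
        children = [s for s, parent in parent_of.items() if parent == large_id]
        same = sum(1 for s in children if small_class[s] == label)
        shares[large_id] = 100.0 * same / len(children)
    return shares


# Hypothetical classifications of change for three larger units (CTs)
# and the eight smaller units (DAs) nested within them.
large_class = {"CT1": "increase", "CT2": "insignificant", "CT3": "increase"}
small_class = {
    "DA1": "increase", "DA2": "decrease", "DA3": "insignificant",
    "DA4": "insignificant", "DA5": "insignificant", "DA6": "decrease",
    "DA7": "insignificant", "DA8": "decrease",
}
parent_of = {
    "DA1": "CT1", "DA2": "CT1", "DA3": "CT1", "DA4": "CT1",
    "DA5": "CT2", "DA6": "CT2",
    "DA7": "CT3", "DA8": "CT3",
}

shares = heterogeneity_summary(large_class, small_class, parent_of)
average_share = sum(shares.values()) / len(shares)
none_same = sum(1 for share in shares.values() if share == 0.0)
all_same = sum(1 for share in shares.values() if share == 100.0)
print(shares, average_share, none_same, all_same)
```

The three summary values correspond to the quantities discussed in the results: the average percentage of smaller areas with the same classification as their larger area, the number of larger areas with zero such smaller areas, and the number of larger areas in which all smaller areas share the classification.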

4. Results

Before we turn to the results for the examination of spatial heterogeneity, it is important to examine the Indices of Similarity within each of the different spatial units of analysis for each crime classification. These results are presented in Table 2. In the case of Vancouver, census tracts do not exhibit much similarity over time, with most values of S being less than 0.300; the results for the dissemination areas are most often more similar over time, close to twice that of census tracts in half of the crime classifications. Noteworthy here is the high degree of similarity for sexual assault, especially within census tracts; sexual assault also has the same proportion of criminal events in both years. In the case of Leeds, the S indices are most often of much greater magnitude than in Vancouver. However, this is expected as the time frame for the Leeds crime data is much shorter: 3 years instead of 10 years. Aside from the crime classifications for robbery and sexual assault, the S values for the output areas are similar to those for the middle layer super output areas.

<Insert Table 2 About Here>

With regard to spatial heterogeneity, the results are remarkably similar across not only crime classifications but also across municipalities. In the case of assault (Table 3), the number of larger areas with zero smaller areas having the same classification is always zero in both Vancouver and Leeds. This is the expected result. In fact, if (when) a larger area has no smaller areas with the same classification, it is highly problematic; such a situation is further discussed below. However, the number of larger areas with all smaller areas having the same classification is also zero in most cases (all cases in Leeds). When this does occur (2001 < 1991 and insignificant change, in Vancouver), it occurs in very few cases. Overall, the average percentage of smaller areas with the same larger area classification is surprisingly low. The best case scenario, for both Vancouver and Leeds, is that a little more than half of the smaller areas have the same larger area classification. Though this may be viewed positively, it also means that a little less than half do not have the same classification. This is a substantial degree of spatial heterogeneity that must be considered when inference is being made at only one level of analysis. The results for burglary (Table 4) are similar to those for assault and require little further discussion. The primary result to note here is that assault and burglary have similar results despite these two crime classifications exhibiting different patterns over time in Table 1: relatively speaking, assault is decreasing in Vancouver and increasing in Leeds, but burglary in both cities is constant. As such, the degree of spatial heterogeneity does not necessarily depend on other changes in a crime's distribution.

<Insert Table 3 and 4 About Here>

Robbery (Table 5) and sexual assault (Table 6) show similar results for the average percentage of smaller areas with the same larger area classification, but some of the other results are worthy of note. In both Leeds and Vancouver, robbery and sexual assault have some larger areas with zero smaller areas having the same classification: 2001 > 1991, for both cases in Vancouver, and insignificant change for both cases in Leeds. Such a result is particularly problematic because the nature of the spatial heterogeneity is such that the smaller spatial units of analysis have nothing in common with the larger spatial units of analysis. A problem emerges here specifically in the context of policy. If policy is being implemented based on global results and the larger area is used as a reference point for policy implementation, the policy may be applied in error. This will lead to a misallocation of resources, at best, or aggravate the original situation that policy-makers are trying to correct, at worst.

<Insert Table 5 and 6 About Here>

Turning to the three classifications of theft, namely theft (Table 7), theft of vehicle (Table 8), and theft from vehicle (Table 9), the results are more promising in terms of the magnitude of spatial heterogeneity within the larger spatial units. The average percentages of smaller areas with the same larger area classification are of the same magnitude as the other crime classifications. Though theft from vehicle (Leeds) and theft of vehicle (Vancouver) do have a small number of larger areas with zero small areas with the same classification, Vancouver has promising results for the number of larger areas with all corresponding small areas having the same classification. The magnitudes of the percentages are not that great, ranging from 2.8 (theft of vehicle) to 18.8 (theft from vehicle) percent, but this is a definite improvement over the results for the other crime classifications.

<Insert Table 7, 8 and 9 About Here>

5. Discussion

In this paper we have investigated the phenomenon of spatial heterogeneity in the context of spatial point patterns changing over time. Though this is only one dimension of change that may be investigated, the results are strong enough to cause some concern over the lack of sensitivity analyses in the geography of crime literature, that is, the lack of analyses using multiple spatial scales. The general result is that, on average, approximately one-half of smaller spatial units of analysis have the same classification as their larger counterparts. Though this may translate into an irrelevant effect when using a global statistical technique, as it does using the data in the current analysis, the magnitude of the spatial heterogeneity cannot be ignored. Therefore, an irrelevant effect in a particular context, despite the presence of spatial heterogeneity, does not mean that no aggregation biases are present, generally speaking. As such, we as researchers cannot simply assume that aggregation bias is not present and only perform analyses at one spatial scale because a small number of research projects have not found evidence for aggregation bias; aggregation bias is present, it just does not manifest itself in particular contexts using particular techniques.

The case of sexual assault in Vancouver is of particular interest here. Figure 1 shows the results from the spatial point pattern test. All four census tracts shown in Figure 1 have statistically significant increases, 2001 > 1991. The two middle census tracts are likely representative of the presence of spatial heterogeneity: some DAs exhibit increasing trends, some DAs exhibit decreasing trends, and some DAs exhibit insignificant change. In these cases, a small number of DAs (one in the case of the CT on top of the map) are driving the results for the larger CTs. However, for the CTs on either side of Figure 1, there is clearly something else going on. In each case, there are no DAs that exhibit statistically significant increasing trends; rather, most have statistically insignificant change, with a small number of decreasing trends. How can this be the case?

<Insert Figure 1 About Here>

As it turns out for the CTs on the sides of Figure 1, there are DAs with statistically insignificant changes that have increasing trends. And these increasing trends are close to being statistically significant; if a 90 percent confidence interval had been chosen, for example, the results of those DAs would have been statistically significant and increasing. But the point of this discussion is not in regard to the choice of statistical significance. Rather, the point is that insignificant changes at the level of a smaller spatial unit of analysis may become statistically significant with a larger spatial unit of analysis. In other words, there is an aggregation effect.
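The aggregation effect described here can be illustrated with simple arithmetic. The sketch below does not use the spatial point pattern test employed in this paper; as a stand-in, it assumes that the counts in each period are independent Poisson variables and uses a normal approximation for their difference. The counts are invented, and the function name poisson_change_z is hypothetical.

```python
import math


def poisson_change_z(count_before, count_after):
    """Approximate z-score for the change between two independent Poisson counts.

    Assumption: the variance of the difference is the sum of the two counts,
    which is a normal approximation and not the test used in the paper.
    """
    return (count_after - count_before) / math.sqrt(count_before + count_after)


CRITICAL_95 = 1.96  # two-sided critical value at the 95 percent level

# Invented counts for five smaller areas nested in one larger area.
small_before = [10, 10, 10, 10, 10]
small_after = [15, 15, 15, 15, 15]

# Each smaller area is assessed on its own.
small_z = [poisson_change_z(b, a) for b, a in zip(small_before, small_after)]

# The larger area is assessed on the summed counts.
large_z = poisson_change_z(sum(small_before), sum(small_after))

print("smaller areas significant:", [abs(z) > CRITICAL_95 for z in small_z])
print("larger area significant:", abs(large_z) > CRITICAL_95)
```

Under these assumptions each smaller area has a z-score of 1.0, which is below 1.96, whereas the larger area has a z-score of about 2.24. Increases that are individually insignificant become statistically significant once they are pooled.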

Comparing the results from Tables 3 to 9 to Table 1, an interesting relationship emerges. The crime classifications that had the most problematic results (robbery and sexual assault) had the lowest counts and percentages for both Vancouver and Leeds, and the crime classifications that had the most promising results (burglary, theft, and theft from vehicle) had the greatest counts and percentages for both Vancouver and Leeds; theft of vehicle also had promising results. Therefore, it would appear that if the event is more common, the results are less problematic. This does not mean that spatial heterogeneity is not an issue when there are more crimes, just that the issue does not appear to be as great. This result relates to the discussion above regarding the ecological fallacy. Variations in spatial patterns may be more evident when the count of points in the spatial pattern is lower. Such a situation is understood intuitively: a spatial pattern with fewer points is more likely to have zero values in spatial units, leading to more spatial heterogeneity within larger spatial units. This is confirmed in Table 2 for Vancouver, which has the highest S-Index values for the low-count crime classifications of robbery and sexual assault. Therefore, the degree of concern for spatial heterogeneity should be inversely related to the number of points in the spatial pattern. Consequently, if an analysis (for the purposes of pure academic interests, policy, or a combination of both) is restricted to one spatial unit of analysis, results may have to be tempered depending on the number of points under analysis.
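The intuition that fewer points produce more empty spatial units can be made concrete under a strong simplifying assumption: that each point falls independently and with equal probability in any of the spatial units. Crime is not distributed this way, so the sketch below is illustrative only. The counts are the Vancouver 2001 values for sexual assault and theft from vehicle in Table 1; the number of spatial units (1,000) and the function name expected_share_empty are hypothetical.

```python
def expected_share_empty(n_points, n_units):
    """Expected share of spatial units containing zero points.

    Assumption: each of n_points lands independently and uniformly in one of
    n_units, so a given unit is missed by every point with probability
    (1 - 1/n_units) ** n_points.
    """
    return (1.0 - 1.0 / n_units) ** n_points


N_UNITS = 1000  # hypothetical number of smaller spatial units

# Counts taken from Table 1 (Vancouver, 2001).
sexual_assault_2001 = 440
theft_from_vehicle_2001 = 16991

share_low_count = expected_share_empty(sexual_assault_2001, N_UNITS)
share_high_count = expected_share_empty(theft_from_vehicle_2001, N_UNITS)

print(f"low-count crime, expected share of empty units: {share_low_count:.3f}")
print(f"high-count crime, expected share of empty units: {share_high_count:.2e}")
```

Under these assumptions roughly 64 percent of units would be expected to contain no sexual assaults, whereas essentially no units would be empty for theft from vehicle.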

There are a number of obvious directions for future research. Though we have performed this analysis in two municipalities that are quite distant from one another, more replication is always preferable. We claim that too often research relies on a small number of other studies that claim aggregation bias is minimal or non-existent as justification for only performing analyses at one spatial scale. Consequently, we wish to be careful with our generalizations. The form of replication needs to be varied as well. Further investigations into spatial heterogeneity should be undertaken not only in other urban areas, but in suburban and rural areas as well. Because rural areas tend to have less crime than urban areas, spatial heterogeneity may be more of a problem in rural areas than in suburban and urban areas. Similarly, the crime mix likely varies across urban, suburban, and rural areas, and so, too, may the issue of spatial heterogeneity. The more context we have regarding spatial heterogeneity, the better choices we can make regarding spatial scale.

Though the current analysis is instructive, spatial heterogeneity should also be quantified in different ways. For example, it would be most useful to investigate spatial heterogeneity in the context of standard spatial theories of crime. It may be the case that a small number of small areas (DAs/OAs) are driving the results for their aggregate areas (CTs/MSOAs). Specifically, once we have more information regarding the role of spatial scale and spatial heterogeneity we may be able to further develop/refine/test spatial theories of crime. Not only may a small number of small areas be driving aggregate results, but the way we think about particular theoretical frameworks may change.

References

Andresen, M. A. (2006). A spatial analysis of crime in Vancouver, British Columbia: a

synthesis of social disorganization and routine activity theory. Canadian

Geographer, 50(4), 487 - 502.

Andresen, M. A. (2009). Testing for similarity in area-based spatial patterns: a

nonparametric Monte Carlo approach. Applied Geography, 29(3), 333 - 345.

Andresen, M. A. (2010). Canada - United States interregional trade: quasi-points and

spatial change. Canadian Geographer, 54(2), 139 - 157.

Andresen, M. A., & Malleson, N. (2011). Testing the stability of crime patterns:

implications for theory and policy. Journal of Research in Crime and

Delinquency, 48(1), 58 – 82.

Anselin, L. (1995). Local indicators of spatial association – LISA. Geographical

Analysis, 27(2), 93 – 115.

Bernasco, W., & Block, R. (2009). Where offenders choose to attack: a discrete choice

model of robberies in Chicago. Criminology, 47(1), 93 – 130.

Blalock, H. M. (1964). Causal inferences in nonexperimental research. Chapel Hill, NC:

University of North Carolina Press.

Burgess, E. W. (1916). Juvenile delinquency in a small city. Journal of the American

Institute of Criminal Law and Criminology, 6(5), 724 – 728.

Cayo, N. R., & Talbot, T. O. (2003). Positional error in automated geocoding of

residential addresses. International Journal of Health Geographics, 2(1), 1 – 12.

Clark, W. A. V, & Avery, K.L. (1976). The effects of data aggregation in statistical

analysis. Geographical Analysis, 8(4), 428 – 438.

Farrell, G., Tseloni, A., Mailley, J., and Tilley, N. (2011). The crime drop and the

security hypothesis. Journal of Research in Crime and Delinquency 48(2), 147 –

175.

Fotheringham, A. S., & Wong, D.W. (1991). The modifiable areal unit problem in

multivariate statistical analysis. Environment and Planning A, 23(7), 1025 – 1044.

Gehlke, C.E., & Biehl, K. (1934). Certain effects of grouping upon the size of the

correlation coefficient in census tract material. Journal of the American Statistical

Association Supplement, 29(185), 169 – 170.

Glyde, J. (1856). Localities of crime in Suffolk. Journal of the Statistical Society of

London, 19(2), 102 – 106.

Guerry, A-M. (1833). Essai sur la statistique morale de la France. Paris:

Crochard.

Hipp, J. R. (2007). Block, tract, and levels of aggregation: neighborhood structure and

crime and disorder as a case in point. American Sociological Review, 72(5), 659 –

680.

Kong, R. (1997). Canadian crime statistics, 1996. Ottawa, ON: Statistics Canada,

Canadian Centre for Justice Statistics.

Land, K. C., McCall, P. L., & Cohen, L. E. (1990). Structural covariates of homicide

rates: are there any invariances across time and social space? American Journal of

Sociology, 95(4), 922 – 963.

Lloyd, C. D. (2011). Local models for spatial analysis (second edition). Boca Raton, FL:

CRC Press, Taylor & Francis Group.

Matthews, S. A., Yang, T-C., Hayslett, K. L., & Ruback, R. B. (2010). Built environment

and property crime in Seattle, 1998 – 2000: a Bayesian analysis. Environment and

Planning A, 42(6), 1403 – 1420.

Office for National Statistics (2010). 2008-based Subnational Population Projections for England. Newport: Office for National Statistics. Report available on-line http://www.ons.gov.uk/ons/ [accessed July 2011]

Openshaw, S. (1984a). The modifiable areal unit problem. CATMOG (Concepts and

Techniques in Modern Geography) 38. Norwich: Geo Books.

Openshaw, S. (1984b). Ecological fallacies and the analysis of areal census data.

Environment and Planning A, 16(1), 17 – 31.

Osgood, D. W., & Anderson, A.L. (2004). Unstructured socializing and rates of

delinquency. Criminology, 42(3), 519 – 549.

Ouimet, M. (2000). Aggregation bias in ecological research: how social disorganization

and criminal opportunities shape the spatial distribution of juvenile delinquency in

Montreal. Canadian Journal of Criminology, 42(2), 135 – 156.

Quetelet, L. A. J. (1831) [1984]. Research on the propensity for crime at different

ages. Cincinnati, OH: Anderson Publishing.

Quetelet, L. A. J. (1842). A treatise on man and the development of his faculties.

Edinburgh: W. and R. Chambers.

Ratcliffe, J. H. (2001). On the accuracy of TIGER type geocoded address data in relation

to cadastral and census areal units. International Journal of Geographical

Information Science, 15(5), 473 – 485.

Ratcliffe, J. H. (2004). Geocoding crime and a first estimate of a minimum acceptable hit

rate. International Journal of Geographical Information Science, 18(1), 61 – 72.

Robinson, W. S. (1950). Ecological correlations and the behavior of individuals.

American Sociological Review, 15(3), 351 – 357.

Savoie, J. (2002). Crime statistics in Canada, 2001. Ottawa. ON: Statistics Canada,

Canadian Centre for Justice Statistics.

Schulenberg, J. L. (2003). The social context of police discretion with young offenders:

an ecological analysis. Canadian Journal of Criminology and Criminal Justice,

45(2), 127 – 157.

Shaw, C. R., & McKay, H. D. (1931). Social factors in juvenile delinquency.

Washington, DC: U.S. Government Printing Office.

Shaw, C. R., & MacKay, H. D. (1942). Juvenile delinquency and urban areas: A study of

rates of delinquency in relation to differential characteristics of local

communities in American cities. Chicago, IL: University of Chicago Press.

Sherman, L. W., Gartin, P. R., & Buerger, M. E. (1989). Hot spots of predatory crime:

routine activities and the criminology of place. Criminology, 27(1), 27 – 56.

Tseloni, A., Mailley, J., Farrell, G., and Tilley, N. (2010). Exploring the international

decline in crime rates. European Journal of Criminology 7(5), 375 – 394.

Wallace, M. (2003). Crime statistics in Canada, 2002. Ottawa, ON: Statistics Canada,

Canadian Centre for Justice Statistics.

Weisburd, D., Bernasco,W., & Bruinsma, G. J. N. (2009). Putting crime in its place:

Units of analysis in geographic criminology. New York, NY: Springer.

Weisburd, D., Bushway, S., Lum, C., & Yang, S-M. (2004). Trajectories of crime at

places: a longitudinal study of street segments in the City of Seattle. Criminology,

42(2), 283 – 321.

Wooldredge, J. (2002). Examining the (ir)relevance of aggregation bias for multilevel

studies of neighborhoods and crime with an example of comparing census tracts

to official neighborhoods in Cincinnati. Criminology, 40(3), 681 – 709.

Zandbergen, P. A. (2008). A comparison of address point, parcel and street geocoding

techniques. Computers, Environment and Urban Systems, 32(3), 214 – 232.

Figure 1. Sexual Assault, Census Tract to Dissemination Area

Table 1. Counts and Percentages for Crime Types

Vancouver

| Crime type | 1991 Count | 1991 Percent | 2001 Count | 2001 Percent |
|---|---|---|---|---|
| Assault | 16556 | 20.1 | 7643 | 13.4 |
| Burglary | 18068 | 22.0 | 13025 | 22.9 |
| Robbery | 1421 | 1.7 | 1251 | 2.2 |
| Sexual Assault | 672 | 0.8 | 440 | 0.8 |
| Theft | 16862 | 20.5 | 11255 | 19.8 |
| Theft of Vehicle | 5957 | 7.2 | 6273 | 11.0 |
| Theft from Vehicle | 22728 | 27.6 | 16991 | 29.9 |
| Total | 82264 | 100.0 | 56878 | 100.0 |

Leeds

| Crime type | 2001 Count | 2001 Percent | 2004 Count | 2004 Percent |
|---|---|---|---|---|
| Assault | 5830 | 9.3 | 15323 | 23.4 |
| Burglary | 13610 | 21.6 | 13838 | 21.1 |
| Robbery | 2289 | 3.6 | 1985 | 3.0 |
| Sexual Assault | 493 | 0.8 | 807 | 1.2 |
| Theft | 22043 | 34.9 | 27458 | 41.9 |
| Theft of Vehicle | 9098 | 14.4 | 7354 | 11.2 |
| Theft from Vehicle | 15493 | 24.6 | 14167 | 21.6 |
| Total | 63026 | 100.0 | 65609 | 100.0 |

Note. The substantial increase for assaults in Leeds is due to a change in recording practices.

Table 2. Indices of Similarity

Vancouver, 1991 - 2001

| Crime type | Census Tracts | Dissemination Areas |
|---|---|---|
| Assault | 0.300 | 0.335 |
| Burglary | 0.155 | 0.299 |
| Robbery | 0.327 | 0.662 |
| Sexual Assault | 0.509 | 0.691 |
| Theft | 0.136 | 0.237 |
| Theft of Vehicle | 0.300 | 0.332 |
| Theft from Vehicle | 0.146 | 0.261 |

Leeds, 2001 - 2004

| Crime type | Middle Layer Super Output Areas | Output Areas |
|---|---|---|
| Assault | 0.769 | 0.639 |
| Burglary | 0.667 | 0.726 |
| Robbery | 0.667 | 0.283 |
| Sexual Assault | 0.667 | 0.148 |
| Theft | 0.722 | 0.701 |
| Theft of Vehicle | 0.898 | 0.677 |
| Theft from Vehicle | 0.796 | 0.718 |

Table 3. Spatial heterogeneity, assault

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of CTs with all DAs having the same classification | 0 | 1 (2.2) | 4 (12.1) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.35 | 0.61 | 0.43 |
| Total number of CTs with this classification | 31 | 46 | 33 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of MSOAs with all OAs having the same classification | 0 | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.58 | 0.43 | 0.27 |
| Total number of MSOAs with this classification | 48 | 35 | 25 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 4. Spatial heterogeneity, burglary

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 0 | 0 | 1 (5.9) |
| Number (percentage) of CTs with all DAs having the same classification | 1 (2.4) | 0 | 3 (17.6) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.50 | 0.57 | 0.45 |
| Total number of CTs with this classification | 41 | 52 | 17 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of MSOAs with all OAs having the same classification | 0 | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.66 | 0.49 | 0.14 |
| Total number of MSOAs with this classification | 55 | 37 | 16 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 5. Spatial heterogeneity, robbery

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 3 (11.1) | 0 | 0 |
| Number (percentage) of CTs with all DAs having the same classification | 0 | 0 | 8 (22.2) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.21 | 0.36 | 0.74 |
| Total number of CTs with this classification | 27 | 47 | 36 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 2 (5.6) |
| Number (percentage) of MSOAs with all OAs having the same classification | 1 (2.4) | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.61 | 0.39 | 0.35 |
| Total number of MSOAs with this classification | 42 | 30 | 36 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 6. Spatial heterogeneity, sexual assault

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 7 (43.8) | 0 | 0 |
| Number (percentage) of CTs with all DAs having the same classification | 0 | 0 | 8 (14.3) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.11 | 0.31 | 0.69 |
| Total number of CTs with this classification | 16 | 38 | 56 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 5 (13.9) |
| Number (percentage) of MSOAs with all OAs having the same classification | 11 (37.9) | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.53 | 0.51 | 0.35 |
| Total number of MSOAs with this classification | 29 | 43 | 36 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 7. Spatial heterogeneity, theft

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of CTs with all DAs having the same classification | 2 (8.7) | 5 (6.9) | 2 (13.3) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.46 | 0.70 | 0.45 |
| Total number of CTs with this classification | 23 | 72 | 15 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of MSOAs with all OAs having the same classification | 0 | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.53 | 0.36 | 0.29 |
| Total number of MSOAs with this classification | 45 | 33 | 30 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 8. Spatial heterogeneity, theft of vehicle

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 0 | 0 | 1 (3.0) |
| Number (percentage) of CTs with all DAs having the same classification | 1 (2.8) | 2 (4.9) | 3 (9.1) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.42 | 0.59 | 0.42 |
| Total number of CTs with this classification | 36 | 41 | 33 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of MSOAs with all OAs having the same classification | 0 | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.71 | 0.52 | 0.11 |
| Total number of MSOAs with this classification | 56 | 41 | 11 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.

Table 9. Spatial heterogeneity, theft from vehicle

Vancouver

| Measure | Census tracts, 2001 > 1991 | Census tracts, 2001 < 1991 | Census tracts, insignificant change |
|---|---|---|---|
| Number (percentage) of CTs with zero DAs having the same classification | 0 | 0 | 0 |
| Number (percentage) of CTs with all DAs having the same classification | 1 (5.0) | 3 (4.1) | 3 (18.8) |
| Average percentage of DAs with same CT classification (as a proportion) | 0.46 | 0.67 | 0.43 |
| Total number of CTs with this classification | 20 | 74 | 16 |

Leeds

| Measure | MSOAs, 2004 > 2001 | MSOAs, 2004 < 2001 | MSOAs, insignificant change |
|---|---|---|---|
| Number (percentage) of MSOAs with zero OAs having the same classification | 0 | 1 (2.6) | 0 |
| Number (percentage) of MSOAs with all OAs having the same classification | 0 | 0 | 0 |
| Average percentage of OAs with same MSOA classification (as a proportion) | 0.68 | 0.53 | 0.24 |
| Total number of MSOAs with this classification | 48 | 38 | 22 |

Notes. CTs – census tracts; DAs – dissemination areas; MSOAs – middle layer super output areas; OAs – output areas; total CTs = 110; total MSOAs = 108.
