
Connecting the Corrupt: Data Sources to Study Networks of Serious Financial Crime in the United Kingdom

Diviák, T., & Lord, N. (in press). Connecting the Corrupt: Data Sources to Study Networks of Serious Financial Crime in the United Kingdom. In L. Huey & D. Buil-Gil (Eds.), The Crime Data Handbook. Bristol: Policy Press.

Published on Feb 07, 2024


Social network analysis (SNA) is an approach concerned with analysing networks of relations and interactions among a defined set of actors. In recent years, SNA has become known as a useful tool for analysing a wide range of criminal networks including networks of serious financial crime. However, using SNA in the study of crime is hindered by the aim of actors involved in these networks to conceal their interactions, which complicates data collection. These complications stem from issues with data availability, validity, and reliability. To tackle these issues, we first introduce a framework for thinking about six aspects of network data collection: nodes, ties, attributes, levels, dynamics, and context. In the light of this framework, we subsequently review three types of data sources usable for analysing financial crime networks in the context of the United Kingdom. These data sources are documents accompanying Deferred Prosecution Agreements, enforcement case files, and commercial transaction data. We illustrate the contents of each of these data sources together with their potential for extracting network data and the types of conclusions that can be drawn from analysing them. These data sources share common problems in being of a secondary, non-scientific nature and in being prone to missing information. In conclusion, we illustrate further uses of SNA and possible extensions of the introduced data sources to other types of criminal networks and jurisdictions beyond the United Kingdom.

Corresponding author: Tomáš Diviák. Department of Criminology and Mitchell Centre for Social Network Analysis, University of Manchester, United Kingdom.


Serious financial crimes, in particular varied forms of white-collar, organizational and organized crimes, are all intrinsically relational; that is, they are fundamentally embedded in relations and interactions between actors who participate in them (Morselli, 2009; Campana, 2016). For instance, actors exchange resources in the form of money or illegal commodities such as drugs, they communicate with one another or they collaborate in order to reach their common goals. Analysing how actors in such crimes interact with each other is thus crucial both for scholarly understanding as well as for designing intervention and prevention measures to combat them.

Social network analysis (SNA; Robins, 2015; Borgatti et al, 2022) is an approach specifically developed to analyse networks of ties that capture the relations and interactions among a set of actors. In criminology, SNA has been applied to study criminal networks, in which criminal actors cooperate or communicate in committing crimes and avoiding detection, as in the cases of organized crime, gangs or terrorism (Morselli, 2009, 2014). Applying SNA to criminal networks allows researchers to describe the structure of a network, uncover key actors within it or their closely interconnected groupings or to model its formation and temporal dynamics.

The fact that actors involved in criminal networks strive to remain undetected (predominantly by law enforcement) makes it difficult to collect data. For this reason, as well as for ethical reasons, primary data collection is usually impossible, and so researchers in this area have to rely on various secondary sources of information from which to extract network data (Diviák, 2019; Bright et al, 2021). Whether researchers work with police sources, court and prosecution files or media reports, they are likely to encounter a wide array of issues with the validity, reliability and accessibility of the data.

Social network analysis and social network data

A network is defined as a set of nodes, in criminal networks typically representing actors, and ties among them representing their relations and interactions (compare Robins, 2015; Borgatti et al, 2022). This is the minimum necessary information to conduct SNA, although the analysis may not stop there and further information may be taken into account to provide a richer picture of the analysed case. Here, we will distinguish six different aspects of network datasets relevant for analysis of criminal networks: nodes; attributes; ties; dynamics; modes; and context (Diviák, 2019).


Defining the node set is done by specifying the boundaries of the network (Laumann et al, 1983) wherein researchers either let the actors themselves define the boundaries (realist approach) or define a criterion which they use for inclusion of nodes in the network (nominalist approach). As primary data collection is usually impossible in this area of study, criminal network researchers mostly rely on the nominalist approach. The criteria used for doing so may vary, based on the research question and on the available information, from including every person mentioned in the data source to only those convicted in the given case (Morselli, 2009).


Nodes can be further described by their attributes – individual variables denoting various relevant traits of actors such as their gender, age or specific skills and so on. Much like in usual data analysis, the choice of which attributes to analyse and how depends on the research question and available information. In criminal network analysis, the availability of some attributes can be a serious obstacle. Researchers may, for example, be interested in the role of skills in law or accounting when analysing fraud or white-collar crime, but information about these attributes is not always present in the source, making such research questions impossible to answer.


Ties (or edges) represent the connections between the nodes, their relations or interactions. Ties among the chosen set of nodes are established by extracting the information from the source, which may be in the form of a log of contacts (police surveillance or intercepted communication), textual summary (prosecution or judicial documents) or records of transactions (electronic record of payments between accounts). Ties can be either directed (that is, having a direction from one actor to another, such as when A calls B) or undirected (A and B meeting together). Furthermore, ties can also have different strengths, ranging from binary ties (indicating only presence/absence of a tie) to numeric values representing, for instance, the number of times the two actors met or the volume of money exchanged between them. Different types of ties can also be distinguished based on the different types of relations they represent, such as when one makes a distinction between communication and exchange of resources. In criminal networks we may, for instance, want to distinguish between criminal ties (that is, collaboration on carrying out criminal tasks) and non-criminal ties (for example, kinship or friendship, cf Smith and Papachristos, 2016).
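To make these distinctions concrete, the following minimal Python sketch stores a few interaction records as directed, valued ties and then collapses them into undirected and binary ties. The actor labels and the records themselves are invented for illustration and do not come from any real case.

```python
from collections import Counter

# Hypothetical interaction records: (sender, receiver), e.g. phone calls.
calls = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]

# Directed, valued ties: how many times each actor called each other actor.
directed_valued = Counter(calls)

# Undirected, valued ties: ignore who initiated the contact.
undirected_valued = Counter(frozenset(pair) for pair in calls)

# Undirected, binary ties: only the presence or absence of a tie.
undirected_binary = set(undirected_valued)

print(directed_valued[("A", "B")])                # calls from A to B
print(undirected_valued[frozenset(("A", "B"))])   # contacts between A and B
print(len(undirected_binary))                     # number of tied pairs
```

Each step discards information (first direction, then strength), so it is usually safer to record ties in the richest form the source supports and simplify later.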


Dynamics denotes the evolution of the network over time. Network dynamics can be seen as a spectrum, ranging from a static snapshot of a network at one point in time capturing no temporal dynamic, through multiple such consecutive snapshots, to the most fine-grained picture provided by the specific time at which every single tie was created. In practice, the most detailed information about when each tie was created is rarely available, and thus criminal network analysts usually work with less granular network dynamics distinguishing different periods of a given network’s evolution (Diviák and Lord, 2022). These periods may be separated either temporally by a unit of time (for example, year or month) or by a specific event (for example, before and after a certain contract is awarded).
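The two ways of separating periods can be sketched in a few lines of Python. The dated ties below are invented, and the function names are hypothetical helpers introduced only for this illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical dated ties: (actor, actor, date of interaction).
dated_ties = [
    ("A", "B", date(2011, 3, 1)),
    ("A", "C", date(2011, 9, 15)),
    ("B", "C", date(2012, 2, 20)),
    ("A", "B", date(2012, 6, 5)),
]

def snapshots_by_year(ties):
    """Group undirected ties into one binary network snapshot per year."""
    periods = defaultdict(set)
    for a, b, when in ties:
        periods[when.year].add(frozenset((a, b)))
    return dict(periods)

def snapshots_by_event(ties, event_date):
    """Split ties into those before the event and those on or after it."""
    before = {frozenset((a, b)) for a, b, when in ties if when < event_date}
    after = {frozenset((a, b)) for a, b, when in ties if when >= event_date}
    return before, after

yearly = snapshots_by_year(dated_ties)
# Assumed event for illustration, e.g. the date a contract was awarded.
before, after = snapshots_by_event(dated_ties, date(2012, 1, 1))
```

When the source reports only approximate timing, the same structure works with coarser period labels in place of exact dates.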


A network comprising only a single set of nodes that can all in principle be connected by a given type of tie can be referred to as a one-mode network. A two-mode (bipartite) network is a network with two distinct sets of nodes with ties permitted only across these types (modes), not within them. A typical example in social scientific context is a network of actors and groups or settings of which they are members or which they attend. The actors on the one hand and the groups on the other are the two modes with ties representing membership; it is meaningless to think of an actor who would be a member of another actor or a group that would be a member of another group, thus ties of membership within modes are not permitted.
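As a rough illustration of how a two-mode network relates to a one-mode network, the sketch below stores invented actor-to-group memberships and derives ties between actors who share at least one group. The actor and firm names are hypothetical, and projecting onto actors is only one of several ways to analyse two-mode data.

```python
from itertools import combinations

# Hypothetical two-mode data: each actor's memberships in groups.
membership = {
    "Actor 1": {"Firm X", "Firm Y"},
    "Actor 2": {"Firm X"},
    "Actor 3": {"Firm Y"},
    "Actor 4": {"Firm Z"},
}

def project_to_actors(membership):
    """One-mode projection: tie two actors if they share at least one group.

    The tie value is the number of groups the two actors have in common.
    """
    ties = {}
    for a, b in combinations(sorted(membership), 2):
        shared = membership[a] & membership[b]
        if shared:
            ties[(a, b)] = len(shared)
    return ties

actor_ties = project_to_actors(membership)
```

Here Actor 2 and Actor 3 remain unconnected in the projection even though both are linked to Actor 1 through different firms, which is the kind of indirect connection the two-mode representation keeps visible.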


By context, we mean the qualitative information pertaining to the given case that cannot be captured within the previous five aspects. The contextual information may be important for two reasons in criminological analysis. Firstly, it may aid in interpretation of results, such as when revealing potential data collection biases (Bright et al, 2021) or complementing findings from analyses by using the qualitative information to corroborate or refute quantitative SNA results. Secondly, the contextual information is vital for constructing network-level variables that refer to the network’s predispositions (such as its goals) or its outcomes (such as its success or failure).

In the following section, we turn our attention to three data sources which researchers may use to extract network data about white-collar and organizational crime. For each data source, we outline its main features together with what information it contains in terms of the six network data aspects already outlined.

Data sources

When it comes to researching ‘serious financial crimes’, how we conceptualize and operationalize our scope of inquiry is more than semantics, as the definitional parameters we use have implications for the types of relational data we need to access. There is no need here to revisit long-standing definitional debates, but, for indicative purposes, we are interested in this chapter in looking at available data sources that can help us to make sense of crimes undertaken by actors, individual and organizational, that seek to generate individual and/or organizational (financial) gain, notably corporate bribery, counterfeiting and money laundering. We focus also on the UK, where rich (relational) data on financial crimes are not routinely available or accessible, creating obstacles to analysing the social networks of these white-collar phenomena.

Official documentation from Deferred Prosecution Agreements

We start with a data source that has been available in the UK only since 2015 – the official documentation that accompanies Deferred Prosecution Agreements (DPA) negotiated with corporate entities for serious corruption. In brief, a DPA is a voluntary agreement between the public prosecutor, usually the Serious Fraud Office (SFO), and a corporation implicated in fraud or corruption. DPAs are not available to individual, human actors, and can currently be used only in relation to specific criminal offences. Once negotiated, they require judicial approval to make sure the agreement is in the interests of justice. To date, nine DPAs have been approved. On approval of the DPA, the following documentation must legally be made publicly available on the prosecutor’s website:

• the Deferred Prosecution Agreement (contains details on the terms and conditions of the agreement);

• the Statement of Facts (contains a detailed account of the alleged criminal behaviours, including what happened, who was involved and when, agreed upon by the prosecutor and the corporation);

• the Approved Judgment (contains the presiding judge’s overview of the case and sets out factors for and against approval alongside case specifics).

In some cases, other documentation may be provided, but the above three documents represent the core of what must be published, and all offer rich data for social scientific analysis. In terms of relational data for SNA, it is the Statement of Facts that provides relevant information. These documents provide written accounts that include background information on the corporation (for example, size and scope), the timeline of events (for example, from when the case came to the attention of the corporation and/or prosecutor, and when investigations commenced), and detailed descriptions of each criminal offence that occurred (for example, what happened, who was involved, when and how) that also draw upon raw materials from the corporation, such as extracts from communications between key players, or transactional data relating to the flows of resources. The following excerpt is taken from the Statement of Facts of the Airline Services Ltd [ASL] DPA.

23. In acting on behalf of ASL, ASL Agent 1 worked closely with ASL Senior Employee 3 and ASL Senior Employee 4 who were based in Germany. A large part of the business that ASL Agent 1 introduced to ASL was with Lufthansa.

24. At the same time as acting as an agent for ASL, ASL Agent 1 was also retained by Lufthansa as a consultant project manager in a department named Product Competence Centre Cabin Interior & In-flight Entertainment. Former Lufthansa Senior Employee 2 allocated work and gave instructions to ASL Agent 1.

As you can see in the extract, there is much of interest to inform questions about the relations between actors. Specific information that describes relations or interactions in the Statement of Facts among involved actors can be coded using content analysis. For instance, the Airline Services Ltd DPA concerns employees of the company who colluded and manipulated contracts to achieve personal financial gain. The sentence in paragraph 23 of the foregoing excerpt, ‘ASL Agent 1 worked closely with ASL Senior Employee 3 and ASL Senior Employee 4’, mentions cooperation between the three actors, which can be coded as two undirected binary ties: one tie connecting ASL Agent 1 and ASL Senior Employee 3 and another tie connecting ASL Agent 1 and ASL Senior Employee 4. The entire Statement of Facts can be coded this way, eventually yielding a network visualized as a sociogram, depicting nodes and ties among them (Figure 1). One might wonder whether subjective interpretation of the text might come into play during such coding. That is certainly possible in some cases, and in others the text may make it far from straightforward to distinguish the actors, their attributes or ties. For this reason, we recommend that at least two coders code the source independently, which enables verification of coding and assessment of reliability. An in-depth treatment of how to extract network data from text and assess coding reliability, using Statements of Facts as the example, can be found in Diviák and Lord (2023).
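The coding of paragraph 23 and a simple comparison of two coders can be sketched as follows. The first coder's ties are the two described above; the second coder's additional tie is hypothetical, and the Jaccard overlap used here is only one simple, assumed way to summarize agreement, not necessarily the measure used in Diviák and Lord (2023).

```python
def tie(a, b):
    """An undirected binary tie, stored so that (a, b) equals (b, a)."""
    return frozenset((a, b))

# Coder 1: the two ties described in the text for paragraph 23.
coder_1 = {
    tie("ASL Agent 1", "ASL Senior Employee 3"),
    tie("ASL Agent 1", "ASL Senior Employee 4"),
}

# Coder 2: a hypothetical second coding that also ties the two employees.
coder_2 = coder_1 | {tie("ASL Senior Employee 3", "ASL Senior Employee 4")}

def jaccard_agreement(ties_a, ties_b):
    """Share of ties coded by either coder that both coders recorded."""
    union = ties_a | ties_b
    return len(ties_a & ties_b) / len(union) if union else 1.0

agreement = jaccard_agreement(coder_1, coder_2)
disputed = coder_1 ^ coder_2  # ties recorded by only one coder

print(round(agreement, 2), len(disputed))
```

Listing the disputed ties, rather than reporting only a summary score, gives the coders a concrete set of passages to revisit and resolve.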

Figure 1: Sociogram of Airline Services Ltd network. Tie thickness corresponds to the number of interactions between the two actors.

Regarding the six aspects of criminal network data, DPA Statements of Facts provide a good picture of who was involved (actors), sometimes even allowing the use of different criteria for the definition of network boundaries. It is also always possible to establish ties, although their direction, strength or type is usually not reported consistently enough to reliably permit treating the network as directed or multiplex. Individual attributes and network dynamics are considerably inconsistent across the different DPAs and thus, in some cases, it is possible to use multiple fine-grained attributes or detailed timelines, whereas in other cases it may be impossible. Multiple different modes may be established as well, although this usually does not bring substantial improvement to the analysis as, for instance, the actors are usually not affiliated with multiple companies. Statements of Facts are quite rich in terms of all the contextual information which enables both qualitative complementing of the analysis and the inclusion of case-level variables.

Our previous research (Diviák and Lord, 2022) on three corruption networks based on DPAs suggests that while there is considerable variability in the number of participants and the temporal span of the networks, they display remarkable structural similarities – they are organized into a structure of densely interconnected core and loose periphery, ties tend to disproportionately accumulate in a few dyads and the activity of the actors escalates in correspondence to newly arising opportunities for corruption (that is, emergence of new contracts susceptible to manipulation).

Analogous documentation outside of DPAs may also be available, such as judicial sentencing remarks relating to convictions of fraudsters via the Crown Prosecution Service, but their availability is inconsistent. For instance, judges may not use pre-written scripts of their remarks, meaning that researchers would have to be present in court to hear the full details, or rely on piecemeal media reporting.

Enforcement investigation case files

The second data source we draw attention to is enforcement investigation case files. These are private and sensitive data sources, whether they relate to ‘live’ or to ‘closed’ cases, but in both instances they can be lawfully accessible through negotiation as part of data sharing agreements between research institutions and enforcement authorities and/or regulators. Note that different jurisdictions present different challenges when negotiating case file access. When accessed, case files can incorporate diverse data sources. For instance, Lord et al (2017) analysed the files of an ongoing case relating to counterfeiting being undertaken by a regulator in Europe. These case files included textual and visual data, such as:

• the written reports of investigators;

• the social media profiles of actors under suspicion;

• geo-spatial data on the locations of implicated businesses;

• logistics information on the movement of vehicles and goods, plus photographic intelligence;

• seizure data;

• information about arrests and the materials in possession of those arrested;

• business and financial relationships between those actors and companies implicated;

• (fabricated) invoices;

• personal information on family networks and histories.

Relational insights can be extracted from all these varied data sources and this of course is beneficial for SNA. The advantage of enforcement case files is the fact that they compile a multitude of different data sources which may be used to corroborate each other, thus increasing data validity. The disadvantage is that data reliability may be problematic, as the underlying sources may contain very different information not usually intended for scientific research, making them difficult to compare in turn. Additionally, ‘live’ cases may suffer from incomplete coverage and sometimes from unverified intelligence or evidence – some actors or ties may not be uncovered early on in the investigation, while some suspicions may turn out to be inaccurate. If such actors or ties are central to the network, their absence may seriously distort our picture of the analysed case. The incomplete coverage extends also to closed investigations in cases of particularly complicated heterogeneous sets of criminal activities. To access case file data, formal agreements are usually needed that ensure the data are securely and confidentially stored and that non-anonymized content is not shared beyond research teams, and certainly not publicly. It may be necessary to also agree on publication and presentation permissions so as to ensure no sensitive data are inappropriately shared.

In terms of the six aspects of network data, enforcement case files contain a wealth of information about individual actors. We recommend treating it carefully, because the boundaries of the network may be relatively unclear – some actors may appear in some of the sources, but the reason for their inclusion may vary from source to source, making interpretation difficult. This inconsistency also extends to ties, attributes and dynamics, because these three aspects are also frequently captured in very different ways from source to source. Where enforcement case files ‘shine’ is in the possibility to extract information about many different modes and incorporate it in the network. These modes may represent legal persons (firms), physical places or movable objects, and they may illuminate the indirect connections between actors by being associated with these entities. Lastly, enforcement case files enable triangulation of findings from different points of view based on the different sources a given case file contains.

Figure 2: Sociogram of a multimodal counterfeit alcohol distribution network (taken from Bellotti et al, 2022).

The network depicted in Figure 2 represents the aforementioned counterfeit alcohol distribution network. This network is multimodal – there are different types of nodes with only specific types of ties being possible between the modes. The modes in the data are actors, firms and locations. The affiliations of individuals to firms or their usage of certain products reveal indirect connections between the modes that would otherwise be hidden. Such a representation of the network in turn enables analysis of not only the structure of the network and positions of key actors within it but also the structure of the crime-commissioning process and how it is overseen by the key players (Bellotti et al, 2018).

Commercial transaction data

Gaining insights into the inner workings of businesses, whether for enforcement authorities or academic researchers, is highly challenging. Corporations are like ‘black boxes’, that is, enclosed systems of social relations that obscure internal workings (Whyte, 2022: 87), creating challenges to accessing the relational aspects of commercial interactions. One option is to explore data leaks from whistle-blowers and investigative journalists that illuminate the hidden financial arrangements and structures of the global financial system. For instance, in recent years we have seen the Panama and Paradise Papers, the Swiss and Bahamas Leaks, and in 2020 the so-called ‘FinCEN files’ leak. However, such leaks ‘are always highly selective, and while they may reveal some systemic processes, at most they open the box in only a small minority of cases’ (Whyte, 2022: 92). Also, such data are private and unlawfully leaked, which raises ethical questions for researchers as to whether such data ought to be analysed. Some leaked data may be accessible through legitimate means in some jurisdictions (for example, suspicious activity reports (SARs)) or with some companies (for example, commercial data), but access varies greatly, particularly given their sensitive nature.

Let us now look at one example of such data leaks to determine their use for SNA: the FinCEN files. All corporate financial crimes involve flows of illicit and/or criminal finances. As part of the global anti-money laundering regulatory framework, so-called ‘regulated entities’ such as banks and financial services providers, among many others, are required to undertake due diligence on their clients and monitor financial transactions that flow through their organization. Where such transactions are considered to be ‘suspicious’ by bank and organizational compliance teams, there is a legal requirement to report these suspicions to the Financial Intelligence Unit (FIU) in the jurisdiction where the organization is based. In the US, the Financial Crimes Enforcement Network (FinCEN) acts as the FIU. All regulated entities in the US must submit their SARs to FinCEN. In the UK, the FIU is located within the National Crime Agency.

SARs represent the concerns of compliance actors within the private sector whose subjective interpretation of objective financial transactions raises suspicions about their nature. A SAR will include details on the individuals and entities involved in the financial transaction(s), as well as the monies, currencies, accounts, jurisdictions and dates involved, alongside a narrative that outlines why the transactions have been interpreted as suspicious, covering their nature and circumstances. These reports include commercial and private transactional data that ought not to be disclosed publicly, but in 2020, data on SARs submitted to FinCEN were leaked to BuzzFeed News and subsequently to the International Consortium of Investigative Journalists (ICIJ). The ICIJ then published some transactional data from the accessed files to draw attention to the illicit and criminal behaviours enabled by global banks and the financial system.[i] Some data have been made available publicly for all to download in the form of a .csv file that includes data on financial transactions between originator and beneficiary banks, entities involved and locations, but raw documents or personal information were not released. The transactional and entity data from the FinCEN files illuminated the financial behaviours of individual and corporate actors from across the globe, including the UK.

The node set in the case of FinCEN files is constrained to banks, although it is not clear why the banks included in the data leak are there and others are not. In terms of ties, there is considerably granular information about the strength (volume of money) and direction, but a crucial thing to consider is that even though transactions flow from one account to another, the data are aggregated at the bank level, thus obscuring the level at which the actual decisions about where to send money from and to are made. While the FinCEN files contain no information about banks’ attributes, this information can be added from other sources because the banks’ identity is known. The temporal information is considerably granular: because suspicious transactions need to be identified precisely, the date on which each transaction occurred is recorded. This leaked file contains no other modes besides the banks, but the biggest apparent drawback is the lack of contextual information about how the data were compiled, how exactly they were generated, or how likely they are to be valid. This makes it very difficult to distinguish genuine findings from artefacts of the data production.
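A minimal sketch of how transaction records of this kind can be turned into directed, valued ties between banks is given below. The sample rows are invented and the column names are hypothetical placeholders; they are not the schema of the published ICIJ file and would need to be adapted to it.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; the column names are hypothetical, not the ICIJ schema.
sample = io.StringIO(
    "originator_bank,beneficiary_bank,amount_usd,date\n"
    "Bank A,Bank B,1000.0,2015-01-10\n"
    "Bank A,Bank B,2500.0,2015-02-03\n"
    "Bank B,Bank C,400.0,2015-02-04\n"
)

def bank_level_ties(rows):
    """Sum transaction amounts into directed, valued ties between banks."""
    ties = defaultdict(float)
    for row in rows:
        pair = (row["originator_bank"], row["beneficiary_bank"])
        ties[pair] += float(row["amount_usd"])
    return dict(ties)

ties = bank_level_ties(csv.DictReader(sample))
for (sender, receiver), volume in ties.items():
    print(sender, "->", receiver, volume)
```

Summing amounts discards the dates; keeping them alongside each tie would preserve the temporal detail discussed above.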

Figure 3: Main component of the network constructed from the FinCEN files leak. Size of nodes is proportionate to the natural logarithm of the volume of transactions they are involved in.

Figure 3 depicts the sociogram of the main component in the network extracted from the FinCEN files. The main component is the largest set of nodes where there is a path between each pair of nodes. Examining its structure reveals that there is considerable centralization of incoming transactions, which means that some banks receive disproportionately more transactions than others. Such a finding may be interpreted in several ways. It may be a sign that suspicious transactions tend to revolve heavily around a few banks that are particularly prone to them; it may be due to the focus of those who leaked or published the data, because they wanted to concentrate on these banks in newspaper articles; or it may simply be that the data on those highly central banks were easier to obtain. Without further contextual information about the process of creating this data, it is impossible to distinguish whether the result reflects the underlying reality or whether it is an unintentional by-product of data collection.
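The two notions used here, the main component and the centralization of incoming ties, can be sketched on a small invented network. Two assumptions are made for this illustration: paths are traced ignoring tie direction (a weak component), and centralization is computed with a Freeman-style normalization on the number of distinct senders. The chapter does not specify which variants were used for Figure 3.

```python
from collections import defaultdict, deque

# Hypothetical directed ties (sender bank, receiver bank).
edges = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "E"), ("F", "G")]

def main_component(edges):
    """Largest set of nodes connected by paths when direction is ignored."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, best = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            for other in neighbours[node] - component:
                component.add(other)
                queue.append(other)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

def indegree_centralization(edges, nodes):
    """Freeman-style in-degree centralization: 0 (even) to 1 (perfect star)."""
    n = len(nodes)
    if n < 2:
        return 0.0
    indegree = {node: 0 for node in nodes}
    for u, v in set(edges):  # count each distinct sender once
        if u in nodes and v in nodes:
            indegree[v] += 1
    top = max(indegree.values())
    return sum(top - d for d in indegree.values()) / ((n - 1) ** 2)

component = main_component(edges)
score = indegree_centralization(edges, component)
```

A high score indicates only that incoming ties concentrate on a few nodes; as noted above, it cannot by itself show whether that concentration is genuine or produced by how the data were compiled.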

Common issues of the data sources

Once network data have been collected, SNA provides researchers with a rich toolbox of measures and models. As we have already touched upon, however, data collection on criminal networks is complicated by their defining aspect: actors involved in these types of networks aim to avoid detection. The resulting complications affect all the data sources that we have just described in two major ways: either by omitting certain information or by over-representing some information.

The problem with omission of information is a variation on the problem of missing network data. This issue is serious enough for SNA in general, but it is exacerbated in criminal network analysis because the aim of actors to remain hidden increases the likelihood of incomplete data, as some actors may be sophisticated enough to avoid any detection, or some of their interactions may not be detected, due to more sophisticated methods of communication (Campana, 2016). This means that the incompleteness is likely not to be random but, rather, systematic (Diviák, 2019; Bright et al, 2021). Moreover, what is absent (that is, genuinely does not exist) and what is missing (that is, we are not aware of its existence) is usually impossible to distinguish in criminal network data, limiting the ability to use missing data imputation techniques. To this end, using data sources that combine different sources of information or triangulating different data sources helps in assuring as much coverage of a given case as possible (Bright et al, 2021). However, this means that ‘live’ cases (that is, cases not yet scrutinized by multiple parties in the criminal justice system) or cases where triangulation is not possible are especially prone to absent and missing information, and researchers should thus be extra careful in evaluating the validity of their data sources and the way they extract information from them to construct network datasets.

The over-representation of some aspects of the data reflects the approaches, including policing (and cultural) mindsets and practices towards the investigation of issues of interest, or the use of routine approaches that in turn can shape how the case is constructed. A typical instance of this over-representation is the spotlight effect (Smith and Papachristos, 2016). The spotlight effect refers to the situation in which some actors have high numbers of ties, not due to their genuinely high activity but because the data collection centres on them. For instance, the spotlight effect may arise in the case of enforcement data when law enforcement agents start an investigation by putting a set of actors under initial surveillance and all the subsequently investigated actors come into surveillance because of their contact with the initial set. Similarly, entire regions of the network may appear artificially more central or dense in the transaction data solely because there are particular jurisdictions where scrutiny and surveillance are more thorough. In such cases, the jurisdictions with less thorough surveillance may actually have denser networks, as this feature lends itself to being abused for money laundering and so on, yet the network representation may paint the opposite picture. These effects can be alleviated in statistical modelling by introducing appropriate control variables, but the information for constructing these control variables has to be present in the data source in the first place.


SNA has proven itself useful in criminology and criminal investigation in analysis of various organized criminal groups ranging from gangs and trafficking of illegal commodities to terrorist networks, but also in designing prevention and intervention measures (Morselli, 2009; 2014). Specifically in financial crime settings, SNA has been used to show the surprising resilience of corruption networks to the removal of central actors and the way that criminal opportunities create dynamics that allow these cases to be exposed (Diviák and Lord, 2022), or how the scripting process underlying counterfeit alcohol distribution intersects with the criminal network to highlight especially important actors whose removal may incapacitate the network (Bellotti et al, 2018).

Both the generation of scientific knowledge and its practical application rely fundamentally on good data – almost a trivial statement, yet perhaps the biggest obstacle in the study of criminal networks. Serious financial crimes are no exception, as those implicated in bribery, counterfeiting or money laundering strive to retain their secrecy or confidentiality. Their relations and interactions, much like their illegal trafficking or gang-related counterparts, are difficult to access. Unlike gang or trafficking networks, financial crime networks remain relatively understudied. We wanted to bring more attention to the study of these phenomena from the network perspective by introducing available data sources amenable to SNA. For instance, DPAs are increasingly available globally, with other major economic players now using similar legal tools, so we can expect to see more such data openly accessible. We also see increasingly productive academic–enforcement data-sharing agreements, such as in the Netherlands, where data on organized crime and money launderers are being shared for network analysis. Although none of the three data sources covered in this chapter is perfect, we hope that they have potential in constructing networks that will eventually help researchers in this domain to accumulate a vast body of knowledge similar to other areas of SNA application in criminology.


Bellotti, E., Spencer, J., Lord, N. and Benson, K. (2018) Counterfeit alcohol distribution: A criminological script network analysis. European Journal of Criminology, 147737081879487.

Bellotti, E., Lord, N., Elizondo, C., Melville, J. and Mckellar, S. (2022) ScriptNet: An integrated criminological-network analysis tool. Connections, 42.

Borgatti, S., Everett, M., Johnson, J. and Agneessens, F. (2022) Analyzing social networks using R (1st edn). SAGE Publications.

Bright, D., Brewer, R. and Morselli, C. (2021) Using social network analysis to study crime: Navigating the challenges of criminal justice records. Social Networks, 66, 50–64.

Campana, P. (2016) Explaining criminal networks: Strategies and potential pitfalls. Methodological Innovations, 9, 2059799115622748.

Campana, P. and Varese, F. (2012) Listening to the wire: Criteria and techniques for the quantitative analysis of phone intercepts. Trends in Organized Crime, 15(1), 13–30.

Diviák, T. (2019) Key aspects of covert networks data collection: Problems, challenges, and opportunities. Social Networks.

Diviák, T. and Lord, N. (2022) Tainted ties: The structure and dynamics of corruption networks extracted from deferred prosecution agreements. EPJ Data Science, 11(1), 7.

Diviák, T. and Lord, N. (2023) From text to ties: Extraction of corruption network data from deferred prosecution agreements. Data & Policy 5, e4.

Laumann, E., Marsden, P. and Prensky, D. (1983) The boundary specification problem in network analysis. Applied Network Analysis: A Methodological Introduction 61, 18–34.

Morselli, C. (2009) Inside criminal networks (Vol. 8). Springer New York.

Morselli, C. (2014) Crime and networks. Routledge.

Robins, G. (2015) Doing social network research. SAGE Publications.

Smith, C.M. and Papachristos, A.V. (2016) Trust thy crooked neighbor: Multiplexity in Chicago organized crime networks. American Sociological Review 81(4), 617–643.

Whyte, D. (2022) Follow the money: Inside the black box of the corporation. In: M. Mair, R. Meckin and M. Elliot (eds), Investigative methods: An NCRM innovation collection (pp 87–96). Southampton: National Centre for Research Methods.

[i] ICIJ investigation accessible here:
