What is PIRUS?
PIRUS is a deidentified cross-sectional, quantitative dataset of individuals in the United States who radicalized to the point of violent or non-violent ideologically motivated criminal activity, or ideologically motivated association with a foreign or domestic extremist organization from 1948 until 2013 (except for two cases from 2014). The PIRUS dataset was coded using entirely open-source material, including newspaper articles, websites (e.g., government, terrorist group, watchdog groups, research institutes, personal information finder sites), secondary datasets, peer-reviewed academic articles, journalistic accounts including books and documentaries, court records, police reports, witness transcribed interviews, psychological evaluations/reports, and information credited to the individual being researched (verified personal websites, autobiographies, social media accounts). PIRUS contains dozens of variables containing information on a wide range of characteristics, including the individuals’ criminal activity and/or violent plots, their relationship with their affiliated extremist group(s), adherence to ideological milieus, factors relevant to their radicalization process, demographics, background, and personal histories. The dataset is not limited to a single ideological category, and includes individuals representing far right, far left, Islamist, and single-issue ideologies.
How were the data in PIRUS collected and coded?
Data collection and coding for the dataset occurred in several stages. First, researchers used open sources and extant START research products to collect a list of names and preliminary background information on approximately 4,000 individuals from various ideological milieus and time frames for possible inclusion in the dataset. Second, researchers coded each of these observations to determine whether the individuals should be included in the dataset using the inclusion criteria detailed below. Third, researchers coded the relevant background, contextual, and ideological information for a random sample of individuals who were selected for inclusion in the dataset. Random sampling techniques were used to maximize (although not guarantee) the representativeness of the dataset at all points in time that are covered by the project (see question below regarding the representativeness of the dataset). The criteria coding and full coding stages occurred in multiple waves, thereby producing sub-sets of fully coded data that allowed for preliminary analysis in the initial phases of the project.
Why does the dataset end in 2013? When will it be updated to include the most current individuals?
The first stage of data collection (name identification, outlined above) occurred in the first six months of 2013, after which the research team moved on to the second and third stages: criteria coding and full coding of the individuals identified during the first stage. Because of the project timeline, as well as limited project resources, the PIRUS dataset does not yet include individuals from 2014 to the present. However, the research team plans to release subsequent updates to the PIRUS dataset that will include individuals whose radical activities occurred or became public knowledge in 2014 and later.
What are the criteria for an individual to be included in the PIRUS dataset?
In order to be eligible for inclusion, each individual must meet at least one of the following five criteria:
- Arrested/Charged: The individual was arrested for committing an ideologically-motivated crime. This includes arrests or their equivalents outside the United States.
- Indicted: The individual was indicted for an ideologically motivated crime. This includes indictments or their equivalents outside the United States.
- Killed in Action: The individual was killed as a result of his/her ideological activities. This includes being killed during the commission of an attack, including suicide, being killed during an attempted arrest/detaining by security forces, being targeted by security forces (even if not the primary target), and being killed in an unmanned aerial vehicle strike.
- Member of Designated Terrorist Organization (DTO): The individual is or was a member of a terrorist organization designated by the United States Department of State. Note: "Member" is defined broadly. This includes official members, individuals whom the US government or another government claimed were members of a DTO (even if the group itself did not acknowledge the membership), and individuals whom credible media sources link to the group (but not those linked on the basis of pure speculation). It also includes individuals who claim membership in a DTO even if the group itself did not acknowledge membership.
- Violent Extremist Group Association (VEGA): The individual is or was associated with an extremist organization whose leader(s) or founder(s) has/have been indicted for an ideologically motivated violent offense. Note: "Association" is defined broadly. This includes official membership, membership claimed by a government, and self-identified association (even if the group does not acknowledge it). It also includes active participation in group activities, such as protests and newsletter subscriptions. "Association" does not include less active participation in group activities, such as signing a petition or listening to a speaker from the group at a public event.
In addition, each individual must:
- have radicalized in the United States,
- have espoused or currently espouse ideological motives, and
- show evidence that his or her behaviors are/were linked to the ideological motives he or she espoused/espouses.
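The rule above (at least one of the five criteria, plus all three additional requirements) can be sketched as a simple boolean check. This is only an illustration of the logic: the class and field names below are hypothetical and are not PIRUS variable names, and the actual inclusion decisions are made by project researchers, not by code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical flags, one per inclusion criterion (not PIRUS variable names).
    arrested: bool = False
    indicted: bool = False
    killed_in_action: bool = False
    dto_member: bool = False
    vega_associated: bool = False
    # Hypothetical flags for the three additional requirements.
    radicalized_in_us: bool = False
    espoused_ideology: bool = False
    behavior_linked_to_ideology: bool = False


def is_eligible(c: Candidate) -> bool:
    """Return True if at least one criterion and all three requirements hold."""
    meets_any_criterion = any([
        c.arrested,
        c.indicted,
        c.killed_in_action,
        c.dto_member,
        c.vega_associated,
    ])
    meets_all_requirements = all([
        c.radicalized_in_us,
        c.espoused_ideology,
        c.behavior_linked_to_ideology,
    ])
    return meets_any_criterion and meets_all_requirements


# An indicted individual who satisfies all three additional requirements.
included = Candidate(indicted=True, radicalized_in_us=True,
                     espoused_ideology=True, behavior_linked_to_ideology=True)
# A group member who did not radicalize in the United States.
excluded = Candidate(dto_member=True, espoused_ideology=True,
                     behavior_linked_to_ideology=True)
```

Here `is_eligible(included)` evaluates to `True` and `is_eligible(excluded)` to `False`, because the second candidate fails the requirement of having radicalized in the United States.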
Who makes the decision about whether or not an individual fulfills the inclusion criteria?
The decision of whether an individual meets all inclusion requirements is made systematically by full-time project researchers, or by trained research assistants whose evaluations are reviewed by full-time project researchers.
How does PIRUS define radicalization?
We define radicalization as the psychological, emotional, and behavioral processes by which an individual adopts an ideology that promotes the use of violence for the attainment of political, economic, religious, or social goals. Indicators of radicalization within the scope of the PIRUS dataset consist of arrests, indictments, and/or convictions for engaging in, or planning to engage in, ideologically motivated unlawful behavior, or membership in a designated terrorist organization or a violent extremist group. Radicalization does not necessarily involve violence. For example, under the foregoing criteria, an individual who runs a website for a violent extremist group would meet the criteria for inclusion in the database.
How does PIRUS define Islamist, Far Right, Far Left, and Single Issue ideologies?
Islamist - We recognize that the terms “Islamist”, “jihadism”, and “jihadist” are applied inconsistently in both academic and policy circles, and can imply a wide range of meanings based on the context in which they are used. For this project, we use the broad term “Islamist” in reference to the religio-political methodology practiced by Sunni Islamist-Salafists who seek the immediate overthrow of incumbent regimes, and the non-Muslim geopolitical forces which support them, in order to pave the way for an Islamist society which would be developed through martial power. Although there are a number of Islamist-Salafist thinkers who do not advocate for violent military strategies to achieve their goals (e.g., Muhammad Nasiruddin al-Albani), in the U.S. context, the individuals we classify as “Islamists” are most commonly connected to, or inspired by, violent Islamist-Salafist groups that have their roots in the onset of “global jihadism” of the 1980s, including al-Qaeda and its affiliated movements. There are a number of ideological tenets commonly elaborated by Islamist-Salafist groups, including the imposition of shari’a with violent jihad as a central component, the creation of an expansionist Islamic state, or khalifa, and the use of local, national, and international grievances affecting Muslims, which are aired in an overtly religious context.
Far right - There exists a broad range of far right beliefs and actors (often overlapping movements), including both reactionary and revolutionary justifications of violence. In its modern manifestation in the United States, the ideology of the far right is generally exclusivist and favors social hierarchy, seeking an idealized future favoring a particular group, whether this group identity is racial, pseudo-national (e.g., the Texas Republic) or characterized by individualistic traits (e.g., survivalists). The extremist far right commonly shows antipathy to the political left and the federal government. As a result of this heterodoxy, this category includes radical individuals linked to extremist religious groups (e.g., Identity Christians), non-religious racial supremacists (e.g., Creativity Movement, National Alliance), tax protesters, sovereign citizens, militias, and militant gun rights advocates.
Far left - The far left in the United States is essentially class-oriented and consists primarily of individuals and groups that adhere to belief systems based on egalitarianism and the mobilization of disenfranchised segments of the population. With roots in the leftist student movement and radical prison reform movement of the late 1960s, traditional far left extremists generally sought the overthrow of the capitalist system, including the United States government, in order to replace it with a new, anti-imperialist economic order that empowers members of the “working class”. The traditional left included groups that maintained a distinct racial identity (e.g., Black Panther Party), which were motivated by a mix of economic grievances and race-based issues. Today, the far left is more commonly identified by followers of animal-rights and environmental protection issues. While not all animal rights or environmental groups are inherently leftist in orientation (for instance, there are Green Fascists), the vast majority of these individuals and groups identify with leftist political positions and have thus been included in the far left category for the purposes of this project.
Single issue - Single issue extremists are individuals who are motivated primarily by a single issue, rather than a broad ideology. Examples in the PIRUS data of single issue extremists are individuals associated with the Puerto Rican independence movement, anti-abortion extremists that were not motivated by traditional far right issues (anti-government, race superiority, etc.), members of the Jewish Defense League, and extremists with idiosyncratic ideologies (e.g., Ted Kaczynski).
Does PIRUS include individuals involved in incidents of hate crime?
Yes. The PIRUS dataset includes individuals who would commonly be considered perpetrators of hate crime – that is, relatively spontaneous violent or threatening acts directed toward another individual on the basis of their gender identity, ethnicity, religious affiliation, or sexual preference.
Can PIRUS help users “predict” who will radicalize and who will not?
No. The PIRUS data are limited to only those individuals who displayed positive signs of radicalization (i.e., they adopted extreme ideologies or engaged in extremist ideologically driven behaviors). Without a control group consisting of comparable non-extremists, the data cannot be used to “predict” radicalization in terms of a cognitive process. Researchers should also avoid using the data to devise checklists of demographic, cognitive, or personal traits that may indicate that someone is radicalizing or has the potential to radicalize. Most of the indicators that are associated with radicalization, including personal trauma, loss of significance, marginalization, etc., are also found at significant rates among individuals who do not harbor or adopt extreme beliefs or engage in extremist behaviors.
Why are the names of individuals in the PIRUS dataset anonymized?
While every individual contained in the PIRUS data was identified and coded using only public sources, START researchers have chosen to keep the names of the individuals in the dataset anonymous, and instead utilize a 4-digit identifier for each observation in the data. These privacy assurance procedures were followed in order to comply with Inter-university Consortium for Political and Social Research (ICPSR) guidance on de-individualizing submitted data. In addition, START researchers recognize the importance of protecting the identity of the individuals in the dataset, some of whom are no longer involved with the criminal justice system, and/or have desisted, disengaged, or deradicalized since their first publicly known ideological behavior.
Is the PIRUS dataset a comprehensive sample of radicalization in the United States?
The PIRUS database is not, and should not be treated as, a comprehensive set of all individuals who have radicalized in the United States. Achieving a comprehensive dataset of all individuals who meet the database’s inclusion criteria remains implausible for several reasons. Such a hypothetical database would encompass an unwieldy population of interest, face an extreme shortage of similar, reliable sources of data from which to draw, and require a massive investment in resources. Users should be careful to note that the PIRUS data are not comprehensive when looking at aggregate rates on variables of interest.
Is the PIRUS dataset a representative sample of individuals radicalized in the United States?
Every effort was made to maximize the representativeness of the data using random sampling techniques. However, for reasons outside of our control, the data may not be representative of radicalization in the United States at all points in time. First, given our reliance on open sources, the sample likely reflects news reporting trends over time. That is, as reporters shift their primary focus from one ideology or movement to another, it becomes easier to identify individuals who are associated with the groups that are under intense media scrutiny, and harder to identify those who are not. For example, the post-9/11 period in the PIRUS data likely over-represents Islamist extremists compared to individuals affiliated with other extremist ideologies. Second, despite exhaustive searches, limited access to digital historical sources from the period beginning in the 1940s and ending in the 1980s makes it difficult to properly represent this era in the data. Therefore, the database very likely includes a disproportionate number of more recent cases, which, if not corrected for, can bias the results of longitudinal trend analysis. Considering this, researchers should take caution when performing trend analysis with the PIRUS data. In particular, researchers should avoid analyses that compare aggregate numbers of cases over time. In addition, controls for exposure date should be included in all statistical analyses to help account for the effects of reporting trends.
How much missing data does PIRUS have? How should users handle the problems associated with missing data?
Missing data is a challenge that all researchers confront, but it is particularly significant for the PIRUS database given the nature of the data that were collected and the methods that were used to collect them. A number of variables in the PIRUS database, particularly those representing private and sensitive information, such as mental health history and childhood family dynamics, were especially challenging to obtain using publicly available sources. The amount of missing data was also likely increased by our coding guidelines, which, for most variables, instructed researchers to be conservative and record values as missing rather than absent (i.e., as a missing code of “-99” instead of a value representing “No”) whenever sources failed to report a value.
There are several options for dealing with missing data. Some rely on researchers’ substantive knowledge and case expertise to fill in missing values, while others employ advanced statistical techniques. Although no method provides a perfect solution, advances in techniques for handling missing data have made it possible to make valid inferences about causes and effects despite missing values on variables of interest. Several methods for handling missing data are detailed in Jensen et al., Empirical Assessment of Domestic Radicalization. In particular, we identified four missing data techniques that are sensible options given the structure of the PIRUS data and our substantive knowledge of the cases and radicalization processes. These are: simple imputation using fixed values (i.e., cold-deck imputation), simple imputation using sub-group means, regression-based multiple imputation, and multiple imputation based on expectation-maximization calculations. We recommend that users of the data familiarize themselves with these, and other, approaches to handling missing data prior to performing statistical analyses using the PIRUS data.
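As a minimal sketch of one of the simpler options listed above, imputation using sub-group means, the following Python fragment replaces values coded -99 with the mean of the observed values in the same sub-group. The record layout, the field names `ideology` and `age`, and the sample rows are all hypothetical and are not taken from the PIRUS codebook. This is an illustration of the general technique, not the procedure used by the PIRUS team.

```python
from collections import defaultdict
from statistics import mean

MISSING = -99  # PIRUS code for "unknown"


def impute_subgroup_mean(rows, group_key, value_key):
    """Return copies of rows with MISSING values replaced by the sub-group mean."""
    # Collect the observed (non-missing) values for each sub-group.
    observed = defaultdict(list)
    for row in rows:
        if row[value_key] != MISSING:
            observed[row[group_key]].append(row[value_key])
    group_means = {group: mean(vals) for group, vals in observed.items()}

    imputed = []
    for row in rows:
        new_row = dict(row)  # leave the input rows unmodified
        # A sub-group with no observed values has no mean, so its rows stay MISSING.
        if row[value_key] == MISSING and row[group_key] in group_means:
            new_row[value_key] = group_means[row[group_key]]
        imputed.append(new_row)
    return imputed


# Hypothetical records: two sub-groups, each with one unknown value.
rows = [
    {"ideology": "A", "age": 20},
    {"ideology": "A", "age": 30},
    {"ideology": "A", "age": MISSING},
    {"ideology": "B", "age": 40},
    {"ideology": "B", "age": MISSING},
]
result = impute_subgroup_mean(rows, "ideology", "age")
```

Single imputation of this kind treats the filled-in values as if they had been observed, so it tends to understate uncertainty; the multiple imputation approaches named above are designed to address that limitation.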
What do the values -99 and -88 represent in the PIRUS data?
A value of -99 in the PIRUS dataset indicates that researchers were unable to find information in the public sources for the individual and variable under review (i.e., value is unknown). Alternatively, a value of -88 indicates that for a specific observation, the value is not applicable. For example, if an individual is not a known member of an extremist group, variables related to the individual’s role in an extremist group would be coded as -88. Users wishing to sum or average values in the data should be aware of these coding conventions, and take appropriate steps to recode or remove observations with these values.
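The following sketch illustrates why these codes need to be handled before summing or averaging. It uses only the Python standard library and a made-up list of values for a hypothetical numeric variable; the helper names `recode` and `valid_mean` are illustrative and are not part of any PIRUS tooling.

```python
from statistics import mean

MISSING = -99         # unknown: no information found in public sources
NOT_APPLICABLE = -88  # the variable does not apply to this individual


def recode(value):
    """Return None for either PIRUS special code, otherwise the value itself."""
    return None if value in (MISSING, NOT_APPLICABLE) else value


def valid_mean(values):
    """Average only substantive values; return None if none remain."""
    kept = [v for v in values if recode(v) is not None]
    return mean(kept) if kept else None


# Hypothetical values for one numeric variable, with special codes mixed in.
values = [23, -99, 31, -88, 40, -99]

naive = mean(values)        # treats -99 and -88 as real numbers
clean = valid_mean(values)  # averages 23, 31, and 40 only
```

With these made-up values, the naive average is -32, while the average of the three substantive values is roughly 31.3. Whether -88 observations should be dropped or recoded to a substantive value depends on the variable and the research question.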
Is PIRUS the definitive source on radicalization in the United States?
No. The PIRUS data are useful for understanding the processes of radicalization, as this is the largest dataset of its kind in the United States, but they should not be treated as the only source of such information. When making inferences about radicalization, users should look for commonalities across multiple data sources.
Are there things that I should avoid doing when analyzing the PIRUS data?
Yes. There are several questions of interest that the PIRUS data are not designed to answer. First, as noted above, the PIRUS data may not be representative of radicalization in the United States for all periods and, thus, users should be cautious when using the data to assess aggregate trends over time. Second, the PIRUS data cannot be used to predict who will radicalize. The data include only individuals who have been publicly exposed as extremists. Without a control group of non-radicals, the data cannot be used to identify “predictors” or “indicators” of radicalization. Third, users should avoid drawing conclusions from the rates at which individual variables are present or absent in the PIRUS data without also assessing how often those variables are present or absent in the general population. For example, users who are interested in the immigration rates of the individuals in the PIRUS data should not draw conclusions about immigration and radicalization without first considering general rates of immigration in the United States.
What are some useful ways in which I can analyze the data?
The PIRUS data can be used to explore a number of important aspects of radicalization and extremism in the United States. These include comparisons of ideological and sub-ideological groups, group-based and lone actors, and violent and non-violent extremists. The data can also be filtered by exposure date, age, gender, location, ideology, group, and more to address specific aspects of radicalization in the United States. Users should be aware, however, that filtering the data reduces the number of valid cases for exploration and may render statistical tests ineffective.
How can I get access to the dataset?
The PIRUS data can be explored using the Keshif tool available here.
Users can also download the full dataset by filling out the download form found here.
Are the PIRUS data available in the Terrorism and Extremist Violence in the United States (TEVUS) portal?
Currently, the PIRUS data are not available in TEVUS.
What is Keshif and how do I use it?
Keshif is a web-based tool that allows users to easily and effectively identify relationships and trends in data. For more information on Keshif, including user-guides and instructional videos, visit: keshif.me.
Do I need to cite PIRUS when using the data in a publication?
Yes. Please use the following citation if using the PIRUS data in your own research/publication: National Consortium for the Study of Terrorism and Responses to Terrorism (START). (2017). Profiles of Individual Radicalization in the United States [Data file]. Retrieved from (http://www.start.umd.edu/pirus)
Who funded the data collection for PIRUS?
The bulk of data collection for PIRUS was supported by the National Institute of Justice, Office of Justice Programs, Department of Justice, through Award Number 2012-ZA-BX-0005. In addition, an effort to review and update information in the PIRUS dataset has been supported with funding from the Department of Homeland Security through the Center for the Study of Terrorism and Behavior (CSTAB) Partner grant. The PIRUS dataset and any findings derived from the dataset do not represent the official positions of the National Institute of Justice, the Department of Justice, the Department of Homeland Security, or any other funding agency.
Who can I contact if I have questions about the data?
Inquiries about the PIRUS data, or START’s radicalization research in general, can be sent to: email@example.com