Statistics are Misleading 100% of the Time (Part 3)

Welcome to Part Three of the Puppycide Database Project's ongoing investigation series, "Statistics are Misleading 100% of the Time". In Part One of this series, we discussed how important statistics are to our understanding of police use of force. We also reviewed how this series of articles began as an attempt to contact sources that our researchers believed had conducted intensive research into dogs shot by police. In "Statistics are Misleading 100% of the Time Part Two", we tracked down the source of the statistic "A dog is shot by police every 98 minutes" to a fundraising page for a documentary film. We attempted to uncover the research behind this statistic, and compared the "98 minutes" claim with our own research here at the Puppycide Database Project.

In today's article, Part Three, we will examine another statistic about police shootings of dogs that has been even more influential than "A dog is shot by police every 98 minutes". Unlike "98 minutes", the statistic we will discuss today was in fact based on original research of use-of-force records from numerous police departments. To determine whether this new statistic is accurate, we will need to learn basic information about key concepts used by researchers, such as primary and secondary sources, probability and bias (we promise - there will be absolutely no math!). Perhaps most interesting, we will see first-hand how a single sentence in an article long since out of print has become the basis for the policy of one of the largest animal welfare organizations in the world and headline news across the country, without being read or cited.

"Round numbers are always false." - Samuel Johnson

Although it was created elsewhere, the statistic we will investigate today escaped the attention of the public until a policy paper was published by the Department of Justice. The publication was an overnight success. Almost immediately, the paper became a critical part of the literature related to police violence toward animals: The Problem of Dog-Related Incidents and Encounters (PDF).

The authors of Problem were then - and remain today - highly respected and frequently cited in their field. They included:

  • Dr. Patricia Rushing from the University of Illinois, Director of the Illinois COPS Regional Community Policing Institute as well as UI's Center for Public Safety and Justice.

  • Karen Delise, founder and Director of the National Canine Research Council (NCRC), whose work on dog breed identification and canine-related fatality trends has been repeatedly cited and applauded by the Puppycide Database Project

  • Donald Cleary, also of NCRC

  • Ledy VanKavage of both Pit Bull Terrier Initiatives and the Best Friends Animal Society

  • and Dr. Cynthia Bathurst, who worked for over 26 years in the field of mathematics research with Horrigan Analytics and currently serves as the Director of Safe Humane Chicago, Project Safe Humane and the Best Friends Animal Society

For those who aren't familiar with the rather insular world of canine behavior research, this was very much a group of rock stars. Their work is regularly excellent, and in cases where Puppycide Database Project has not directly cited work by these authors, the odds are good that our researchers have at least read it.

The Problem of Dog-Related Incidents and Encounters was published on behalf of the Department of Justice's Office of Community Oriented Policing Services (COPS), in itself somewhat of a coup. The Department of Justice does not, as a general rule, lend itself as a sounding board for animal welfare activists. This is the same DOJ that aggressively prosecutes "Ecoterrorism" cases. A great deal of the attention that Problem received was due to its source of publication. Rarely has any law enforcement agency published a comprehensive policy on dealing with canines. The idea of a law enforcement agency inviting animal welfare advocates to prepare that policy for them remains completely unheard of.

[Image: cover of The Problem of Dog-Related Incidents and Encounters, published by the DOJ COPS Office]

We should also be clear that The Problem of Dog-Related Incidents and Encounters was and is an excellent example of a policy paper. Its recommendations were clear and well argued. Problem drew attention to the potential civil liability that police departments faced from avoidable shootings and provided examples that officers could use to interpret canine behavior and respond accordingly without the need for violence. Why not save your department real dollars by avoiding settlements and legal bills, not to mention a PR nightmare?

A lot of good came from the paper's publication in 2011. In the years that followed, law enforcement administrators frequently cited The Problem of Dog-Related Incidents and Encounters as an inspiration as police departments began to introduce training for canine encounters. The publication received a great deal of attention in newspapers across the country, and rightfully so. Here was a reliable source of information about the increasingly controversial topic of police violence toward animals, a topic that had until then escaped serious inquiry from academics or government.

But that said, we are here to talk about statistics.

Unfortunately, some of the data contained in the Problem report was inaccurate: a mistake caused by relying on unreliable citations and by attempting to make very large claims from a very small amount of information. Worse, the data included in Problem inspired a game of Telephone. The statistics included were mis-read, republished, mis-read and republished again, until the original statements took on a life of their own. Specific law enforcement agencies had become all law enforcement agencies. Three cities became the entire United States. The policy paper became a study. And out of this process was born a single statistic: Half of the intentional shootings by police involve dogs.

"One feather is of no use to me, I must have the whole bird."Grimm Fairy Tales

The claim that 'half of police shootings involve dogs' is even more incredible than the 'police shoot a dog every 98 minutes' fallacy we examined in Part 2 of our series. With '98 minutes', all that is taken for granted is knowledge of the annual rate of dogs killed by police. The 'half of police shootings' statistic requires an understanding of not only how many dogs are shot by police, but also the number of all intentional shootings by police - whether the target is another type of animal or even a human being.

[Image: Seattle Times headline claiming half of all police shootings involve dogs]

Despite recent advances from independent research groups like Fatal Encounters, the public's understanding of the rates at which police shoot human beings remains incomplete. When the 'half of all police shootings involve dogs' statistic first began to receive serious attention sometime between late 2011 and 2012, there was simply no reliable statistical information available to the public about the national rates of police shootings.

Although the FBI's national records on police homicides were available, that data has long since been discredited. Police departments are not required to submit information to the FBI about how many people are killed by officers, so most do not. Among the small number of police departments that do provide data, there have been reported instances where records appear to be missing or incomplete. Although the problems with the FBI numbers became a news story a few years ago, those problems are not new and have always been obvious to anyone attempting an objective review of the data.

As for shootings of dogs, Problem was published some two and a half years before the Puppycide Database Project was founded. When we first encountered references to the 'study' that confirmed the 'half of all shootings' statistic, we assumed that someone was researching the phenomenon of dog shootings by police by actively reviewing records of shootings - either another research group or an organization within the government. Although we knew of several groups on Facebook who collected news stories of dog shootings, we had yet to encounter any other organizations that were attempting to analyze the results of a large sample of shootings. Did the authors of Problem conduct this type of research, or possibly cite another group that had? What were the exact claims made by Problem and what were the sources for those claims?

"In most police departments, the majority of shooting incidents involve animals, most frequently dogs. For example, nearly three-fourths of the shooting incidents in Milwaukee from January 2000–September 2002 involved shots fired at dogs, with 44 dogs killed by officers during that period. Information furnished by various California law enforcement agencies indicated that at least one-half of all intentional discharges of a firearm by an officer from 2000–2005 involved animals." - The Problem of Dog-Related Incidents and Encounters (Emphasis added)

This is the paragraph that gave rise to the "half of all police shootings involve dogs" statistic. It is published on page 10 of The Problem of Dog-Related Incidents and Encounters. The paragraph begins with the statement: "In most police departments, the majority of shooting incidents involve animals, most frequently dogs". Two statistics are then used as evidence for that statement. But is the conclusion in the first sentence proven by the examples that follow it?

Let's assume for a moment that both examples are completely correct and are correctly cited from reliable sources (none of which is true, but we will get to that in a moment). Do the statements prove the conclusion that "In most police departments the majority of shooting incidents involve animals"?

The first statement provides us with a claim about Milwaukee Police Department use-of-force statistics that were, taken at face value, nine years old at the time of publication. The second statement provides a claim about an indeterminate number of police departments from the state of California ("Information furnished by various California law enforcement agencies"). The phrasing of this second statement gives the impression that data from multiple law enforcement agencies in California are being referenced, but the number of agencies involved was somewhere short of all of California's police departments. Again, let's give the benefit of the doubt and assume that the report references nearly all of California's police organizations: say 99%. The California numbers, according to Problem, cover the period between 2000 and 2005, ending six years prior to the publication of Problem.

By using out-of-date information from a single police department in Milwaukee and slightly more recent data from nearly all of California's police departments, can we make reliable claims about current police use of force across the entire country?

Puppycide Database Project argues the answer to that question is clearly No.

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein

In order to create reliable statistics about a large group, or population, a researcher must have data for the entire population or, more commonly, a representative sample of that population. The group that is selected to represent the population as a whole is sometimes referred to as a sampling frame. There are a variety of methods to determine who is included in the sampling frame. One of the most important factors in determining who or what is included in the frame is the reduction of what is referred to as bias or sample selection bias. The dictionary has a decent definition for this kind of bias:

"a statistical sample of a population (or non-human factors) in which all participants are not equally balanced or objectively represented"

The force that researchers use to combat bias is randomness. By introducing a random element to the selection process, researchers fight the introduction of bias into samples - whether it is introduced deliberately or even subconsciously. Some experiments require elaborate controls to prevent the introduction of bias: readers may be familiar with the term "double-blind study" from medical research. Researchers and study participants are both prevented from knowing key information about the elements of a study (or kept "blind") to prevent the introduction of bias. The use of a double-blind procedure is often referred to as the "gold standard" of medical research.
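
For readers who like to see ideas in action, here is a minimal sketch in Python of how randomness, rather than human judgment, can assign study subjects to groups. The subjects, the seed and the group labels are all invented for illustration:

```python
import random

# Ten hypothetical subjects, invented purely for illustration.
subjects = [f"subject-{i}" for i in range(1, 11)]

random.seed(7)            # fixed seed so the sketch is reproducible
random.shuffle(subjects)  # the shuffle, not the researcher, decides

# First half to treatment, second half to control. In a double-blind
# study, the table linking subjects to groups is withheld from both
# researchers and participants until the study is complete.
assignment = {s: "treatment" for s in subjects[:5]}
assignment.update({s: "control" for s in subjects[5:]})

for subject, group in sorted(assignment.items()):
    print(subject, "->", group)
```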

[Image: illustration of a double-blind study]

Sometimes, it is not possible to select a sufficiently random, and thus representative, sample. While not ideal, researchers can use methods to account for problems with sampling. In fact, the Puppycide Database Project must account for this issue in our own research: because many of our records are obtained from media sources, our database is biased toward the kinds of incidents of police violence toward animals that news editors and journalists prefer to write about. To be clear, an element of bias does not make findings irrelevant, but it must be accounted for when the results of a study are published.
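
To make "accounting for bias" concrete, here is a minimal sketch of one common correction technique, inverse-probability weighting. Every number below is invented for illustration - these are not Puppycide Database Project figures, and our actual adjustment methods may differ:

```python
# Suppose we somehow learn that news outlets cover 90% of incidents
# of type A but only 30% of incidents of type B. (Hypothetical rates,
# invented for this sketch.)
coverage = {"A": 0.9, "B": 0.3}

# A hypothetical media-derived sample, biased toward type A.
sample_counts = {"A": 180, "B": 60}

# Weighting each record by 1 / P(it was covered) estimates the
# true number of incidents of each type.
estimated = {kind: count / coverage[kind]
             for kind, count in sample_counts.items()}

print(estimated)  # {'A': 200.0, 'B': 200.0}: equally common after all
```

Once the bias has been measured, it can be undone; the hard part is measuring it.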

Why do all of the participants have to be "objectively represented"? When we use a sample, we are attempting to use a small number of individuals to represent a much larger number of individuals. Let's say I have 50 checkers - 40 checkers are black and 10 checkers are red. If I were to shake up the checkers in a bag and select 5 of them at random, the odds are fairly good that my sample would represent all of my checkers - I would most likely select 4 black checkers and 1 red checker, or a combination close to it, like 3 black and 2 red. Acting randomly, the odds are quite long that I would select 5 red checkers from my bag of 50; it would probably take all day of shaking up the bag and drawing again to happen upon 5 red ones. But if I wanted to, I could create a non-representative sample by pouring all of the checkers out on the table and deliberately choosing the ones I want, quickly and easily picking 5 red checkers - removing the randomness and introducing my own bias into the sample.
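
The checkers example is easy to verify with a short simulation. Here is a minimal sketch in Python; the bag, the sample size and the exact-odds formula follow directly from the example above:

```python
import random
from math import comb

# The bag from the example: 40 black checkers and 10 red.
bag = ["black"] * 40 + ["red"] * 10

random.seed(42)  # fixed seed so the sketch is reproducible
trials = 100_000

# Draw 100,000 random samples of 5 and count the red checkers in each.
red_counts = [random.sample(bag, 5).count("red") for _ in range(trials)]

print(f"Average red checkers per draw: {sum(red_counts) / trials:.2f}")  # ~1.0
print(f"Draws that came up all red: {red_counts.count(5)} of {trials:,}")

# The exact odds of drawing 5 red checkers at random:
p = comb(10, 5) / comb(50, 5)
print(f"Exact P(5 red) = {p:.6f}, about 1 in {round(1 / p):,}")
```

Run it and the all-red draw turns up only about a dozen times in 100,000 tries - roughly 1 in 8,400 - which is why deliberately picking the checkers is the only quick way to get that biased sample.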

[Image: black and red checkers]

Why does it matter who is selected as part of a sample? Because sampling relies on the mathematical concept of probability to reach valid results. Let's consider one more example. When rolling a pair of dice, we can say with reliability that there is a 2.778% likelihood that we will roll double sixes. But if we were to tamper with one of the dice by adding a weight to one side, we could no longer be certain that our claim would be reliable. Bias acts on a sample the same way that a weight acts on the dice. If we were using a pair of weighted dice under the impression that they are normal, we would have no idea what the odds are of rolling any given number - at least until we figure out that they have been tampered with.

Probability is so important in every field of study because it allows us to make very accurate predictions about the future. When we know the probability behind dice, we can predict how often each number will come up, and this predictive power grows the more times we throw the dice. If the dice are loaded - or if bias is introduced into our statistics - our ability to see the future is lost unless we know exactly how the dice are loaded, or exactly how the sample is biased. Understanding the bias completely allows us to regain our ability to use probability to see the future! This is why cheaters who use loaded dice have such a huge advantage, and also why researchers must address bias in their own statistical samples as part of their research.
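
The dice arithmetic is just as easy to check. Below is a minimal sketch in Python; the exact weighting of the loaded die (a six turning up three times as often as any other face) is invented purely for illustration:

```python
import random
from fractions import Fraction

# Two fair dice: P(double six) = 1/6 * 1/6 = 1/36, about 2.778%.
p_fair = Fraction(1, 6) * Fraction(1, 6)
print(f"Fair dice: P(double six) = {p_fair} = {float(p_fair):.3%}")

# Now tamper with one die. With the invented weights below, the
# loaded die shows a six 3 times as often as any other face, so its
# P(six) is 3/8 instead of 1/6.
faces = [1, 2, 3, 4, 5, 6]
loaded = [1, 1, 1, 1, 1, 3]

random.seed(0)  # fixed seed so the sketch is reproducible
trials = 500_000
hits = sum(
    random.choice(faces) == 6                          # the fair die
    and random.choices(faces, weights=loaded)[0] == 6  # the loaded die
    for _ in range(trials)
)
print(f"One loaded die: observed P(double six) = {hits / trials:.3%}")
# Roughly 6.25% - more than double the fair rate. Until we know how
# the die is weighted, we cannot say what to expect; once we know the
# weighting exactly, probability works for us again.
```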

The examples used by the authors of The Problem of Dog-Related Incidents and Encounters to prove that "In most police departments, the majority of shooting incidents involve animals" do not qualify as a representative sample. They fail to be representative because there are too few examples, and because those examples are dominated by law enforcement agencies from a single state, California. The laws governing police in the other 49 states may be similar to California's in some ways, but in other ways they are completely different. Inside each state, municipalities and counties are themselves governed by unique laws and regulations. There are significant differences between, say, the parishes of Louisiana and the counties of Florida.

Even if the laws, courts and the administration of the police were identical in all 50 states, the people would still be different, and the dogs would be different as well! We don't yet fully understand the relationship between population density, race or dog breed and the rates at which police shoot dogs - but we do have a pretty good idea that all of these things influence the rates at which police shoot human beings. Do you think the police in Alaska are more worried about pit bulls than they are bears? Do police in Nebraska get more dog bites or more complaints about wolves and other animals that are a nuisance to farmers? Even if police in Alaska and Nebraska do shoot lots of dogs, the odds are very long that they shoot as many dogs as police in California.

What little we do know about killings of dogs by police suggests a significant disparity in the rates at which individual police departments kill dogs. We also know that the rates at which dogs kill human beings vary greatly from state to state. According to 15 years of CDC data reviewed by the Puppycide Database Project, California is the #2 state in the nation for fatal dog attacks, led only by Texas. The majority of states in the union had no fatal dog attacks during the period we reviewed. California is a major outlier in canine violence, and we have no reason to believe that California would not also be an outlier in terms of police violence.

[Chart: Canine attack fatalities by state, 1999-2001]

Given these concerns, we should be highly skeptical of the "half of all shootings" statistic provided by The Problem of Dog-Related Incidents and Encounters, even if all of the data used in its examples were completely true. And there are serious problems with the data from both of the sources Problem cites.

We will conclude our series with Part Four of "Statistics are Misleading 100% of the Time", in which we look at the sources that "The Problem of Dog-Related Incidents and Encounters" uses as references for its claim that "Half of all intentional police shootings involve dogs". We will also find out what a secondary source is - and we will learn how trouble results when a secondary source cites a secondary source that cites a secondary source. Finally, we will examine how this small mistake became part of the policy of one of the largest animal welfare organizations in the country and has been repeated by dozens of trusted news sources for years. Be sure to check back here tomorrow!