Introduction & Critique
The Police Data Initiative website contains data supplied by 135 participating agencies as of this writing. There is likely an agency near you among them.
In late 2015, the Tucson Police Department’s Lieutenant Myron “Ron” Holubiak brought the Public Safety Open Data Portal to my attention. The White House’s Police Data Initiative had begun in May 2015 and Tucson PD had announced its participation at the end of October.
When I originally considered this topic, I hadn’t planned to include a critique. Compared with other data portals, however, this one has some flaws. Below are my impressions. I have not contacted the people managing the data portal for comment.
- It’s not curated. That means that the portal relies on each agency to control each dataset’s quality. It doesn’t look as if anyone at the website reviews the data and provides feedback to agencies regarding their datasets’ accuracy or completeness.
- It’s not (frequently) updated. You may see an agency providing data on the portal, but providing more recent information on a city website.
- It may not be housed locally. Generally, I would expect either a direct link to a file within the data portal or a description of data stored elsewhere with a direct link to those data files. In multiple instances, the Police Data Initiative simply links to a city’s data site, and an interested researcher must hunt for the data there.
Here are some examples:
- Boston Police Department is listed as a participating agency, but shows no data on the site. Instead, there is a link to the Legacy Boston Open Data Portal, although you would be better served by the new site, Analyze Boston, where you can download crime incident reports from 2015 to present. These are incidents from the department’s records management system (RMS) rather than calls from the department’s computer-aided dispatch (CAD) system. The data set includes the time that an incident occurred but nothing about responding units or response times. On a positive note, incident locations contain latitudes and longitudes that appear quite accurate.
- Atlanta Police Department is also a participating agency. At the moment, the link is broken, but you can download incident information here. The same issue about “incidents versus calls” applies to Atlanta. In fact, the call times appear intentionally truncated to the minute, and over 40% of records show only the hour.
- Within New Jersey, Newark Police Department references a link to a city website that contains no police data, while the Camden County Police Department, which was an early participant in the initiative, has no working link at all.
Atlanta
Below we take a look at incidents for Atlanta PD. First we load the information and limit it to the first 4 months of 2017. Next, we examine the recorded times for these 7,668 incidents, which works out to an average of about 64 incidents per day over that roughly 120-day period. Focusing on the minutes shows that officers entering the information were likely to round to the nearest hour, 30 minutes, or other increments, rather than entering an accurately recorded time.
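The steps described above can be sketched roughly as follows, in Python with only the standard library. This is an illustration under stated assumptions, not the code behind the figures quoted here: the timestamps are a tiny made-up sample, the timestamp format is assumed, and the names `in_window`, `incidents_per_day`, and `minute_profile` are hypothetical. A real run would parse the times from the downloaded Atlanta file using its actual column names.

```python
from collections import Counter
from datetime import datetime

# Made-up sample of recorded incident times (NOT real Atlanta data).
SAMPLE_TIMES = [
    "2017-01-03 14:00",
    "2017-01-03 14:30",
    "2017-02-11 09:00",
    "2017-03-20 23:45",
    "2017-04-30 08:17",
    "2017-05-01 12:00",  # after the window, dropped
    "2016-12-31 22:00",  # before the window, dropped
]

START = datetime(2017, 1, 1)
END = datetime(2017, 5, 1)  # exclusive upper bound: January through April


def in_window(times, start=START, end=END):
    """Parse timestamps (assumed format) and keep those in [start, end)."""
    parsed = (datetime.strptime(t, "%Y-%m-%d %H:%M") for t in times)
    return [t for t in parsed if start <= t < end]


def incidents_per_day(times, start=START, end=END):
    """Average incidents per calendar day in the window."""
    return len(times) / (end - start).days


def minute_profile(times):
    """Share of incidents recorded at each minute-of-the-hour value."""
    counts = Counter(t.minute for t in times)
    total = len(times)
    return {minute: n / total for minute, n in sorted(counts.items())}


window = in_window(SAMPLE_TIMES)
profile = minute_profile(window)
print(len(window), "incidents in window")
print(f"{incidents_per_day(window):.3f} incidents per day")
for minute, share in profile.items():
    print(f":{minute:02d} -> {share:.0%} of incidents")
```

If times were recorded accurately, each minute value would hold roughly 1/60 (about 1.7%) of incidents. Large spikes at :00, :30, or other round values in the profile are what suggest rounding at data entry.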
Boston
For simplicity, we apply the same limits to Boston PD’s incidents that were applied to Atlanta PD’s incidents. For Boston, we examine the recorded offense descriptions, focusing on UCR Part One crimes. There were a total of 31,280 incidents, or an average of 258.5 incidents per day, of which 5,381 were Part One crimes. Below is a quick count of crimes by group. Note that the Boston PD appears to have removed all rape records from its public data set.
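A count by group like the one described above might be sketched as follows, again in standard-library Python. The records, the group labels, and the "Part One" flag values are all made up for illustration, and `part_one_counts` is a hypothetical helper; the actual Boston export uses its own field names and labels. The one deliberate design choice is to report expected Part One categories even when they have zero rows, so that a missing category shows up explicitly rather than disappearing from the table.

```python
from collections import Counter

# Made-up records as (offense group, UCR part); NOT real Boston data.
SAMPLE_INCIDENTS = [
    ("Larceny", "Part One"),
    ("Larceny", "Part One"),
    ("Robbery", "Part One"),
    ("Aggravated Assault", "Part One"),
    ("Vandalism", "Part Two"),
    ("Motor Vehicle Accident Response", "Part Three"),
]

# The eight UCR Part One offense categories, under assumed labels.
EXPECTED_PART_ONE = [
    "Homicide", "Rape", "Robbery", "Aggravated Assault",
    "Burglary", "Larceny", "Auto Theft", "Arson",
]


def part_one_counts(incidents, expected=EXPECTED_PART_ONE):
    """Count Part One incidents by group, including zero-count groups."""
    counts = Counter(group for group, part in incidents if part == "Part One")
    for group in expected:
        counts.setdefault(group, 0)  # make absent categories visible
    # Largest groups first, ties broken alphabetically.
    return dict(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))


counts = part_one_counts(SAMPLE_INCIDENTS)
for group, n in counts.items():
    print(f"{group:<20} {n}")
```

In this sample, a category such as "Rape" appears with a count of 0, which is the kind of gap that would prompt a closer look at what an agency has withheld from its public release.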