We thought Robert Picard’s analysis of Baltimore’s parking tickets was pretty fun, so we took another cut at sussing out whether Baltimore gives out more parking-related tickets as the end of the month nears (which could suggest ticketing quotas).
We designed the analysis to avoid some annoying complications in the data:
- Months ending on different days of the week: We cut each month into four weeks, counting back from the end of the month, such that the last week always ended on the last day of the month (e.g., Thursday Oct. 25th through Wednesday Oct. 31st).
- Holidays: We threw out November and December.
- Ticket Types: We dove more deeply into specific types of tickets.
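The week bucketing described above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet logic we actually used: the function name `week_from_month_end` is made up for the example, and dropping the few leftover days at the start of each month is an assumption, since four seven-day weeks never cover a whole month.

```python
import calendar
from datetime import date
from typing import Optional

EXCLUDED_MONTHS = {11, 12}  # November and December, thrown out because of holidays


def week_from_month_end(d: date) -> Optional[int]:
    """Return 1-4, where 4 is the seven days ending on the last day of the month.

    Returns None for excluded months and for the leftover days at the start
    of the month (assumption: those days are simply dropped).
    """
    if d.month in EXCLUDED_MONTHS:
        return None
    last_day = calendar.monthrange(d.year, d.month)[1]  # number of days in the month
    weeks_back = (last_day - d.day) // 7  # 0 for the final seven days
    if weeks_back >= 4:
        return None  # more than 28 days before month end
    return 4 - weeks_back
```

With this bucketing, Thursday Oct. 25th through Wednesday Oct. 31st of 2012 all land in week 4, regardless of which weekday the month happens to end on.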
Here’s how many tickets are in the city records for each year, as a percent of the dataset:
We sampled that down to 11k tickets per year for 2008 and later, then looked at how many tickets were recorded by week of the month (excluding November and December). If quotas were driving ticketing, we’d expect to see more tickets in the last week than in the other weeks.
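For anyone who wants to reproduce the sample-then-count step in code rather than in a spreadsheet, here is a rough sketch. The `(year, week)` tuple layout and the function names are assumptions made for the example; `week` is the week-of-month bucket (1 to 4, with 4 being the last week), or `None` for tickets that fall outside the four weeks or in an excluded month.

```python
import random
from collections import Counter, defaultdict


def sample_per_year(tickets, per_year=11_000, first_year=2008, seed=0):
    """Draw up to `per_year` tickets at random from each year >= first_year.

    `tickets` is an iterable of (year, week) tuples (a hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_year = defaultdict(list)
    for year, week in tickets:
        if year >= first_year:
            by_year[year].append((year, week))
    sample = []
    for year in sorted(by_year):
        rows = by_year[year]
        # Years with fewer rows than the target are kept whole.
        sample.extend(rng.sample(rows, min(per_year, len(rows))))
    return sample


def count_by_week(sample):
    """Count sampled tickets per week bucket, skipping unbucketed tickets."""
    return Counter(week for _, week in sample if week is not None)
```

Sampling the same number of tickets from each year keeps a high-volume year from swamping the others when the weeks are compared.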
Cue the sad trombone. Nothing much there. Maybe there’s something year by year?
The only cell that really stands out is the last week of 2011. And in fact, playing around a bit more, there’s a discontinuity where record-keeping ramped up (?) starting in the last week of September 2011:
Okay, still nothing. Maybe that’s just because we’re looking across too many ticket types (we haven’t yet excluded automatic tickets, like Robert did).
Here’s the variety of tickets tracked:
Looking across various ticket types, though, there are no notable patterns, except that expired-plate tickets tend to happen towards the beginning of the month (makes sense).
You could argue that the mobile speed traps tend to get more common as the month goes on (confidence intervals on those do show significant differences). But we’re starting to dredge the data a bit carelessly at this point, so the best we can do with that is to look for verification of that pattern over the next, say, 6 months. Assuming the mobile speed traps don’t get burned down.
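To make the confidence-interval comparison concrete, here is one common way to put an interval around a share, such as the fraction of a ticket type’s sampled tickets that fall in a given week. This is a sketch using the Wilson score interval; it is not necessarily the method Statwing applies, and the function name is ours.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width
```

If the intervals for two weeks don’t overlap, that is a rough sign the shares really differ. It does nothing to correct for having looked at many ticket types before finding one that stands out, which is exactly the dredging worry.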
The biggest problem with looking for evidence of quotas here is that we’re probably looking at the wrong data. Most of these offenses don’t appear to be ones that police have a whole lot of discretion around, as opposed to DUI pullovers or manually dispensed speeding tickets. But, of course, you go to war with the data you have.
The bigger lesson here, from our perspective, is that it’s often useful to sample data down to a more manageable chunk so you can play with it much more quickly. With millions of rows of data, you spend more time thinking about data management than data analysis. After you’re done exploring sampled data, you can of course go back to the unsampled data if you want to be really sure of a particular finding. Because of sampling, this entire analysis was conducted with Excel and Statwing.
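If the full file is too big to open comfortably, one way to pull a uniform random sample in a single pass is reservoir sampling. The sketch below is a generic version of that technique (often called Algorithm R), not what we ran; the function name and the fixed seed are choices made for the example.

```python
import random


def reservoir_sample(rows, k, seed=0):
    """Keep a uniform random sample of k items from an iterable of unknown length."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            reservoir.append(row)  # fill the reservoir with the first k rows
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row  # replace a kept row with probability k / (i + 1)
    return reservoir
```

Because it only ever holds `k` rows in memory, `rows` can be something like a `csv.reader` over the full export, and the result is small enough to paste into a spreadsheet.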
There’s some other fun stuff in the dataset itself (in Statwing) like day of the week vs. type of ticket.
Cheers to Robert on bringing this dataset to light (and for being appropriately unsure if there was anything really meaningful going on here), and cheers to Baltimore for publishing it in the first place.