Portfolio: Swezey Portfolios

Module 9: Project Update

For my final project, I have decided to focus on the New York State executions dataset. The spreadsheet consists of 1,130 rows, each describing an individual who was executed within the New York State judicial system between 1646 and 1963. The details include the name of the individual, their race and gender, their occupation, their crime, the method of execution, the date they were executed, and the county in which they were convicted (disregarding appeals).

Kaycie Haller originally provided the data critique below! I believe that the “mtpl” column refers to same-day executions: a ‘5’ attached to an individual would indicate that they were the fifth person executed that day.
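One rough way to test this reading of “mtpl” is to group the rows by execution date and compare the largest “mtpl” value on each date with the number of rows for that date. The sketch below assumes each row has been reduced to a (date, mtpl) pair and that single executions carry a 1 (both assumptions about the spreadsheet, not confirmed); the sample rows are made up for illustration.

```python
from collections import defaultdict

def mtpl_mismatches(rows):
    """Return dates where the largest mtpl differs from the row count.

    rows: iterable of (date, mtpl) pairs. If mtpl numbers the people
    executed on the same day, the largest value on a date should equal
    the number of rows sharing that date.
    """
    by_date = defaultdict(list)
    for date, mtpl in rows:
        by_date[date].append(mtpl)
    return {
        date: values
        for date, values in by_date.items()
        if max(values) != len(values)
    }

# Made-up sample rows, not taken from the dataset.
sample_rows = [
    ("1930-01-09", 1),
    ("1930-01-09", 2),
    ("1931-05-01", 3),  # only one row on this date, so it gets flagged
]
mismatches = mtpl_mismatches(sample_rows)
```

Any dates that come back flagged would be worth checking by hand before relying on the column.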

The data itself needs minimal cleaning. The majority of changes will be applied to the spreadsheet itself: capitalizing names, splitting the names of the enslaved persons from their masters’ names (those in parentheses), removing commas from names originally listed as “Van, Patten, John,” “Van, Reed, Harry,” etc., and filling in unknown pieces of information if they are revealed through other research. For example, Barbara Stillwell’s race was originally listed as “?” but is noted as white in Daniel Hearn’s Legal Executions in New York State: A Comprehensive Reference, 1639-1963 (1997). For the sake of this project, I would like to include these small changes while noting their potential inaccuracies in the final project.
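The name fixes described above could also be scripted. This is a minimal sketch, assuming the name cell holds the person's name followed by an optional parenthetical with the master's name; the function name `clean_name` and the sample value with parentheses are hypothetical. It removes every comma, as described, so it does not reorder surname and given name.

```python
import re

def clean_name(raw):
    """Return (person, master) from a raw name cell.

    master is None when the cell has no trailing parenthetical.
    """
    # A trailing "(...)" is assumed to hold the master's name.
    match = re.match(r"^(.*?)\s*\((.*?)\)\s*$", raw)
    if match:
        person, master = match.group(1), match.group(2)
    else:
        person, master = raw, None

    def tidy(text):
        # Replace commas with spaces, collapse whitespace, then title-case.
        return " ".join(text.replace(",", " ").split()).title()

    return tidy(person), tidy(master) if master is not None else None

print(clean_name("Van, Patten, John"))
print(clean_name("sam (john doe)"))  # made-up value for illustration
```

Note that `str.title()` is a blunt tool: it would turn “McDonald” into “Mcdonald”, so names like that would still need a manual pass.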

I will be expanding the “Counties” section of the data first to examine the changes in their populations over time. Using this PDF compiled by the New York State Department of Economic Development (which includes its sources) and pulling the tables using Tabula, I will be able to see the number of executions per county per decade in proportion to the population over time. This will, at the very least, allow me to see whether certain areas of the state had an unusually high number of convictions leading to execution in proportion to their population. I will also be noting the landmark legislation that changed how New York State implemented capital punishment and looking for any recognizable trends in the immediate aftermath of that legislation, especially around 1937, when New York made the death sentence mandatory for first-degree murder cases unless the jury recommended life in prison.
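Counting executions per county per decade comes down to rounding each year down to its decade and tallying. A minimal sketch, assuming each record has been reduced to a (county, year) pair; the sample records are made up and do not come from the spreadsheet.

```python
from collections import Counter

def decade_of(year):
    """Round a year down to the start of its decade, e.g. 1937 -> 1930."""
    return (year // 10) * 10

# Made-up (county, year) pairs for illustration only.
records = [
    ("New York", 1931),
    ("New York", 1937),
    ("Kings", 1938),
    ("Kings", 1941),
]

# Tally executions by (county, decade).
executions_by_county_decade = Counter(
    (county, decade_of(year)) for county, year in records
)

for (county, decade), count in sorted(executions_by_county_decade.items()):
    print(county, decade, count)
```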

(Also, I planned on looking through the country-wide executions data soon. I downloaded the delimited version rather than the ASCII version, which I don’t think is a problem, but it still needs translating and I haven’t done that yet. This weekend I’m either doing that or getting the program to translate the ASCII data. I’ve been really focused on the New York data so far, which is why I haven’t expanded outwards to different states.)

While thinking about potential visualizations, I created a choropleth map with a year slider to show the number of execution verdicts per county per year. I created several other graphs as well while getting a feel for the data, such as stacked bar graphs of the number of executions performed through the 1930s and 1950s by race, and a series of visualizations that considered the crimes committed by housewives and whether their crimes involved killing their husbands. I’m embedding the choropleth map and a graph showing the number of convictions in New York County in comparison to the population over time. (There is a wonky data point joining towards the end, which I’m trying to fix. The graph itself might also be confusing with the extremely different y-axes, so I’m considering other ways to do this as well.)

One reply on “Module 9: Project Update”

Re: the county populations, that might be kind of a headache to get out of that pdf; this dataset is for all counties in the US, but you can filter it down to just NY. It would be worth checking what the county data actually records--whether it's the county the person lived in/committed the crime in, or if it's the county where they were executed. I'd guess it's county of residence rather than execution, since county of execution would cluster around prisons, but just double check to be sure. I think your population line in that second graph is wonky because your x/bottom axis is trying to be both year and decade. You're on the right track with your calculated field grouping your years by decade (and good job figuring out how to do that on your own, I had to look that up!)--you just want to use that field to make your graph as well and that should fix the wonky population line doubling back on itself. I did a quick video here so I can show the rest of the class, and I made a copy of your workbook with the new sheet I made here if you want to see how I set that up.

I think your double axis is fine, but in the long run once you get the individual county level population data included, it might make more sense to do executions per 100k people per county per decade, either as small multiples of many line graphs on a dashboard, or as the top (5, 10, whatever) counties all together in one chart.
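The per-100k figure is just executions divided by population, scaled by 100,000. A minimal sketch, assuming the population used for each decade is a single figure per county (for example the census count at the start of the decade, which is a choice, not something settled above); the numbers are made up.

```python
def rate_per_100k(executions, population):
    """Executions per 100,000 residents for one county in one decade."""
    return executions * 100_000 / population

# Made-up inputs keyed by (county, decade); not real counts or populations.
executions = {("County A", 1930): 5, ("County B", 1930): 3}
population = {("County A", 1930): 250_000, ("County B", 1930): 60_000}

rates = {
    key: rate_per_100k(count, population[key])
    for key, count in executions.items()
    if key in population  # skip county-decades with no population figure
}

# Highest rates first, e.g. to pick the top counties for one chart.
top = sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

In this made-up case the county with fewer executions ends up with the higher rate, which is exactly the kind of pattern raw counts would hide.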

Using your decade calculated field will also probably help your choropleth make more sense, so that you can see trends by decade rather than paging through each individual year where there’s one execution.

If you’re going to translate your whole-US delimited file in OpenRefine, I made a quick translator spreadsheet here: Basically, take the column name, numeric value, and final text value from your code book and paste them into the corresponding blue columns. Drag the white and green columns down so you have as many rows as changes you want to make, then copy the green column out to put into OpenRefine’s apply operations pane under undo/redo. (If you’re comfortable doing the ASCII translation in another program already, that’s fine, but this is the quick, dirty, and minimal-new-programs way.)
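For anyone who would rather do the translation in a script, the same idea (column name, numeric value, final text value) can be sketched as a nested lookup table. This is not the OpenRefine route described above, just an alternative under assumptions: the column name and labels below are placeholders, not entries from the actual code book.

```python
# Placeholder code book: {column name: {numeric code: text label}}.
# Real entries would be copied from the dataset's code book.
codebook = {
    "example_column": {"1": "first label", "2": "second label"},
}

def translate_row(row, codebook):
    """Return a copy of row with coded values replaced by their labels.

    Columns missing from the code book, and codes with no label,
    are left unchanged so nothing is silently lost.
    """
    translated = {}
    for column, value in row.items():
        labels = codebook.get(column, {})
        translated[column] = labels.get(value, value)
    return translated

# Made-up row, as it might come out of csv.DictReader.
row = {"example_column": "2", "name": "x"}
print(translate_row(row, codebook))
```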
