Module 6: Project Proposal

For my final project, I’m considering two different datasets:

  1. Tim Renner, “Haunted Places in America.”
    *Probably restricting it to New York State/Northeast due to the amount of data
  2. Executions in New York State, extracted from

The hauntings dataset is very fun; everyone loves a good spooky experience. While looking through it, I was struck by several questions:

  1. Are there particular populated areas of the state that have higher concentrations of reported hauntings? (And does it match the population numbers, i.e., do more people mean more reports?)
  2. Are there differences between the types of hauntings reported at elementary schools, middle schools, high schools, and universities? (Are stories from elementary schools less violent, for example, than ones from universities?)
  3. How many of these hauntings fit the characteristics of “classic hauntings,” where (usually violent) deaths cause a spirit to haunt the premises (see: Pliny the Younger and his account of a literal chain-rattling ghost pointing out its own skeleton under the floor), and can the outliers expand our notion of what is worth haunting?

(Also: Is it possible to build a better ghost hunt by combining this data with the haunted tours and trails on the Haunted History Trail of New York State?)

I almost asked whether I could combine this data with UFO sightings in New York from the National UFO Reporting Center, but I couldn’t really think of a research question other than, “here’s where you don’t want to live in New York if you’re afraid of the paranormal and/or extraterrestrial.” There was a bit of a spike in 2019 because the U.S. government had their whole UFO debacle, but there was nothing comparable to hauntings.

For the Executions dataset, I think it would be interesting to do a deeper dive into the court cases that led to the end of the death penalty in New York. I’m wondering if the majority-minority opinions of the judges involved in these cases (and maybe in the Supreme Court cases that decided the constitutionality of the death penalty) can be attached to individual data points, and if that would provide an interesting way to visualize general court opinion over time. The dataset can be sorted many ways, so maybe even the severity of the crime combined with execution method and race could reveal New York State’s past tendencies towards executions. These questions aren’t fully developed yet (can you tell I was focusing too hard on ghosts?) but if Hauntings isn’t viable, this provides, you know, something real to work with.

The Death Penalty Information Center would be a source of federal executions from 1923-present, if I were to compare and contrast between the federal level and the state level. Creating a .csv from their charts would be easy, and since they seem to provide more information than the .csv on NY Executions, it could be pared down (or the data on New York could be expanded to include those other categories).

One reply on “Module 6: Project Proposal”

If you go with executions, you can get the data for the whole US here; I just put up the NY data for the data critique so it would be easier to work with. You don’t need my approval if you want to do the whole US data, but you may need to make a free account with the ICPSR site to download it (and lmk if you run into problems with that).

For the hauntings, you’ll probably need to add a column and categorize things yourself; i.e., violence_level high/med/low, and go through to categorize by hand using your own judgement. There’s also a Bigfoot sightings dataset by the same guy that did the hauntings data, and you could combine some/all of them. For the Bigfoot/UFO/Hauntings data, it should be pretty easy to combine those–put them all in one big spreadsheet and add an extra column for type (e.g., bigfoot, ufo, etc.) and then you can compare the distribution of each type against the others. If you decide to combine them, you’ll need to make sure that any columns they don’t share are accounted for (i.e., if there’s no moon phase column in the hauntings dataset, you’ll need to move the columns around when you paste them together).
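If you would rather script the combining step than paste in a spreadsheet, here is a rough Python sketch of the idea. It is only an illustration: the function name `combine_sightings`, the column names (`city`, `state`, `moon_phase`), and the sample rows are all made up, not taken from the actual datasets. It stacks the rows, adds the `type` column, and leaves a blank wherever a dataset lacks a column the others have.

```python
def combine_sightings(datasets):
    """Stack several datasets into one table with an added 'type' column.

    datasets: dict mapping a type label (e.g. "haunting") to a list of
    row dicts. Returns (columns, rows); columns missing from a dataset
    are filled with an empty string.
    """
    # Build the union of all column names, keeping first-seen order.
    columns = ["type"]
    for rows in datasets.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)

    combined = []
    for label, rows in datasets.items():
        for row in rows:
            merged = {col: "" for col in columns}  # blanks for unshared columns
            merged.update(row)
            merged["type"] = label  # the extra column used for comparison
            combined.append(merged)
    return columns, combined


# Hypothetical sample rows, only to show the shape of the input.
sample = {
    "haunting": [{"city": "Albany", "state": "NY"}],
    "ufo": [{"city": "Buffalo", "state": "NY", "moon_phase": "full"}],
}
columns, rows = combine_sightings(sample)
```

The resulting rows can be written out with `csv.DictWriter(f, fieldnames=columns)` and opened in Tableau or a spreadsheet as usual.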

For both datasets you will need to compare against population by getting the county or state level population and comparing the number of executions or sightings per 100,000 population. (This can be easily done either in Tableau or with a spreadsheet formula.) You can get county-level population data for the whole US here: Unless the years column has a big date spread, I’d just use the most recent population numbers.
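The per-100,000 calculation is just the count divided by the population, multiplied by 100,000. A minimal Python sketch, with hypothetical function names and invented county figures (not real counts or census numbers), might look like this:

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so whole-number inputs stay exact as long as possible.
    return count * 100_000 / population


def rates_by_county(counts, populations):
    """Compute per-100k rates for every county that has a population figure."""
    return {
        county: rate_per_100k(n, populations[county])
        for county, n in counts.items()
        if county in populations  # skip counties with no population data
    }


# Invented numbers, purely for illustration.
example_counts = {"County A": 5, "County B": 12}
example_populations = {"County A": 250_000, "County B": 1_200_000}
example_rates = rates_by_county(example_counts, example_populations)
```

Comparing rates rather than raw counts is what keeps a heavily populated county from looking more haunted just because more people live there to file reports.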

Do keep in mind that one of the two requirements for the final project is that there needs to be an argument–so something like a browsable hauntings trail would not necessarily have an argument, but something about the correlation between kind of haunted place and level of violence would have an argument.

Thomas is also looking at the paranormal data, if you’re interested in a group project.

