Portfolio: Katie Tote

Module 11 Update

Roadblocks & Difficulties

So, after spending an obscenely long time trying to clean restaurant menu item data and at one point making it through one single decade of editing in an eight-hour work day… I decided I needed to revise my project idea. Part of it was that my quantitative-data-oriented brain wanted to make everything perfect, but I also think the realities of the data and the timeframe/scope of this class limited the feasibility of my original plan.

I ran into issues with OpenRefine and Excel crashing because the dataset was too big, leading to frustration and more time lost as I needed to redo things / fight my computer. One upside of the many, many hours I spent trying to clean the data is that I now know way more French words than ever before, and I’m pretty confident that I could order something off a menu if I happened to be in a French restaurant that served pigeon and calf brains and various other horrifying nineteenth-century delicacies.

But anyway! The new plan! I’m now using the same dataset (so not all is lost!) to look at food fads, with the arguments that:

  • 1) The fleeting food trend is not a new phenomenon, and restaurant dishes throughout history have seen patterns of increasing popularity, ubiquity, and sharp falls from grace, just as we’ve more recently seen happen with the cronut, Sriracha-everything, hard seltzers, and so on.
  • 2) Perhaps more interestingly, I’m going to look at food items that have become ~trendy~ in recent years and explore whether or not they made an appearance on restaurant menus from 1850-1950. Were we eating kale salad in the Depression? Did Victorian-era restaurants serve quinoa? Let’s find out!
  • 3) And, to touch vaguely on my original research plan, I’ll use the same methods used to investigate trendy food items to also investigate when various ‘ethnic’ cuisines emerged on the NYC dining scene. This is interesting in theory, but, spoiler alert: mostly they… don’t, at least not within the time period and the menu selection available in the dataset.

Visualization Choices

I’m building a website using Bootstrap or maybe WordPress with blocks (or pages? I’m not sure what the final layout will end up being) organized by decade, starting with 2020 and going backwards to 1950. Each block will focus on a few fad foods that were big in each decade, with historical background on why they gained popularity and then some charts/graphs/timelines showing counts and trends and such. Different decades will have different visualizations, but so far I have some line graphs and fancy bar charts made, and have plans for some word clouds and a little bit of mapping. I’m using R and Excel for chart making. I also might use this cool timeline tool.

There will also be sections focused on ‘ethnic’ food, a section on beer, and a final section focusing on bizarre food trends of the late nineteenth and early twentieth centuries that will hopefully never see a resurgence.

My color scheme will vary on each page as the reader scrolls through, depending on the decade in question. Tacky? Maybe.

Remaining To-Dos

After having a bit of a meltdown about how little progress I’d made despite so much time spent, I think I’m finally on track and feeling good about my revised direction! I’m currently putting finishing touches on my presentation for next week, which is a small-but-interesting piece of my second argument listed above. I have all of the analysis and writing done for the final website project, so I’ll spend my remaining time piecing together the visualizations and design elements.

I’m also doing my posts out of order here, so I still have “Week 10 update post” on my to-do list. I wanted to make a few tweaks to my layout plan and draft visualizations before posting, so I figured I’d post this update first since I’m a bit behind…

2 replies on “Module 11 Update”

Remember that process is more important than product for this class. A lot of people hit the meltdown place around this point in the semester, but from my perspective, if all you get out of this class is an idea of the limitations of your tools and a sense of what you personally are or are not willing to sink time into in the future, those are both better to find out in a class than when you’re knee-deep in a project for your job.

I’m not surprised about Excel crashing, but I am a bit surprised about OpenRefine crashing for you. How many rows did your sheet end up having? I ask so that I can steer people away from things for future classes.

This may be something you’ll talk about in your module 10 post, but how are you planning to approach #2, i.e., charting the appearance of kale on menus over time?

Re: the color palettes, I don’t work in R enough to recall how much control you have over your colors, but even if you do change color palette from decade to decade, you may want to have some kind of connecting feature between each decade. I.e., start with something like the default here https://projects.susielu.com/viz-palette, and make your decades get lighter/darker or less saturated; or progress with one major rainbow color for each decade (red for 1890, orange for 1900, yellow for 1910, etc.); or have common background elements to tie everything together. What you’re planning is going to be A Lot, and just from sheer numbers most people’s brains have trouble processing a lot of visual change because it registers as a completely new project if every visual element changes (and the cognitive load of understanding the new color palette makes it more difficult to follow the actual argument). You can have variety, but just think about how you’re using the variety and change to support rather than work against your own argument.

Remember also that you can always scale back plans if you’re pushing up against the deadline. I’d rather see a couple of things done well than a bunch of things you don’t feel great about.

So, the final CSV with one row for each menu item in NYS from 1850-1959 ended up being 3,484,195 rows — I’m editing this because that number was definitely wrong, but now I can’t find where the file went for an actual row count. But it was a lot! I guess it makes sense that Excel and OpenRefine were crashing. I’ve decided that I’m going to use the full dataset (not limited by time OR location), but I’m just going to be using R to manipulate it, so I should be good re: my computer freaking out.

re: how to chart menu items over time — I’m doing a text search in the dish name column in R, searching for various food items (& common misspellings / French and German translations) and then summing up how many times they’re mentioned per year. So, for example:
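library(dplyr); library(stringr)
# keep rows whose uppercased dish name matches kale or its French/German spellings
kale_data <- full_data %>% filter(str_detect(name_up, "KALE|CHOU FRIS|NKHOL"))
kale_summary <- kale_data %>% count(year)  # number of matching menu items per year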
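
For anyone who wants to try the search-and-count approach on more than one food at a time, here is a minimal sketch. It assumes the same `name_up` and `year` columns as above; the toy rows, the `food_patterns` lookup, and the `count_mentions` helper are all made up for illustration and are not from the real dataset.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the menu data. The column names `name_up` and `year`
# follow the snippet above; every row here is invented for illustration.
full_data <- data.frame(
  name_up = c("KALE SALAD", "CHOU FRISE AU BEURRE", "ROAST PIGEON",
              "BOILED KALE", "CALF BRAINS", "KALE AND BACON"),
  year    = c(1900, 1900, 1900, 1901, 1901, 1910)
)

# Hypothetical lookup: one regular expression per food item.
food_patterns <- c(
  kale   = "KALE|CHOU FRIS",
  pigeon = "PIGEON|SQUAB"
)

# Hypothetical helper: count dish names matching a pattern, per year.
count_mentions <- function(data, pattern) {
  data %>%
    filter(str_detect(name_up, pattern)) %>%
    count(year, name = "mentions")
}

# Run the helper for every pattern and stack the results,
# labelling each block of rows with the food it belongs to.
food_summary <- bind_rows(
  lapply(food_patterns, count_mentions, data = full_data),
  .id = "food"
)

print(food_summary)
```

Two things to keep in mind with this kind of search: a plain substring pattern like "KALE" will also match inside longer words, so wrapping it in word boundaries ("\\bKALE\\b") may be safer, and `count()` only returns years with at least one match, so years with zero mentions are simply missing from the summary (something like `tidyr::complete()` could fill them in before charting).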

Comments are closed.