For this week, the concept I struggled with the most was the language itself. I frequently got stuck because of so many stupid, easy typos. I wrote ('items') rather than ['items'] in the API Requests assignment, I tried to run a function with text.strip rather than text.strip() in the Webscraping assignment, and I wrote placelist.txt instead of 'placelist.txt' in the Geocoding assignment. That last one took me the longest because I had no clue why Colab was telling me placelist.txt was not defined when it obviously was! It took me an embarrassing amount of time before I finally copied Prof. Kane’s code at the bottom and figured it out. I liked concluding with the Gender Inference assignment. The way it incorporated all the functions of the previous assignments was helpful to me.
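To keep the three typos straight for myself, here is a small sketch of each one next to its fix. The `response` dictionary and its contents are made up for illustration and are not the data from the actual assignments.

```python
# Hypothetical stand-in for a parsed API response (not the assignment's real data).
response = {"items": [{"title": "Example record"}]}

# 1. Square brackets look up a key; parentheses try to *call* the dictionary.
items = response["items"]
try:
    response("items")
except TypeError as err:
    paren_error = str(err)      # the dict is not a function, so this fails

# 2. Without parentheses you get the method itself; with them you run it.
text = "  hello  "
not_called = text.strip         # a reference to the method, nothing is stripped
cleaned = text.strip()          # actually removes the surrounding whitespace

# 3. Without quotes, Python reads placelist as a variable name, not a file name.
try:
    placelist.txt
except NameError as err:
    name_error = str(err)       # Python looks for a variable called placelist
filename = "placelist.txt"      # with quotes it is just a string

print(paren_error)
print(callable(not_called), repr(cleaned))
print(name_error)
```

In the third case the error is really about the name before the dot: Python is looking for a variable called placelist, which is why the message seems so confusing when the file itself exists.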
A concept that will help me in my own research is identifying the range of pages to loop through. At first it took me some time to figure out why Prof. Kane used 31 pages as her range for a list of 610 results, until I saw that there were 20 results per page: 610 divided by 20 is 30.5, so the last 10 results spill onto a 31st page. I’ll be using lots of site files and catalogue records to research material collections of trade assemblages, an endeavor which will require lots of pages from many databases (“data dumps,” as one of my anthro professors called them).
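The page count can also be worked out in code instead of by eye. This is only a sketch: the variable names are my own, and I am assuming the site numbers its pages starting from 1, which may not match every database.

```python
import math

total_results = 610      # number of results reported for the search
results_per_page = 20    # results shown on each page

# Round up, because a partly filled last page still has to be fetched.
pages = math.ceil(total_results / results_per_page)

# Assumption: pages are numbered 1, 2, ..., pages (some sites start at 0).
page_numbers = list(range(1, pages + 1))

print(pages)
print(page_numbers[0], page_numbers[-1])
```

Rounding down would silently drop the 10 results on the final page, which is the kind of gap that would be easy to miss in a large data dump.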
The US National Archives has a Flickr API that makes more than 16,000 historical photographs, maps, newspaper pages, and other documents publicly available. Each Flickr post contains an image and other data variables such as Production Date, Series, Creator, and an Identification Number. I found the link to the Flickr account on this National Archives webpage, but I’m having trouble locating the link to the actual API data. It looks like whoever runs this Flickr account pulls individual image records from the API and uploads them here with all the data each record contains. To access the API, you’d probably have to do some digging to find the contact person who manages the Flickr account and would know how to grant permission.
I think it would be cool to fetch the Production Date data and compare the time periods of the images to see which years are better represented than others. According to the description, the National Archives date to 1775, but I’m wondering how many images they actually uploaded from the eighteenth century. To use an archaeological term, this could be similar to taphonomic bias: since younger materials preserve better, there’s more imagery to work with from later time periods. This can lead researchers to focus more on recent records than on earlier ones.
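Here is a rough sketch of how that comparison could work once the records are in hand. Everything in it is an assumption on my part: the `production_date` field name, the date formats, and the sample records are invented for illustration, since I have not yet seen what the real data looks like.

```python
import re
from collections import Counter

# Made-up sample records; the real field name and date formats are unknown to me.
records = [
    {"production_date": "1776"},
    {"production_date": "ca. 1863"},
    {"production_date": "1942-06-04"},
    {"production_date": "1944"},
    {"production_date": ""},          # a record with no date recorded
]

def extract_year(date_text):
    """Return the first four-digit year between 1600 and 2099, or None."""
    match = re.search(r"\b(1[6-9]\d{2}|20\d{2})\b", date_text)
    return int(match.group(1)) if match else None

def century_of(year):
    """Map a year to its century, e.g. 1776 -> 18 and 1942 -> 20."""
    return (year - 1) // 100 + 1

years = [extract_year(r["production_date"]) for r in records]
century_counts = Counter(century_of(y) for y in years if y is not None)
undated = sum(1 for y in years if y is None)

for century in sorted(century_counts):
    print(century, century_counts[century])
print("undated:", undated)
```

Counting the undated records separately matters here, because records with missing dates could themselves be skewed toward older material.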