Module 3 email

This week we have a discussion starter on the data and its representations pieces up on Blackboard.  For the reading this week, keep in mind that the Quantification chapter of The Curious Journalist’s Guide to Data has multiple headings.  I want you to read everything except “Sampling and Quantified Error” and “The Problem of Measurement Error,” but you will have to click through to the other sections manually after reading the chapter’s intro section.

In your data critique assignment, you walked through the first steps of doing a data-driven project.  To understand what kinds of questions your data can even answer, you need to be able to read and assess it before you start manipulating it, as we did in the data cleaning assignment.  You should always, always do a data critique before starting a major project, even with data you yourself have created, because it can help you map out the rest of your project:

  • Are there any unexpected values?
  • Is it formatted well for the kind of analysis you want to do?  (This will be answerable after we’ve walked through some different modes of analysis in later modules.)
  • How much and what kind of cleaning do you need to do?
  • Do you have a kumquat problem, i.e., do you need to research what some of the terms in your data even mean before you can ask questions of it?
  • Can you turn unstructured text into categorical data either by clustering or by pulling out specific things?
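Once we've started on Python, a first pass at a few of these questions might look something like the sketch below.  Treat it as one possible approach and not the required one: the sample data, the column names, and the critique function are all made up for illustration, and it only covers missing values, non-numeric values in a column that should be numeric, and a tally of what appears in each column (which is where inconsistencies like "fruit" vs. "Fruit" show up).

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a real spreadsheet export.
SAMPLE_CSV = """item,category,price
apple,fruit,0.50
kumquat,fruit,1.25
carrot,vegetable,
apple,Fruit,0.50
widget,,abc
"""


def critique(csv_text, numeric_columns=("price",)):
    """Return a small report on missing and unexpected values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {
        "row_count": len(rows),
        "missing": Counter(),   # column name -> number of blank cells
        "non_numeric": [],      # (line number, column, value) tuples
        "values": {},           # column name -> Counter of its values
    }
    # The header is line 1 of the file, so the data starts on line 2.
    for line_number, row in enumerate(rows, start=2):
        for column, value in row.items():
            value = (value or "").strip()
            if value == "":
                report["missing"][column] += 1
            elif column in numeric_columns:
                try:
                    float(value)
                except ValueError:
                    report["non_numeric"].append((line_number, column, value))
    # Tally every value per column so inconsistent spellings stand out.
    if rows:
        for column in rows[0]:
            report["values"][column] = Counter(
                (row[column] or "").strip() for row in rows
            )
    return report


if __name__ == "__main__":
    report = critique(SAMPLE_CSV)
    print("Rows:", report["row_count"])
    print("Blank cells per column:", dict(report["missing"]))
    print("Non-numeric prices:", report["non_numeric"])
    print("Category values:", dict(report["values"]["category"]))
```

The report won't tell you what to do about any of this.  It just gives you a list of things to look into before you plan your cleaning.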

The Module 3 page will start building some skills to do different modes of analysis.  The reading is a bit lighter this week because we’re going to start learning a bit of JavaScript and a bit of Python.  JavaScript is used in DH primarily for display, while Python is used primarily for analysis (it can be used for display, but I personally find it a bit of a pain in the ass).  This week we’re mostly going to be learning basic principles, and if you’ve worked in any other programming languages, the principles will likely be familiar.  If you’ve never done any programming before, we’re going to learn specifics of some languages that are commonly used in DH, but the principles are very transferable to other languages once you understand the logic.  Most of all, as we go into the next few weeks with programming tasks, remember what we learned with the formulas in the spreadsheet assignment for Module 1: there is more than one way to solve most problems.  As long as you understand the logic and can articulate in plain English what you’re trying to do, you’re on the right track; I’m more interested in seeing your process and your thinking than in any one particular solution.
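To make the "more than one way" point concrete, here is a small Python sketch that counts how often each word appears in a list, written two different ways.  The word list and the function names are invented for this example; either version would be a fine answer as long as you can explain what it's doing.

```python
from collections import Counter

# A made-up list of words to count.
words = ["data", "text", "data", "map", "text", "data"]


def count_with_loop(items):
    """Spell out the logic step by step with a plain dictionary."""
    counts = {}
    for item in items:
        # Start from 0 the first time we see an item, then add 1.
        counts[item] = counts.get(item, 0) + 1
    return counts


def count_with_counter(items):
    """Let Python's built-in Counter do the same bookkeeping."""
    return dict(Counter(items))


if __name__ == "__main__":
    print(count_with_loop(words))
    print(count_with_counter(words))
```

In plain English, both versions say "go through the list and keep a running tally for each word."  The first one shows every step, and the second one leans on a tool that already knows how to tally.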

I will be available on Zoom during Monday office hours (2-3 PM), our Wednesday afternoon scheduled meeting time, and throughout the week on Slack and email to help troubleshoot the assignments.  For our assignments this week, once you’ve forked the assignment (you’ll learn what this means!), remember that you can link directly to your assignment.  Use that link if you need help with something!
