Adding type columns

Adding a column that categorizes your data is one way to extend your dataset and analyze it further. You can do this by hand, but if your dataset is especially large, OpenRefine can make your life easier by finding common terms and letting you edit your data in bulk.

To do this, bring your data into OpenRefine and add a new column. The easiest way to create a new, blank column is to use Edit column > Add column based on this column and then delete the values in the new column.

If you’re creating your new categories based on text within your data, you can use Facet > Customized facets > Word facet to find commonly used terms within a column without actually splitting that column. For example, in the video below, I used the word facet on the location column of the Haunted Places dataset. If I had used a normal text facet, I would have facets like “Ada Cemetery” and “Evergreen Cemetery,” and those two items would have to be selected and edited separately. With a word facet, “Ada,” “Evergreen,” and “Cemetery” are all separate facets, so I can select every row that includes the word “Cemetery” in the location column, even if it also includes other words. By editing the new type column for the selected rows, I can classify many cemeteries with one edit. Similarly, groups of terms like hotel/inn, bar/restaurant, and school/university/college/academy can easily be combined and given one categorical identifier.
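If it helps to see the logic outside of OpenRefine, here is a minimal Python sketch of the same idea: count the individual words in a column, then fill in a type column for every row that contains a chosen word. This is not how OpenRefine itself is implemented. The sample rows, the function names `word_facet` and `categorize`, and the “lodging” label are all made up for illustration, and the case-insensitive matching is my own choice.

```python
from collections import Counter

# Hypothetical sample rows standing in for the location column;
# the "type" column starts out blank, like the new column in OpenRefine.
rows = [
    {"location": "Ada Cemetery", "type": ""},
    {"location": "Evergreen Cemetery", "type": ""},
    {"location": "Grand Hotel", "type": ""},
    {"location": "Riverside Inn", "type": ""},
]


def word_facet(rows, column):
    """Count how many rows contain each word in the given column."""
    counts = Counter()
    for row in rows:
        # set() so a word repeated inside one cell counts that row only once
        counts.update(set(row[column].split()))
    return counts


def categorize(rows, column, keywords, label, target="type"):
    """Write label into target for every row whose column contains a keyword.

    Matching is case-insensitive here (an assumption made for this sketch).
    Returns the number of rows that were changed.
    """
    wanted = {keyword.lower() for keyword in keywords}
    changed = 0
    for row in rows:
        words = {word.lower() for word in row[column].split()}
        if words & wanted:
            row[target] = label
            changed += 1
    return changed


facet = word_facet(rows, "location")

# One call classifies every cemetery; another groups hotel/inn together.
cemeteries_changed = categorize(rows, "location", ["cemetery"], "cemetery")
lodging_changed = categorize(rows, "location", ["hotel", "inn"], "lodging")
```

In this sketch, `facet` plays the role of the word facet list in the sidebar, and each `categorize` call corresponds to selecting a facet and editing the type column for all matching rows at once.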