For this project, I’ll be using the dataset compiled by Jacob Metzer and Robert A. Margo entitled “Union Army Recruits in Black Regiments in the United States, 1862-1865.” This dataset records important information about these recruits, including their birthplace, place of enlistment, year of enlistment, regiment and company, and rank. It also records their pre-war occupation and skill level, which I argue had a notable impact on the character of their military service. Observed physical traits, such as complexion, hair color and eye color, are also recorded, but I’ve decided to omit this data as outside the scope of my project.
In terms of cleaning, I took the following actions:
- Removed the complexion, hair color and eye color columns, as physical traits have no bearing on this study
- Renamed “occup” to “Occupation Skill Level”
- Reoriented “OSL” next to occupation, for ease of reading
- Added a leading “0” to all entries in “enlistdate,” then manually removed it from dates in October, November and December, so that every entry has a two-digit month
- Split the now uniform column using field lengths of “2, 2, 1” into three columns (Month, Day, Year)
- Converted single-digit years to four-digit years (e.g. 1 = 1861)
- Combined the Month, Day and Year columns into a new uniform date column
- Used the codebook to cluster enlistment places by state. I chose to forgo county locations because such detail is not necessary for this study. (There are several discrepancies between the codebook and the dataset for this column, including entries with an incorrect number of digits and one entry with a state coded 9, which is absent from the codebook.)
- ***Important Note***: For states represented by a single digit (e.g. 2), I interpreted the data as likewise possessing only a single-digit state code (e.g. 2044 = 2 (state code) + 044 (three-digit county code))
- Removed “dateend1” and “enlistdate1,” as they duplicate information found in earlier columns
- Exported the now cleaned data to a .csv file.
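For anyone who would rather script the date and place-code steps above than do them by hand, here is a minimal Python sketch. It is only an illustration, not the process I actually used. The function names `parse_enlist_date` and `split_place_code` are hypothetical, and the code assumes that the last digit of an “enlistdate” entry is the final digit of an 1860s year (1 = 1861) and that the county code is always the last three digits of the place code. Entries with the wrong number of digits, like the discrepancies mentioned above, would still need manual review.

```python
import datetime
from typing import Tuple


def parse_enlist_date(raw: str) -> str:
    """Convert a compact enlistment code (MDDY or MMDDY) to an ISO date string.

    Assumption: the final digit is the last digit of an 1860s year (1 = 1861).
    """
    # Pad single-digit months with a leading zero so every code has 5 digits.
    digits = raw.strip().zfill(5)
    if len(digits) != 5 or not digits.isdigit():
        raise ValueError(f"unexpected enlistment code: {raw!r}")
    # Field lengths of 2, 2, 1: month, day, year digit.
    month = int(digits[0:2])
    day = int(digits[2:4])
    year = 1860 + int(digits[4])
    # datetime.date raises ValueError for impossible dates (e.g. month 13).
    return datetime.date(year, month, day).isoformat()


def split_place_code(raw: str) -> Tuple[str, str]:
    """Split a place code into (state code, county code).

    Assumption: the county code is the last three digits and the state code
    is whatever precedes it, so 2044 is state 2, county 044.
    """
    code = raw.strip()
    if len(code) < 4 or not code.isdigit():
        raise ValueError(f"unexpected place code: {raw!r}")
    return code[:-3], code[-3:]


if __name__ == "__main__":
    print(parse_enlist_date("3154"))   # 1864-03-15
    print(parse_enlist_date("10154"))  # 1864-10-15
    print(split_place_code("2044"))    # ('2', '044')
```

Raising an error on malformed codes, rather than guessing, keeps the questionable entries visible so they can be checked against the codebook.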
I created the following visualizations:
<script type='text/javascript'> var divElement = document.getElementById('viz1620324998164'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324923251'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324898778'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324862260'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324823404'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324797076'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324754916'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324722714'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>
<script type='text/javascript'> var divElement = document.getElementById('viz1620324568158'); var vizElement = divElement.getElementsByTagName('object')[0]; vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px'; var scriptElement = document.createElement('script'); scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js'; vizElement.parentNode.insertBefore(scriptElement, vizElement); </script>