Meeting 04 - Data Cleaning Basics
Meeting Goals
This course meeting emphasizes the following goals:
- Identify common map layout elements.
- Select the correct verbs to perform specific data cleaning tasks.
- Construct a data cleaning workflow to prepare a data set for mapping.
Meeting Resources
- An overview of the meeting can be found on the syllabus
- Prep videos and entry ticket are on Blackboard
- Meeting materials are available on GitHub in the module-2-data-cleaning repository
- You can follow along with the complete code for Exercise 2 here
- Lab-04 instructions are available here
Before Class
Tasks
Please complete the tasks listed on the syllabus, and see Blackboard for the entry ticket link.
During Class
Agenda
- Exercise 1 - Discussing Map Layout Elements
- Exercise 2 - Data Cleaning
- Break
- Lab-04 and one-on-one meetings
Exercise 1 - Discussing Map Layout Elements
To illustrate some of the elements from the cartography reading assigned for today, please look at this map of the Arctic Research and Policy Region created by the U.S. Geological Survey. Once you have reviewed the map, discuss the following prompts with your group:
- Review from last week: Which layers are in figure, and which layers are in ground?
- What data does this map layout emphasize? (see Brewer, p. 5)
- Which map layout elements are present here? (see Brewer, p. 4)
- What are some weaknesses of this map design?
Exercise 2 - Data Cleaning
To apply some of the verbs from our preparatory videos for this meeting, we are going to work on cleaning some data about Clean Water Act violations in Missouri. As we do this, we are going to follow the data wrangling workflow found here. We’ll focus on renaming variables, evaluating our data for missing values and duplicate observations, subsetting both columns and rows, and creating new variables.
The data and example notebook are available in this course meeting’s repository, named module-2-data-cleaning. You can find them in examples/meeting-04.Rmd. You can also follow along with the completed code here.
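The cleaning steps named in this exercise (renaming variables, checking for missing values and duplicate observations, subsetting columns and rows, and creating new variables) can be sketched roughly as follows. This is a minimal illustration, not the code from the example notebook: it assumes the dplyr package supplies the verbs, and the data frame cwa_raw, its column names, and its values are all made up for the sketch. The real Clean Water Act data will have different columns.

```r
library(dplyr)

# Hypothetical stand-in data; column names and values are invented for illustration
cwa_raw <- tibble(
  FACILITY_NM = c("Plant A", "Plant A", "Plant B", "Plant C"),
  CNTY        = c("Boone", "Boone", "Cole", NA),
  VIOL_CT     = c(3, 3, 0, 5),
  NOTES       = c("x", "x", "y", "z")
)

# 1. Rename variables to short, readable names (new name = old name)
cwa <- rename(cwa_raw, facility = FACILITY_NM, county = CNTY, viol_count = VIOL_CT)

# 2. Evaluate the data: count missing values per column and fully duplicated rows
missing_by_column <- colSums(is.na(cwa))
n_duplicates <- sum(duplicated(cwa))

# Drop the duplicated observations, keeping the first copy of each
cwa <- distinct(cwa)

# 3. Subset columns (keep only what is needed) and rows (drop missing counties)
cwa <- select(cwa, facility, county, viol_count)
cwa <- filter(cwa, !is.na(county))

# 4. Create a new variable flagging facilities with at least one violation
cwa <- mutate(cwa, any_violation = viol_count > 0)

print(missing_by_column)
print(cwa)
```

Checking for missing values and duplicates before subsetting means the counts describe the data as delivered, which makes it easier to document what was removed and why.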
After Class
The Lab-04 instructions are available in module-2-data-cleaning or can be viewed online here. You’ll need to create a new .Rproj for it in your assignments repository and then work to clean up some more Clean Water Act data on polluted rivers and streams in the St. Louis area. You won’t need to create a map this week; just focus on wrangling the data.
Meeting Reminders
Please don’t forget about what is due for next week, which is listed on the syllabus. In addition to Lab-04:
- the second final project waypoint is due in two weeks for all students
- the annotated bibliography is due if you are enrolled in SOC 5650