Meeting 04 - Data Cleaning Basics

Please submit Lab-03 on your own via GitHub Desktop!

Meeting Goals

This course meeting emphasizes the following goals:

  1. Identify common map layout elements.
  2. Select the correct verbs to perform specific data cleaning tasks.
  3. Construct a data cleaning workflow to prepare a data set for mapping.

Meeting Resources

  • An overview of the meeting can be found on the syllabus
  • Prep videos and the entry ticket are on Blackboard
  • Meeting materials are available on GitHub in the module-2-data-cleaning repository
  • You can follow along with the complete code for Exercise 2 here
  • Lab-04 instructions are available here

Before Class


Please complete the tasks listed on the syllabus, and see Blackboard for the entry ticket link.

During Class


  1. Exercise 1 - Discussing Map Layout Elements
  2. Exercise 2 - Data Cleaning
  3. Break
  4. Lab-04 and one-on-one meetings

Exercise 1 - Discussing Map Layout Elements

To illustrate some of the elements from the cartography reading assigned for today, please look at this map of the Arctic Research and Policy Region created by the U.S. Geological Survey. Once you have reviewed the map, discuss the following prompts with your group:

  1. Review from last week: Which layers are in figure, and which layers are in ground?
  2. What data does this map layout emphasize? (see Brewer, p. 5)
  3. Which map layout elements are present here? (see Brewer, p. 4)
  4. What are some weaknesses of this map design?

Exercise 2 - Data Cleaning

To apply some of the verbs from our preparatory videos for this meeting, we are going to work on cleaning some data about Clean Water Act violations in Missouri. As we do this, we are going to follow the data wrangling workflow found here. We’ll focus on renaming variables, evaluating our data for missing values and duplicate observations, subsetting both columns and rows, and creating new variables.

The data and example notebook are available in this course meeting’s repository, named module-2-data-cleaning. You can find them in examples/meeting-04.Rmd. You can also follow along with the completed code here.
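
As a rough illustration of the workflow described above, the sketch below runs each step on a tiny made-up data frame. It assumes the verbs in question are the ones from the dplyr package; the data, the column names (fac_ID, CNTY, viol_ct, notes), and the new variable any_violation are all hypothetical and are not taken from the actual Clean Water Act data or the course notebook.

```r
library(dplyr)

# Hypothetical toy data; values and column names are invented for illustration
violations <- data.frame(
  fac_ID  = c("A1", "A2", "A2", "A3", "A4"),
  CNTY    = c("Boone", "Cole", "Cole", NA, "Boone"),
  viol_ct = c(3, 0, 0, 5, 12),
  notes   = c("", "", "", "", "")
)

# 1. Rename variables so that the names are clear and consistent
renamed <- rename(violations, id = fac_ID, county = CNTY, count = viol_ct)

# 2. Evaluate the data: count missing values in each column ...
missing <- summarise(renamed, across(everything(), ~ sum(is.na(.x))))

# ... and count rows that are exact duplicates of another row
n_dupes <- nrow(renamed) - nrow(distinct(renamed))

# 3. Subset rows and columns, then create a new variable
cleaned <- renamed %>%
  distinct() %>%                      # drop duplicate observations
  filter(!is.na(county)) %>%          # keep rows with a known county
  select(id, county, count) %>%       # keep only the columns needed for mapping
  mutate(any_violation = count > 0)   # flag facilities with at least one violation

cleaned
```

Checking for missing values and duplicates before subsetting is a deliberate ordering choice in this sketch: it shows how many observations each later step will remove, which makes the final row count easier to explain.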

After Class

The Lab-04 instructions are available in module-2-data-cleaning or can be viewed online here. You’ll need to create a new .Rproj for it in your assignments repository and then work to clean up some more Clean Water Act data on polluted rivers and streams in the St. Louis area. You won’t need to create a map this week; just focus on wrangling the data.

Meeting Reminders

Please don’t forget about what is due for next week, which is listed on the syllabus. In addition to Lab-04: