1.5 Another Data Set

If the crime and non-emergency call data do not interest you, you may petition to choose a topic of your own provided you can find appropriate data.

1.5.1 Characteristics of an Appropriate Data Set

Data sets for your final project should have a number of salient characteristics:

  1. They should be spatial data.
  2. They should be numerous - there should be at least several hundred observations that vary over space and, ideally, over time as well. Several thousand observations would be ideal.
  3. They must be countable - the number of 9-1-1 calls, restaurants, or terrorist attacks, for example.
  4. They should be point data (or have the ability to be mapped as point data). Examples would be cities throughout a country or a region, restaurant locations across a city, state, or province, and crime scene locations.
  5. They should be able to be aggregated up to some sort of higher-order unit. Think about how restaurant locations could be aggregated up to neighborhoods, municipalities, or counties.
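
If you want a quick way to screen a candidate data set against the characteristics above, something like the following Python sketch may help. It is only an illustration, not a course requirement: the function name `check_point_data`, the record keys (`lat`, `lon`, `area`), and the cutoff of 300 observations (standing in for "several hundred") are all assumptions made for this example, and the sample records are made up.

```python
from collections import Counter

# Assumed stand-in for "at least several hundred observations".
MIN_OBSERVATIONS = 300


def check_point_data(records, min_observations=MIN_OBSERVATIONS):
    """Run simple screening checks on a list of point records.

    Each record is assumed to be a dict with 'lat', 'lon', and 'area' keys,
    where 'area' names the higher-order unit (e.g. a neighborhood).
    """
    # Keep only records with numeric coordinates in a plausible range.
    valid = [
        r for r in records
        if isinstance(r.get("lat"), (int, float))
        and isinstance(r.get("lon"), (int, float))
        and -90 <= r["lat"] <= 90
        and -180 <= r["lon"] <= 180
    ]
    # Count valid points per higher-order unit (the aggregation step).
    counts_by_area = Counter(
        r["area"] for r in valid if r.get("area") is not None
    )
    return {
        "n_observations": len(records),
        "n_valid_points": len(valid),
        "enough_observations": len(valid) >= min_observations,
        "counts_by_area": dict(counts_by_area),
    }


# Made-up sample records, for illustration only.
sample = [
    {"lat": 38.63, "lon": -90.20, "area": "Downtown"},
    {"lat": 38.64, "lon": -90.25, "area": "Midtown"},
    {"lat": 38.65, "lon": -90.26, "area": "Midtown"},
    {"lat": None, "lon": -90.30, "area": "Midtown"},  # missing coordinate
]
report = check_point_data(sample)
```

A report like this only covers the countable, point-based, and aggregation criteria. Whether the observations vary meaningfully over space and time is still a judgment call you will need to make by looking at the data.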

1.5.2 Other Considerations

There are a few other considerations to take into account. If you are a graduate student and have already identified a possible thesis topic, you may want to pick a data set that is either a possible candidate for inclusion in your thesis or, at the very least, is conceptually related. You want to maximize the impact that your coursework has, so even if you are not sure whether the data set itself will be helpful, picking something in the same topic area will mean that your literature search can be put to use on future assignments (such as in your Research Methods course).

1.5.3 Pandemic Considerations

I am normally flexible about accepting other data sets for the final project. However, because of the nature of the Spring 2021 semester, I am going to be more stringent in approving these. If you want to petition to use an outside data set, you may. However, you need to be prepared to do some of the legwork on your own without the benefit of in-person support from Chris.

1.5.4 Finding an Appropriate Data Set

In general, you are free to use any resource to identify a suitable data set that meets the above criteria, with a few caveats:

  1. There is not time for you to collect your own data.
  2. There is not time for you to go through the IRB process to gain access to confidential data (either data that is not publicly available or data collected by a thesis adviser or other faculty member).
  3. The data you use should be licensed for re-use (it cannot be proprietary or otherwise restrictively licensed).
  4. The data should be well documented - you want to be very sure what each variable represents. If there is no code book or documentation, the data set is probably not appropriate for this project. See Chris if you have questions about this.