✅ Put your name here
Introduction to Data Wrangling

Credits: xkcd.com
Learning goals for today’s pre-class assignment
Discuss the impact of data context
Experiment with some of the ways you can load data into a Jupyter notebook
Identify the arguments needed for different ways to load data
Practice loading a data file in need of wrangling
Assignment instructions
This assignment is due by 11:59 p.m. the day before class, and should be uploaded into the appropriate “Pre-Class Assignments” submission folder on Canvas. Submission instructions can be found at the end of the notebook.
1. Finding Data
Below are a few places (among many!) to find data sets.
✅ Task 1
Choose three data sets (without worrying about whether they are “good” or not) and download them. Feel free to choose all three from the same website or from three different websites. In the cell below, paste links to the files.
Sites:
LINKS HERE:
2. Loading and Contexting Data
Watch the video below to see some of the ways we can load data into Jupyter notebooks! Then answer the questions below.
from IPython.display import YouTubeVideo
YouTubeVideo("KxBgGdDP95Y", width=640, height=360)
✅ Question 2
In the cell below, include links to the package documentation for the different options mentioned in the video.
What are some of the common arguments used when loading in data files?
What are some common challenges you might run into when loading in data files? What challenges have you already run into?
✎ Put your answers here.
✅ Task 3
In previous assignments, we have loaded data that is ready to use with tools like numpy and pandas. In the cell below, choose one of your data sets and try to read it in with no additional arguments. Then answer the questions below.
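As a minimal sketch of what "no additional arguments" can look like, the example below reads a small semicolon-delimited table with pandas defaults. The table contents and column names are hypothetical and are held in memory with `io.StringIO` so the cell runs without a download; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical file contents (semicolon-delimited), standing in for a downloaded file.
raw_text = "plot;height_cm;yield_kg\nA1;102.5;3.4\nA2;98.0;2.9\nB1;110.2;3.1\n"

# Read with no additional arguments: pandas assumes commas separate the columns.
df_default = pd.read_csv(io.StringIO(raw_text))

# Inspect the result to judge whether it is in a usable form.
print(df_default.shape)
print(df_default.columns.tolist())
print(df_default.head())
```

Because this sample contains no commas, the default read puts every row into a single column. Checking the shape and column names is a quick way to spot that kind of problem.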
# put your code here
✅ Question 4
If your data was read in, what does it look like when you view it in the notebook? Is it in a usable form?
If your data was not read in, what errors do you see, and how might you address them?
✎ Put your answers here.
Let’s try again!
✅ Task 5
If your data was not read in properly above, try adjusting your arguments and see if you can get it loaded in.
If your data was read in properly, try an additional data set here!
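Adjusting arguments might look like the sketch below. It assumes a hypothetical file with two lines of notes above the header, semicolons between columns, and `-999` used as a missing-value marker; your own data will likely need different choices. The `sep`, `skiprows`, and `na_values` arguments are standard `pandas.read_csv` parameters.

```python
import io

import pandas as pd

# Hypothetical file contents: two note lines, then a semicolon-delimited table.
raw_text = (
    "Field trial export\n"
    "Notes: values of -999 mean no measurement\n"
    "plot;height_cm;yield_kg\n"
    "A1;102.5;3.4\n"
    "A2;98.0;2.9\n"
    "B1;110.2;-999\n"
)

df = pd.read_csv(
    io.StringIO(raw_text),
    sep=";",             # columns are separated by semicolons, not commas
    skiprows=2,          # skip the two note lines above the header row
    na_values=["-999"],  # treat the placeholder -999 as missing
)

# Confirm the columns, types, and missing values look sensible.
print(df.dtypes)
print(df)
```

Opening the raw file in a text editor first is often the quickest way to find out which of these arguments you need.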
# put your code here
✅ Question 6
What arguments did you use to read in your data?
What steps did you need to take to figure out how to read in your data?
✎ Put your answers here.
Contexting your Data Set
✅ Task 7
Choose one of the three data sets you identified, and do your best to answer the following questions for that data set:
Who collected/generated the data?
How was the data collected/generated?
Who/what is included in the data?
Who/what is not included in the data?
What are the limitations or biases of the data?
Note: If the information is not available to answer the questions above, that is your answer! You may need to do some additional searching about the data beyond the source you downloaded it from as well.
✎ Put your answer here.
Additional food for thought: Contexting our Discussion (optional)
We want to build some context for why data sources matter.
Read Chapter 6, “The Numbers Don’t Speak for Themselves,” of Data Feminism, located HERE, and answer the following questions in the cell below:
What are some ways we can identify the complexity of what data sets actually represent?
Go back and look at Figures 6.6 and 6.7. Tell us about your interpretation of that sequence of plots.
Identify some places in PLNT_SCI 2500 so far where data context was or would be impactful.
Thinking about your personal ethics as someone developing data analysis skills, what ideas would you want to carry forward from this reading into your approach to data analysis and visualization?
NOTE: While your personal views may not completely align with those of the authors, you should seek to glean insights that make your data analysis and visualization efforts more impactful, regardless of the context or application. By digesting multiple perspectives and approaches to data analysis and visualization, we can strive to make our work as high quality as possible.
✎ Put your answer here.
Congratulations, you’re done!
Submit this assignment by uploading it to the course Canvas web page. Go to the “Pre-class assignments” folder, find the appropriate submission folder link, and upload it there.
See you in class!
Material drawn with permission from:
© Copyright 2023. Department of Computational Mathematics, Science and Engineering at Michigan State University
Adapted for:
© Copyright 2026, Division of Plant Science & Technology—University of Missouri