Lab #2 Markdown File

In this lab we will practice using functions from the tidyverse collection of packages.

Be sure to do the required readings first. While many of the problems can be solved using approaches from the lecture videos, lab videos, or required readings, you may need to do some searching on the internet to solve some of the problems. Searching effectively is a valuable skill to build as you develop as a data scientist.

Submit this lab in both R Markdown (.Rmd) and knitted HTML (.html) formats. Start from the Lab markdown file (download here) and replace the author name with your own. The R Markdown file should include all of the code you used to generate your solutions, but the knitted HTML file should display ONLY the solutions to the assignment, NOT the code you used to produce them. Imagine this is a document you will submit to a supervisor or professor: they should be able to reproduce the code and analysis if needed, but otherwise only need to see the results and your write-up. The TA should be able to run the R Markdown file and generate exactly the same HTML file you submitted. Submit to your TA via direct message on Slack by the deadline indicated on the course website.

Required reading:

Optional resources:


Questions


1. Describe what the following tidyverse functions do. Also describe the pipe operator “%>%”. You may need to look up the official documentation for each of these.

filter: 
select: 
mutate: 
count: 
arrange: 
gather: 
pipe operator ("%>%"): 


2. Create a new dataframe that includes only senators and the columns gender, birthyear, and party. Then use that new dataframe to compute the number of male and female democrats and republicans (the output should be five rows corresponding to female democrats, male democrats, male independents, female republicans, and male republicans).

# your answer here


3. Identify the oldest and youngest male and female democratic senators using only R code.

# your answer here


4. Using mutate, create a new variable called age which represents the approximate age of each member of congress. How many democratic senators are over 60 years old?

Note: you can approximate age using the formula age = 2021-birthyear.
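
As a minimal sketch of the formula above, the snippet below applies it to a small made-up data frame. The data frame `toy` and its values are hypothetical and stand in for the lab dataset; only the column name `birthyear` comes from the lab.

```r
library(dplyr)

# Hypothetical toy data; these are NOT rows from the lab dataset.
toy <- data.frame(
  name      = c("A", "B", "C"),
  birthyear = c(1950, 1975, 1990)
)

# mutate() adds a new column computed from existing columns.
toy_with_age <- toy %>%
  mutate(age = 2021 - birthyear)

print(toy_with_age)
```

The same pattern should carry over to the lab data, after which the new column can be used in later steps such as filtering.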

# your answer here


5. Create a new column that indicates whether or not the member of congress is more than 55 years old, and create a single dataframe showing the number of male and female members of congress that are over and under 55.

Note: the dataframe should have four rows: number of females over 55, number of males over 55, number of females under 55, number of males under 55.

# your answer here


6. Using gather, create a new dataframe where each row corresponds to a valid twitter, facebook, or youtube social media account, then compute the total number of accounts for each political party. Then do the same with pivot_longer.

Note: not every congress member has an account on all three platforms, so be sure to filter.

Note: you may need to look up the documentation for pivot_longer.
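
To show how the two functions relate, here is a rough sketch on a made-up wide data frame. The data frame `wide`, its column names, and its values are assumptions for illustration only; the lab dataset has its own columns, and filtering out missing accounts is left to you.

```r
library(tidyr)

# Hypothetical wide data: one row per person, one column per platform.
# NA marks a missing account.
wide <- data.frame(
  person  = c("A", "B"),
  twitter = c("a_tw", NA),
  youtube = c("a_yt", "b_yt"),
  stringsAsFactors = FALSE
)

# gather(): the older interface; key and value name the two new columns.
long_gather <- gather(wide, key = "platform", value = "account", twitter, youtube)

# pivot_longer(): the newer interface; names_to and values_to play the same roles.
long_pivot <- pivot_longer(
  wide,
  cols      = c(twitter, youtube),
  names_to  = "platform",
  values_to = "account"
)

print(long_gather)
print(long_pivot)
```

Both results have one row per person and platform, including the row with the missing account, although the row order differs between the two functions.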

# your answer here


Below, I define two vectors for you, corresponding to policies that US states have adopted in response to COVID-19: restrictions on travel (recorded May 20, 2020) and requirements that citizens wear masks in public (recorded August 17, 2020).

travel_restrictions <- c("WA", "OR", "NV", "CA", "NM", "MN", "IL", "OH", "MI", "PA", "VA", "NY", "MA", "VH", "ME", "DE", "MD", "NJ")

require_masks <- c("HI", "WA", "OR", "NV", "CA", "MT", "CO", "NM", "KS", "TX", "MN", "AR", "LA", "WI", "IL", "AL", "MI", "IN", "OH", "KY", "WV", "NC", "VA", "DC", "DE", "PA", "NY", "VT", "NH", "MA", "RI", "CT", "ME")


7. Write code to print only the states that implemented both travel restrictions and mask requirements:

# your answer here


8. Write code to print the states that had implemented mask requirements but not travel restrictions:

# your answer here
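
Questions 7 and 8 both come down to comparing two character vectors. One possible approach, sketched here on made-up vectors rather than the state lists, uses base R set operations. The vectors `a` and `b` are hypothetical, and this is not the only way to solve the problems.

```r
# Hypothetical toy vectors; these are NOT the lab's state lists.
a <- c("x", "y", "z")
b <- c("y", "z", "w")

both   <- intersect(a, b)  # elements found in both a and b
only_a <- setdiff(a, b)    # elements of a that are not in b
in_b   <- a[a %in% b]      # %in% returns a logical vector usable for subsetting

print(both)
print(only_a)
```

Note that `setdiff` is not symmetric: `setdiff(a, b)` and `setdiff(b, a)` generally return different results, so the argument order matters.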

9. Describe two broad topics you might be interested in exploring for your final project. How would you use data science to gain insight about these topics? We won’t require you to stick with these topics - we just want to see you brainstorming about what you might be interested in.

your written response here