Lab #1 Markdown File

In this lab we will be learning the basics of using R and working with real data.

Be sure to do the required readings first. While many of the problems can be solved using approaches from the lecture videos, lab videos, or required readings, you may need to do some searching on the internet to solve some of the problems. This will be a valuable skill to learn as you develop your data science skills.

Submit this lab in two formats: the R Markdown (.Rmd) file and the knitted HTML (.html) web page. Start from the Lab markdown file (download here) and replace the author name with your own. The R Markdown file should include all of the code you used to generate your solutions, but the knitted HTML file should display ONLY the solutions to the assignment, NOT the code you used to produce them. Imagine this is a document you will submit to a supervisor or professor: they should be able to reproduce the code and analysis if needed, but otherwise only need to see the results and your write-up. The TA should be able to run the R Markdown file and generate exactly the same HTML file you submitted. Submit to your TA via direct message on Slack by the deadline indicated on the course website.
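One common way to keep code out of the knitted HTML is to switch off code echoing for every chunk. The sketch below assumes the lab file is knitted with knitr's standard chunk options and that the line is placed in a setup chunk near the top of the .Rmd file; the lab template may already handle this differently.

```r
# Place inside a setup chunk near the top of the .Rmd file.
# echo = FALSE hides the source code of every chunk in the knitted output,
# while the results of running that code are still displayed.
knitr::opts_chunk$set(echo = FALSE)
```

All of the code stays in the .Rmd file, so the analysis remains reproducible; only the rendered HTML omits it.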

Required reading:

Optional resources:


Questions


1. Use the documentation website or R's built-in help (?function_name) to look up the mean function, then describe in your own words each of its arguments and what it does.

Write your answer here.
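As a rough illustration of looking up a function, the sketch below inspects median (deliberately not mean, so the question is left to you). The variable name median_args is only for illustration.

```r
# In an interactive session, either of these opens the help page:
#   ?median
#   help("median")
if (interactive()) help("median")

# args() prints the function's argument list with its default values
args(median)

# formals() returns the arguments as a named list; names() keeps just the names
median_args <- names(formals(median))
print(median_args)
```

The help page explains what each argument means; args() and formals() only show which arguments exist and their defaults.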


2. Extract the third element of the vector random_vector using numerical indexing.

random_vector <- c("R","is","great")
# your code here
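For reference, here is a minimal sketch of numerical indexing on a made-up vector. The names example_vector and second_element are hypothetical and not part of the lab data.

```r
# Hypothetical example vector (not part of the lab data)
example_vector <- c("alpha", "beta", "gamma", "delta")

# R indexes from 1, so [2] selects the second element
second_element <- example_vector[2]
print(second_element)
```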


3. Use R code to identify the data type of some_vector. What is the largest number in this vector? How about the mean value? Your results should be generated from the R code - please do not write the numbers into R manually.

Hint: you'll have to research some new functions in R to do this. Try Google or one of the tutorials linked on the syllabus.

some_vector <- c(25555,342343,123123123,4234234,53243234,54324234,5421111,12312312,111231,
                     1231231,12312312,12312312,123123,898972,789872,2343,23423423,2343221,23423,
                     14444,44324222,2342341,124231111,22233345,1111233333,1231231,1231231)
# your code here


4. Use the congress dataframe for the remaining questions in this assignment. How many rows and columns does it have? Use a function to show its data type. You must use R code to generate these values.

# your code here
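A small sketch of inspecting a dataframe's shape and type, using mtcars (a dataset that ships with R) as a stand-in because the congress data is not reproduced here. The variable names are illustrative only.

```r
# mtcars is built into R; it stands in for the congress dataframe here
n_rows <- nrow(mtcars)          # number of rows
n_cols <- ncol(mtcars)          # number of columns
dims <- dim(mtcars)             # both at once: c(rows, columns)
object_class <- class(mtcars)   # the type of the object itself

print(dims)
print(object_class)
```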


5. What is the average age of all congress members? What is the data type of the birthyear column?

# your code here


6. How much older is Sherrod Brown (a member of congress) than the average member of congress? How about Dianne Feinstein?

Note: you are allowed to use hard-coded numerical subscripts for this problem. For future labs, you will not be allowed to do this.

# your code here
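To show what a hard-coded numerical subscript looks like, the sketch below compares one row of the built-in mtcars dataset with the column average. It uses mtcars rather than congress, and the row number 1 is an arbitrary choice for illustration.

```r
# Select the mpg value in row 1 using a hard-coded numerical row subscript
first_car_mpg <- mtcars[1, "mpg"]

# Average of the whole column
average_mpg <- mean(mtcars$mpg)

# How far the selected row sits above (positive) or below (negative) the average
difference <- first_car_mpg - average_mpg
print(difference)
```

Hard-coding the row number is fragile because it breaks if the rows are reordered, which is why later labs do not allow it.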


7. Who are the oldest members of congress?

Note: you may use the browsing features of RStudio to identify the oldest members of congress. In the future, you will not be allowed to do this.

Your written answer here