# Individual Lab 3
This lab will review the following concepts:
- Reading in a data set
  - Make sure the datasets library is installed and loaded.
  - We will be using the “iris” data set.
- Using ggplot to create graphs by:
  - Changing aesthetics
  - Adding labels
  - Using facets
- Pulling variables from a data set
- Finding the maximum and minimum values of a variable
- Finding the mean and median of the variables
- Comparing values from the species to the entire data set
- Turning a data frame into a tibble
Step 1 : Install / load the “tidyverse” package
Step 2 : Install / load the “datasets” package. Examine the package and determine how many datasets are in the package.
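For Steps 1 and 2, one possible starting point is sketched below. It assumes the tidyverse is already installed (the install line is left commented out). It counts the rows of the index returned by `data(package = "datasets")`; a few index entries describe more than one object, so treat the number as a count of listed entries. The names `dataset_index` and `n_datasets` are placeholders chosen for this sketch.

```r
# Step 1: install once if needed, then load the tidyverse
# install.packages("tidyverse")
library(tidyverse)

# Step 2: datasets ships with base R, so it only needs loading
library(datasets)

# data() returns an index of the data sets in a package;
# its `results` component is a matrix with one row per listed entry
dataset_index <- data(package = "datasets")$results
n_datasets <- nrow(dataset_index)
n_datasets

# Browse the names and titles of the first few entries
head(dataset_index[, c("Item", "Title")])
```

Running `help(package = "datasets")` is another way to examine the package interactively.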
# Response
Step 3 : Copy the dataset “iris” into the variable “iris_data”. Print out iris_data to make sure it is correct.
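A minimal sketch of Step 3 follows. The extra `dim()` and `str()` calls are not required by the step; they are added here only as a quick way to check the copy.

```r
library(datasets)  # iris is part of the datasets package

# Copy the built-in data set into a new variable
iris_data <- iris

# Print the whole data set (150 rows), then check its shape and column types
print(iris_data)
dim(iris_data)
str(iris_data)
```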
# Response
We will now start to create a plot one step at a time :
Step 4 : Create a graph using Sepal Length as the x axis and Sepal Width as the y axis, but no points are plotted.
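One way to approach Step 4 is to call `ggplot()` with the data and an aesthetic mapping but no geom layer, which draws the axes and grid without any points. The object name `p_empty` is a placeholder for this sketch.

```r
library(ggplot2)  # loaded automatically by library(tidyverse)

iris_data <- datasets::iris

# Map the axes only; with no geom layer, nothing is plotted on the panel
p_empty <- ggplot(iris_data, aes(x = Sepal.Length, y = Sepal.Width))
p_empty
```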
# Response
Step 5 : Add the data points using basic dots.
# Response
Step 6 : Differentiate the dots by coloring them by their species.
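Steps 5 and 6 can be sketched as follows: `geom_point()` adds the dots, and mapping `color` to `Species` inside `aes()` gives each species its own color. The names `p_points` and `p_color` are placeholders for this sketch.

```r
library(ggplot2)

iris_data <- datasets::iris

# Step 5: add a point layer to the empty plot
p_points <- ggplot(iris_data, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()
p_points

# Step 6: map color to Species so each species gets its own color
p_color <- ggplot(iris_data,
                  aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()
p_color
```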
# Response
We need to start adding labels.
Step 7 : Add the title “Sepal Length vs Sepal Width”
# Response
Step 8 : Add a subtitle “Wowsers”
# Response
Step 9 : Add better looking labels for the x and y axes. Rename the x axis “Sepal Length” and the y axis “Sepal Width”
# Response
Step 10 : Add a caption at the bottom of the graph that says “Source : Iris Data Set”
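The labels in Steps 7 through 10 can all be set with a single `labs()` call, as in the sketch below. The lab builds them up one step at a time, so each argument could just as well be added in its own step. The name `p_labeled` is a placeholder.

```r
library(ggplot2)

iris_data <- datasets::iris

p_labeled <- ggplot(iris_data,
                    aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  labs(
    title    = "Sepal Length vs Sepal Width",  # Step 7
    subtitle = "Wowsers",                      # Step 8
    x        = "Sepal Length",                 # Step 9
    y        = "Sepal Width",                  # Step 9
    caption  = "Source : Iris Data Set"        # Step 10
  )
p_labeled
```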
# Response
Step 11 : Change the shape of the dots by species
# Response
Step 12 : Change the size of the dots by Petal Length
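Steps 11 and 12 add two more aesthetic mappings. A sketch, assuming the color mapping from Step 6 is kept and using the placeholder name `p_shape_size`:

```r
library(ggplot2)

iris_data <- datasets::iris

p_shape_size <- ggplot(iris_data,
                       aes(x = Sepal.Length, y = Sepal.Width,
                           color = Species,
                           shape = Species,        # Step 11: one shape per species
                           size  = Petal.Length)) + # Step 12: bigger dot, longer petal
  geom_point()
p_shape_size
```

Because `Petal.Length` is continuous, ggplot2 draws a size legend with a few representative values rather than one entry per flower.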
# Response
Step 13 : Create a facet_grid for each species using Species ~ Petal.Length
# Response
Step 14 : Create a facet wrap. Describe the patterns you see. Is it random? Do the facets follow any kind of a pattern? Linear? Quadratic?
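The two faceting steps might look like the sketch below. Step 14 does not say which variable to wrap by, so wrapping by `Species` is an assumption made here. Note that `Species ~ Petal.Length` produces one column for every distinct petal length, so the grid from Step 13 is very wide and each panel holds only a few points. The names `p_base`, `p_grid`, and `p_wrap` are placeholders.

```r
library(ggplot2)

iris_data <- datasets::iris

p_base <- ggplot(iris_data,
                 aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

# Step 13: rows are species, columns are the distinct Petal.Length values
p_grid <- p_base + facet_grid(Species ~ Petal.Length)
p_grid

# Step 14: one panel per species (the faceting variable is an assumption)
p_wrap <- p_base + facet_wrap(~ Species)
p_wrap
```

With one panel per species, it is easier to judge whether sepal width rises with sepal length within each species or whether the points look scattered.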
# Response
Step 15 : Let’s describe each species by finding the maximum and minimum values of Sepal Length, Sepal Width, Petal Length, and Petal Width, along with the mean and median of each variable for each species. Then compare the species means to the means of the entire data set.
Find the mean and median of the variables using two different techniques :
- Pulling out the data from the data set, storing it in a different variable name, and then using the mean command.
Ex : name_1 <- dataset[,3]
name_2 <- mean(name_1)
# Response
- Use the mean and median command directly on the variable in the data set.
Ex : name_3 <- mean(dataset$variable_name)
# Response
- Compare the means of the species to the mean of the entire data set. Which variables had values above the mean? Which had values below the mean?
For example, was the mean Sepal Length of the species virginica larger or smaller than the mean Sepal Length of the entire data set?
# Response
- Determine the maximum and minimum values of each of the four variables for each of the three species.
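The pieces of Step 15 can be sketched as follows. The first two parts follow the two techniques given in the examples above. The per-species summary uses dplyr's `group_by()` and `summarise()`, which is one convenient option and not the only valid one. All object names (`petal_length`, `species_summary`, `overall_means`, `above_overall`) are placeholders chosen for this sketch.

```r
library(dplyr)  # loaded automatically by library(tidyverse)

iris_data <- datasets::iris

# Technique 1: pull a column out, store it, then summarise it.
# [, 3] returns a plain vector here because iris_data is a data frame;
# on a tibble it would return a one-column tibble, so use [[3]] or pull() there.
petal_length <- iris_data[, 3]
petal_length_mean <- mean(petal_length)
petal_length_median <- median(petal_length)

# Technique 2: summarise the column directly with $
sepal_length_mean <- mean(iris_data$Sepal.Length)
sepal_length_median <- median(iris_data$Sepal.Length)

# Mean, median, minimum, and maximum of every numeric variable, per species.
# Result columns are named like Sepal.Length_mean, Sepal.Length_max, and so on.
species_summary <- iris_data %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric),
                   list(mean = mean, median = median, min = min, max = max)))
species_summary

# Means of the four measurements over the entire data set
overall_means <- colMeans(iris_data[, 1:4])
overall_means

# TRUE where a species mean is above the overall mean of the same variable
above_overall <- species_summary %>%
  select(Species, ends_with("_mean")) %>%
  mutate(across(ends_with("_mean"),
                ~ .x > overall_means[[sub("_mean$", "", cur_column())]]))
above_overall
```

In `above_overall`, each row is a species and each column answers the "above or below the overall mean" question for one variable.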
# Response
Turn the iris_data data frame into a tibble. Verify your result.
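A minimal sketch of the conversion, using `as_tibble()` and two simple checks. Storing the result under the new name `iris_tibble` is a choice made for this sketch; overwriting `iris_data` would work equally well.

```r
library(tibble)  # loaded automatically by library(tidyverse)

iris_data <- datasets::iris

# Convert the data frame to a tibble
iris_tibble <- as_tibble(iris_data)

# Verify: is_tibble() should be TRUE, and the class should include "tbl_df"
is_tibble(iris_tibble)
class(iris_tibble)

# Printing a tibble shows only the first rows plus the column types
iris_tibble
```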
# Response