Description
First question: Provide at least the following in the report for full credit:
(1) Understanding the Data:
- The structure of the data and a preview of the data.
- Frequency distribution: frequency tables and plots (bar plots/histograms) for each variable in the dataset. Make sure to capture the skewness and kurtosis. Provide an interpretation in one paragraph (no more than 300 words) explaining the distribution of the data.
- Summary statistics of the data, including at least the mean, quartiles, min/max, and standard deviation.
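The items above could be covered along the following lines. This is only a minimal sketch: the question does not name a dataset, so `mtcars` is used as a stand-in, and `skewness()` and `excess_kurtosis()` are hypothetical helpers written in base R (moment-based formulas) rather than functions from a particular package.

```r
# Stand-in data: the question does not name a dataset, so mtcars is an assumption.
df <- mtcars

# Structure and preview of the data
str(df)
head(df)

# Hypothetical helpers: moment-based skewness and excess kurtosis (base R only)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Frequency table and bar plot for a discrete variable
cyl_freq <- table(df$cyl)
print(cyl_freq)
barplot(cyl_freq, main = "Cars by number of cylinders",
        xlab = "Cylinders", ylab = "Frequency")

# Histogram for a continuous variable
hist(df$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")

# Summary statistics: mean, quartiles, min/max from summary(); SD and shape added
summary(df)
stats <- data.frame(
  sd              = sapply(df, sd),
  skewness        = sapply(df, skewness),
  excess_kurtosis = sapply(df, excess_kurtosis)
)
print(round(stats, 2))
```

Repeating the `table()`/`barplot()` or `hist()` step for each remaining column gives one plot per variable; positive skewness values indicate a longer right tail.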
Second question:
Using the mtcars dataset, demonstrate the skills you have learned so far in class and submit an R Markdown (Word doc) report including the following:
- Develop a hypothesis
  - What is your hypothesis?
  - Which columns are IVs (independent variables)?
  - Which columns are DVs (dependent variables)?
  - Which columns can be ignored (and why)?
- Check for Errors & Missing Data
- Clean the data
  - How did you deal with NAs?
  - How did you deal with outliers?
- Check Assumptions using Parametric Tests
  - Additivity
  - Linearity
  - Normality
  - Homogeneity, Homoscedasticity
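One way the data-screening steps above might look in R is sketched below. Everything specific here is an assumption made for illustration: the hypothesis (heavier, more powerful cars get lower mpg), the choice of `mpg` as DV with `wt` and `hp` as IVs, listwise deletion for NAs, and the Mahalanobis distance cutoff at p < .001 for outliers. Other choices are equally defensible.

```r
# Assumed hypothesis (illustrative only): heavier, more powerful cars get lower mpg.
dat <- mtcars[, c("mpg", "wt", "hp")]   # DV: mpg; IVs: wt, hp; other columns ignored

# Errors and missing data
summary(dat)                            # scan for impossible values
na_counts <- colSums(is.na(dat))
print(na_counts)
dat <- na.omit(dat)                     # listwise deletion; a no-op when there are no NAs

# Multivariate outliers: Mahalanobis distance with a chi-square cutoff (assumed p < .001)
mahal  <- mahalanobis(dat, colMeans(dat), cov(dat))
cutoff <- qchisq(1 - 0.001, df = ncol(dat))
clean  <- dat[mahal < cutoff, ]

# Additivity: the IVs should not be too highly correlated with each other
print(cor(clean[, c("wt", "hp")]))

# Fit the model once and reuse its residuals for the remaining checks
model     <- lm(mpg ~ wt + hp, data = clean)
std_resid <- rstandard(model)
fitted_z  <- as.numeric(scale(fitted(model)))

# Linearity: standardized residuals should follow the reference line
qqnorm(std_resid)
abline(0, 1)

# Normality: residuals should look roughly bell-shaped
hist(std_resid, main = "Standardized residuals", xlab = "Standardized residual")
print(shapiro.test(std_resid))

# Homogeneity / homoscedasticity: an even spread around zero across fitted values
plot(fitted_z, std_resid,
     xlab = "Standardized fitted values", ylab = "Standardized residuals")
abline(h = 0)
abline(v = 0)
```

The plots are meant to be read visually; the report should state what each one shows and whether the assumption looks reasonable.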
Third question: Create a bar graph using the attached Iris dataset (iris CSV file).
Compare the Sepal Length of the flower Species. Include the following:
- Main Title
- X and Y-Axis Labels
- Colors by Species
- Provide an interpretation in one paragraph (no more than 300 words) explaining the distribution of the data.
Which species has the greatest Sepal.Length?
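A bar graph meeting these requirements could be sketched as follows. The attached CSV is not available here, so R's built-in `iris` data frame is used as a stand-in; the file name in the comment, the choice to plot the mean per species, and the colors are all assumptions.

```r
# Stand-in for the attached file; with the CSV this might be
# flowers <- read.csv("iris.csv")   # hypothetical file name
flowers <- iris

# Mean sepal length per species (assumes columns Sepal.Length and Species)
mean_sepal <- tapply(flowers$Sepal.Length, flowers$Species, mean)
print(round(mean_sepal, 2))

# Bar graph with a main title, axis labels, and one color per species
barplot(mean_sepal,
        main = "Mean Sepal Length by Species",
        xlab = "Species",
        ylab = "Mean sepal length (cm)",
        col  = c("steelblue", "darkorange", "forestgreen"))

# Species with the greatest mean sepal length
longest <- names(which.max(mean_sepal))
print(longest)
```

Plotting means is one reading of "compare"; medians or a grouped summary would also answer the question, as long as the interpretation paragraph says which statistic was used.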