Description
This project is about overfitting and is based on Chapter 6, ‘Statistical Machine Learning’, of ‘Practical Statistics for Data Scientists’.
Files needed:
P4p4F1.pdf
P4p4F2.pdf
P4p4F3.pdf
Cover the following in the project:
- Explain the data from figure P4p4F1.pdf.
- Explain the differences between parts (a) and (b) of figure P4p4F2.pdf.
- Try to recreate the data from figure P4p4F1.pdf as closely as possible in R or Octave. The functions needed are runif (R) or rand (Octave) for the uniform distribution, and rnorm (R) or randn (Octave) for the normal distribution.
- Explain how you can recreate P4p4F1.pdf.
- Compare my P4p4F3.pdf with the figure you created and discuss the differences.
- Based on P4p4F3.pdf, or on the data you created, explain how you would build a decision tree to classify ‘+’ and ‘o’, similar to the left tree in P4p2F2.pdf.
- In your opinion, why is it practical or useful to simulate data for classification?
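
As a starting point for the simulation and decision-tree items above, here is a minimal R sketch of how two labelled point clouds could be generated with runif and rnorm and then separated by a single split. Everything numeric in it is an assumption made for illustration: the number of points, the ranges, the noise level, and the threshold are not read from P4p4F1.pdf or P4p4F3.pdf and would need to be tuned by eye against the figures.

```r
set.seed(42)  # fixed seed so the simulated data are reproducible

n <- 50  # points per class (assumed, not taken from the figure)

# Class '+': positions drawn uniformly over an assumed region,
# with normal noise added to blur the edges of the region.
plus <- data.frame(
  x = runif(n, min = 0, max = 5) + rnorm(n, mean = 0, sd = 0.5),
  y = runif(n, min = 0, max = 5) + rnorm(n, mean = 0, sd = 0.5),
  class = "+"
)

# Class 'o': same recipe over a second assumed region that
# partly overlaps the first, so the classes are not perfectly separable.
circ <- data.frame(
  x = runif(n, min = 4, max = 9) + rnorm(n, mean = 0, sd = 0.5),
  y = runif(n, min = 4, max = 9) + rnorm(n, mean = 0, sd = 0.5),
  class = "o"
)

sim <- rbind(plus, circ)

# pch = 3 draws a '+', pch = 1 draws an open circle
plot(sim$x, sim$y,
     pch = ifelse(sim$class == "+", 3, 1),
     xlab = "x", ylab = "y")

# A one-level "tree": a single split on x at an assumed threshold.
threshold <- 4.5
pred <- ifelse(sim$x < threshold, "+", "o")

# Share of simulated points that the single split labels correctly
accuracy <- mean(pred == sim$class)
print(accuracy)
```

Adding more splits (for example on y within each branch) raises the accuracy on these particular points, and comparing a shallow and a deep set of splits on freshly simulated data is one way to discuss overfitting.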