Welcome to the companion website for the book Multivariate Statistics and Machine Learning in R For Beginners, where you will find the associated videos and R code. At the bottom of this page, you will also find corrections for errors identified since the book was published.
Chapter 1 A brief introduction to machine learning and multivariate statistics
- An introduction to Machine Learning and Multivariate Statistics – video
- An introduction to R – video
Chapter 2 Matrix Algebra
R scripts: Chapter 2
- Matrices and matrix operations, part 1 – video
- Matrices and matrix operations, part 2 – video
- Eigenvectors and eigenvalues – video
- Eigenvectors and eigenvalues – the math (optional) – video
Chapter 3 Managing data in R
R scripts: Chapter 3
Chapter 4 Graphical illustration of multivariate data
R scripts: Chapter 4
Chapter 5 Multivariate Relationships
R scripts: Chapter 5
Chapter 6 PCA and PCoA
R scripts: Chapter 6
- PCA: the basics – video
- PCA – the math (optional) – video
- PCA – the math (optional) – updated video that also explains SVD (NEW)
- PCA: standardization and how to extract components – video
- PCA: how to interpret the weights/loadings and Varimax rotation – video
- PCoA and classical multidimensional scaling – video
Chapter 7 Linear discriminant analysis
R scripts: Chapter 7
Chapter 8 Distances in space
R scripts: Chapter 8
Chapter 9 Multivariate statistical tests
R scripts: Chapter 9
- MANOVA – video
- Hotelling’s T-square – video
- PERMANOVA – video
- Canonical correlation analysis (CCA) – video
Chapter 10 Classification and performance metrics
R scripts: Chapter 10
- Sensitivity and specificity – video
- The positive and negative predictive values – video
- The likelihood ratio – video
- ROC curve – video
- Validation techniques – video
Chapter 11 Supervised Machine Learning
R scripts: Chapter 11
- LDA – how to use it as a classifier – video
- Logistic regression: the basics – video
- Logistic regression: how to use it as a classifier – video
- Decision trees – video
- Random forest – video
- k-nearest neighbors – video
- Gaussian naive Bayes – video
Chapter 12 Clustering
R scripts: Chapter 12
Chapter 13 PCR, PLS and Lasso regression
R scripts: Chapter 13
- Principal component regression – video
- Bootstrap confidence intervals – video
- Partial least squares regression – video
- Lasso regression – video
Chapter 14 Case studies
R scripts: Chapter 14
Paper 1
Dataset: Cytokines
Paper 2
- Kaplan-Meier analysis – video
- The log-rank test – video
- Linear mixed-effects model, part 1 – video
- Linear mixed-effects model, part 2 – video
- Cox proportional hazards model – video
Chapter 15 Answers to the exercises
R scripts: Exercises
Errors identified in the book since it was published
LOOCV in the package caret
Page 195
fit=train(Species ~ ., data=iris, method="lda",
trControl = trainControl(method = "LOOCV"))
pred= predict(fit,dimen=1)
Tab=table(pred, iris$Species)
Correct
fit=train(Species ~ ., data=iris, method="lda",
trControl = trainControl(method = "LOOCV", savePredictions = "final"))
Tab=table(fit$pred$pred, fit$pred$obs)
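The reason for this correction is that calling predict() on the fitted object scores the final model, which was trained on all rows, so the resulting table reflects resubstitution rather than leave-one-out performance. The base-R sketch below illustrates the same distinction without caret. It assumes the MASS package is installed, and the object names are illustrative rather than taken from the book.

```r
library(MASS)  # provides lda(); assumed to be installed

# Resubstitution: the model is fitted on all 150 rows and then
# asked to predict those same rows.
fit_all <- lda(Species ~ ., data = iris)
pred_resub <- predict(fit_all)$class
acc_resub <- mean(pred_resub == iris$Species)

# Leave-one-out: with CV = TRUE, lda() returns, for each row, the class
# predicted by a model fitted without that row.
fit_loo <- lda(Species ~ ., data = iris, CV = TRUE)
pred_loo <- fit_loo$class
acc_loo <- mean(pred_loo == iris$Species)

tab_loo <- table(pred_loo, iris$Species)
print(tab_loo)
cat("Resubstitution accuracy:", acc_resub, "\n")
cat("LOOCV accuracy:", acc_loo, "\n")
```

Only the second table is built from predictions for rows the model did not see during fitting, which is what savePredictions = "final" makes available in caret.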
Page 210
fit = train(Species ~ ., data=df_iris, method="glm",
family="binomial",
trControl = trainControl(method = "LOOCV"))
pred= predict(fit)
Tab=table(pred, df_iris$Species)
Tab
pred versicolor virginica
versicolor 49 1
virginica 1 49
sum(diag(Tab))/sum(Tab)
0.98
Correct
fit = train(Species ~ ., data=df_iris, method="glm",
family="binomial",
trControl = trainControl(method = "LOOCV", savePredictions = "final"))
Tab=table(fit$pred$pred, fit$pred$obs)
Tab
versicolor virginica
versicolor 48 1
virginica 2 49
sum(diag(Tab))/sum(Tab)
0.97
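To see what the held-out predictions represent, the leave-one-out loop can also be written by hand with glm(). This is a minimal sketch and not the book's code. It assumes df_iris contains only the versicolor and virginica rows of iris (as the tables above suggest), and it uses a 0.5 probability cut-off, which is also an assumption.

```r
# Assumption: df_iris holds only the two species shown in the tables above.
df_iris <- droplevels(subset(iris, Species != "setosa"))

n <- nrow(df_iris)
classes <- levels(df_iris$Species)  # "versicolor", "virginica"
pred_loo <- character(n)

for (i in seq_len(n)) {
  # Fit the logistic regression on all rows except row i.
  fit_i <- glm(Species ~ ., data = df_iris[-i, ], family = binomial)
  # Predicted probability of the second factor level for the held-out row.
  p_i <- predict(fit_i, newdata = df_iris[i, , drop = FALSE],
                 type = "response")
  pred_loo[i] <- if (p_i > 0.5) classes[2] else classes[1]
}

pred_loo <- factor(pred_loo, levels = classes)
tab_loo <- table(pred_loo, df_iris$Species)
print(tab_loo)
print(sum(diag(tab_loo)) / sum(tab_loo))
```

Each entry of pred_loo comes from a model that never saw the corresponding row, so the accuracy printed at the end is a cross-validated estimate rather than a training-set one.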