Principal Component Analysis in Regression

Making Predictions

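We cannot obtain principal components for the test data by running PCA on it separately, because that would produce a different set of components from the ones the model was trained on. Instead, we have to apply to the test data the same transformation (centering, scaling if it was used, and rotation) that was learned from the training data.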

wineTest = read.csv("https://storage.googleapis.com/dimensionless/Analytics/wine_test.csv")
wineTest
## Year Price WinterRain AGST HarvestRain Age FrancePop
## 1 1979 6.9541 717 16.1667 122 4 54835.83
## 2 1980 6.4979 578 16.0000 74 3 55110.24
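# Project the test predictors (columns 3 to 7: WinterRain to FrancePop) onto the components learned from the training data
pca_test <- predict(pca, wineTest[, 3:7])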
class(pca_test)
## [1] "matrix"
pca_test
## PC1 PC2 PC3 PC4 PC5
## [1,] 2.303725 0.5946824 0.4101509 -0.3722356 -0.2074747
## [2,] 2.398317 0.2242893 0.8925278 0.7329912 -0.2649691
# Converting to data frame
pca_test<-as.data.frame(pca_test)
pca_test
## PC1 PC2 PC3 PC4 PC5
## 1 2.303725 0.5946824 0.4101509 -0.3722356 -0.2074747
## 2 2.398317 0.2242893 0.8925278 0.7329912 -0.2649691
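
To see what predict() does with a PCA object, the following is a minimal sketch on R's built-in mtcars data rather than the wine data. It assumes the PCA was fitted with prcomp(); the names pca_demo, train_demo and test_demo, the choice of columns, and the use of scale. = TRUE are illustrative assumptions, not taken from the analysis above.

```r
# Illustrative split of a built-in dataset (not the wine data)
train_demo <- mtcars[1:25, c("mpg", "disp", "hp", "wt")]
test_demo  <- mtcars[26:32, c("mpg", "disp", "hp", "wt")]

# Fit PCA on the training rows only (scaling is an assumption of this sketch)
pca_demo <- prcomp(train_demo, center = TRUE, scale. = TRUE)

# predict() reuses the training center, scale and rotation
by_predict <- predict(pca_demo, newdata = test_demo)

# The same projection by hand: standardize the test rows with the
# training statistics, then multiply by the training rotation matrix
by_hand <- scale(test_demo,
                 center = pca_demo$center,
                 scale  = pca_demo$scale) %*% pca_demo$rotation

# Compare the two results, ignoring attributes such as dimnames
all.equal(by_predict, by_hand, check.attributes = FALSE)
```

The point of the sketch is that nothing is re-estimated from the test rows: the means, standard deviations and loadings all come from the training fit.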

Predicting on the transformed test data

# Predict with the regression model, using the test principal components as the new data
pred_pca <- predict(object = model, newdata = pca_test)
pred_pca
## 1 2 
## 6.796398 6.720412
# Actual prices in the test set, for comparison with the predictions
wineTest$Price
## [1] 6.9541 6.4979
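
One way to put a number on how close the predictions are is to compute the errors directly. The sketch below copies the two predicted and two actual prices from the output above into vectors (the names predicted, actual, errors, rmse and mae are illustrative) and computes the root mean squared error and mean absolute error. With only two test observations, these figures are a rough indication at best.

```r
# Values copied from the printed output above
predicted <- c(6.796398, 6.720412)
actual    <- c(6.9541, 6.4979)

# Per-observation prediction errors
errors <- actual - predicted

# Root mean squared error and mean absolute error on the test set
rmse <- sqrt(mean(errors^2))
mae  <- mean(abs(errors))

rmse
mae
```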