Boosting is one of the most powerful techniques in supervised learning. rtemis allows you to easily apply boosting to any learner for regression using boost().
Let’s create some synthetic data:
```r
set.seed(2018)
x <- rnormmat(500, 50)
colnames(x) <- paste0("Feature", 1:50)
w <- rnorm(50)
y <- x %*% w + rnorm(500)
dat <- data.frame(x, y)
res <- resample(dat, seed = 2018)
```
01-07-24 00:31:47 Input contains more than one columns; will stratify on last [resample]
.:Resampling Parameters
n.resamples: 10
resampler: strat.sub
stratify.var: y
train.p: 0.75
strat.n.bins: 4
01-07-24 00:31:47 Created 10 stratified subsamples [resample]
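The boost log below was produced by a call to boost() whose code chunk did not survive in this copy. A minimal reconstruction is sketched here; the subsample accessor (res$Subsample_1) and the exact argument names are assumptions inferred from the parameter printout that follows, not verbatim from the original:

```r
# Hedged sketch: train boosted CART stumps on the first training subsample.
# mod = "cart", max.iter = 50, and learning.rate = 0.1 match the parameter
# printout below; res$Subsample_1 is an assumed accessor for the resample indices.
dat.train <- dat[res$Subsample_1, ]
boost.cart <- boost(x = dat.train[, -ncol(dat.train)],
                    y = dat.train[, ncol(dat.train)],
                    mod = "cart",
                    max.iter = 50,
                    learning.rate = 0.1)
```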
01-07-24 00:31:47 Hello, egenn [boost]
.:Regression Input Summary
Training features: 374 x 50
Training outcome: 374 x 1
Testing features: Not available
Testing outcome: Not available
.:Parameters
mod: CART
mod.params:
maxdepth: 1
init: -0.182762669446564
max.iter: 50
learning.rate: 0.1
tolerance: 0
tolerance.valid: 1e-05
01-07-24 00:31:47 [ Boosting Classification and Regression Trees... ] [boost]
01-07-24 00:31:47 Iteration #5: Training MSE = 49.08; Validation MSE = 52.02 [boost]
01-07-24 00:31:48 Iteration #10: Training MSE = 45.91; Validation MSE = 49.65 [boost]
01-07-24 00:31:48 Iteration #15: Training MSE = 43.30; Validation MSE = 47.54 [boost]
01-07-24 00:31:48 Iteration #20: Training MSE = 40.92; Validation MSE = 45.75 [boost]
01-07-24 00:31:48 Iteration #25: Training MSE = 38.78; Validation MSE = 44.10 [boost]
01-07-24 00:31:48 Iteration #30: Training MSE = 36.85; Validation MSE = 42.97 [boost]
01-07-24 00:31:48 Iteration #35: Training MSE = 35.08; Validation MSE = 41.76 [boost]
01-07-24 00:31:48 Iteration #40: Training MSE = 33.45; Validation MSE = 40.69 [boost]
01-07-24 00:31:48 Iteration #45: Training MSE = 31.93; Validation MSE = 39.59 [boost]
01-07-24 00:31:48 Iteration #50: Training MSE = 30.53; Validation MSE = 38.68 [boost]
01-07-24 00:31:48 Reached max iterations [boost]
.:Regression Training Summary
MSE = 30.53 (42.94%)
RMSE = 5.53 (24.46%)
MAE = 4.36 (23.22%)
r = 0.82 (p = 1.3e-91)
R sq = 0.43
01-07-24 00:31:48 Completed in 0.01 minutes (Real: 0.64; User: 0.58; System: 0.06) [boost]
Notice that the validation error is considerably higher than the training error, and its curve is also less smooth.
14.2 Boost CART stumps: step slower
To get better results out of boosting, it usually helps to decrease the learning rate and increase the number of steps. From an optimization point of view, a lower learning rate does not simply mean taking more, smaller steps instead of fewer, bigger ones; it makes the algorithm follow a different, more precise optimization path.
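For instance, the run above could be repeated with a smaller learning rate and a proportionally larger iteration budget. This is a sketch, not verbatim code from the original: the subsample accessor (res$Subsample_1) and the specific values 0.025 and 200 are illustrative assumptions; only mod, learning.rate, and max.iter are taken from the log above.

```r
# Hedged sketch: same boosted CART stumps as above, but stepping more slowly.
# dat and res come from the setup chunk earlier in this section.
dat.train <- dat[res$Subsample_1, ]  # assumed accessor for the resample indices
boost.slow <- boost(x = dat.train[, -ncol(dat.train)],
                    y = dat.train[, ncol(dat.train)],
                    mod = "cart",
                    learning.rate = 0.025,  # lower than the 0.1 used above
                    max.iter = 200)         # more steps than the 50 above
```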