12  Experiment Logging

12.1 Log to directory

You can save any rtemis supervised learning model to disk by specifying an output directory using the outdir argument:

iris.cart <- s_CART(iris,
                    outdir = "./Results/iris_CART")

This will save:

  • A text .log file with the console output
  • A PDF with a True vs. Fitted plot for Regression, or a confusion matrix for Classification
  • An RDS file with the trained model (i.e. the R6 object iris.cart in the above example)

The RDS files can be shared with others and loaded back into R at any time.
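
As a minimal sketch of the round trip, the following uses only base R. A plain lm() fit stands in for an rtemis model, and the directory and file name (iris_demo, demo_model.rds) are hypothetical; the name rtemis gives its RDS file is not assumed here, which is why the file is located with list.files() rather than by name.

```r
# Sketch only: a base R lm() fit stands in for an rtemis model, and the
# directory and file name below are hypothetical.
outdir <- file.path(tempdir(), "iris_demo")
dir.create(outdir, recursive = TRUE, showWarnings = FALSE)

fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
saveRDS(fit, file.path(outdir, "demo_model.rds"))

# Locate the RDS file(s) in the directory without assuming their names
rds_files <- list.files(outdir, pattern = "\\.rds$",
                        ignore.case = TRUE, full.names = TRUE)

# Read the first one back into R
fit_restored <- readRDS(rds_files[1])

# The restored object carries the same coefficients as the original
all.equal(coef(fit), coef(fit_restored))
```

The same list.files() and readRDS() pattern should apply to a directory written via outdir, although loading an rtemis R6 model will likely require the rtemis package to be installed in the receiving session.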

When running a series of experiments, use the outdir argument to save each model to disk for later reference.

12.2 Interactive logging

Specifying an outdir, as above, is the main way to save models to disk. In practice, we often train a series of models interactively and would like to keep track of what we have tried and how it worked out. rtemis includes rtModLogger to help with that. You first create a new logger object. Think of it as a container that holds model parameters and error metrics, not the models themselves. Once the logger is created, you can add any number of models to it.

First, create some synthetic data:

x <- rnormmat(400, 400, seed = 2019)
w <- rnorm(400)
y <- c(x %*% w + rnorm(400))

dat <- data.frame(x, y)
res <- resample(dat)
06-30-24 10:57:30 Input contains more than one columns; will stratify on last [resample]
.:Resampling Parameters
    n.resamples: 10 
      resampler: strat.sub 
   stratify.var: y 
        train.p: 0.75 
   strat.n.bins: 4 
06-30-24 10:57:30 Created 10 stratified subsamples [resample]

dat.train <- dat[res$Subsample_1, ]
dat.test <- dat[-res$Subsample_1, ]

Initialize a new logger object:

logger <- rtModLogger$new()
logger
.:.:rtemis Supervised Model Logger

   Contents: no models yet 

12.2.1 Train some models and add them to the logger:

mod.ridge <- s_GLMNET(dat.train, dat.test,
                      alpha = 0, lambda = .01, verbose = FALSE)
logger$add(mod.ridge)
06-30-24 10:57:30 Added 1 model to logger; 1 total [logger$add]

mod.lasso <- s_GLMNET(dat.train, dat.test,
                      alpha = 1, lambda = .01, verbose = FALSE)
logger$add(mod.lasso)
06-30-24 10:57:30 Added 1 model to logger; 2 total [logger$add]

mod.elnet <- s_GLMNET(dat.train, dat.test,
                      alpha = .5, lambda = .01, verbose = FALSE)
logger$add(mod.elnet)
06-30-24 10:57:30 Added 1 model to logger; 3 total [logger$add]

12.2.2 Plot model performance:

logger$plot(names = c("Ridge", "LASSO", "Elastic Net"))

12.2.3 Get a quick summary:

results <- logger$summary()
results
         Train Rsq  Test Rsq
GLMNET_1 0.9999773 0.5522008
GLMNET_2 0.9998142 0.7465513
GLMNET_3 0.9999467 0.7420425
attr(,"metric")
[1] "Rsq"
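
The summary can be used to rank models programmatically. The sketch below rebuilds the printed output by hand so that it runs without rtemis; it assumes the object returned by summary() is a numeric matrix with a "metric" attribute, which is inferred from the printout rather than from the rtemis documentation.

```r
# Values copied from the summary printed above. The matrix structure is an
# assumption based on how the object prints.
results <- matrix(
  c(0.9999773, 0.9998142, 0.9999467,
    0.5522008, 0.7465513, 0.7420425),
  ncol = 2,
  dimnames = list(c("GLMNET_1", "GLMNET_2", "GLMNET_3"),
                  c("Train Rsq", "Test Rsq"))
)
attr(results, "metric") <- "Rsq"

# Name of the model with the highest test-set R-squared
best <- rownames(results)[which.max(results[, "Test Rsq"])]
best

# Gap between training and test R-squared; a larger gap suggests
# more overfitting
gap <- results[, "Train Rsq"] - results[, "Test Rsq"]
sort(gap, decreasing = TRUE)
```

Here the ridge model (GLMNET_1) fits the training set almost perfectly but has the lowest test R-squared of the three, which is what one would expect with 400 features and light regularization.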

12.2.4 Write model hyperparameters and performance to a multi-sheet XLSX file:

logger$tabulate(filename = "./Results/model_metrics.xlsx")

In this example, the XLSX file will contain three sheets, one per model. Calling tabulate without a filename returns the same tables as a named list, with one element per model:

tbl <- logger$tabulate()
tbl
$GLMNET_1
  ModelName lambda alpha  Train.MAE  Train.MSE Train.RMSE Train.NRMSE
1    GLMNET   0.01     0 0.07712676 0.01000251  0.1000125 0.000816399
  Train.MAE.EXP Train.MAE.RED Train.MSE.EXP Train.MSE.RED Train.RMSE.EXP
1      16.86324     0.9954263      440.4743     0.9999773       20.98748
  Train.RMSE.RED   Train.r Train.r.p Train.SSE Train.SSR Train.SST Train.Rsq
1      0.9952347 0.9999887         0  2.990749  131644.9  131701.8 0.9999773
  Train.stderr Test.MAE Test.MSE Test.RMSE Test.NRMSE Test.MAE.EXP Test.MAE.RED
1    0.1000125  10.6674 182.2739  13.50089  0.1483299     15.84375    0.3267122
  Test.MSE.EXP Test.MSE.RED Test.RMSE.EXP Test.RMSE.RED    Test.r     Test.r.p
1     406.6002    0.5517121      20.16433     0.3304569 0.7991484 1.306608e-23
  Test.SSE Test.SSR Test.SST  Test.Rsq Test.stderr
1 18409.67 49026.38 41111.44 0.5522008    13.50089

$GLMNET_2
  ModelName lambda alpha Train.MAE  Train.MSE Train.RMSE Train.NRMSE
1    GLMNET   0.01     1 0.2330074 0.08185309  0.2860998 0.002335423
  Train.MAE.EXP Train.MAE.RED Train.MSE.EXP Train.MSE.RED Train.RMSE.EXP
1      16.86324     0.9861825      440.4743     0.9998142       20.98748
  Train.RMSE.RED  Train.r Train.r.p Train.SSE Train.SSR Train.SST Train.Rsq
1      0.9863681 0.999923         0  24.47407  130198.3  131701.8 0.9998142
  Train.stderr Test.MAE Test.MSE Test.RMSE Test.NRMSE Test.MAE.EXP Test.MAE.RED
1    0.2860998 7.997417 103.0594  10.15182  0.1115348     15.84375    0.4952319
  Test.MSE.EXP Test.MSE.RED Test.RMSE.EXP Test.RMSE.RED    Test.r     Test.r.p
1     406.6002    0.7465338      20.16433     0.4965457 0.8717328 1.945751e-32
  Test.SSE Test.SSR Test.SST  Test.Rsq Test.stderr
1    10409 40018.68 41069.47 0.7465513    10.15182

$GLMNET_3
  ModelName lambda alpha Train.MAE  Train.MSE Train.RMSE Train.NRMSE
1    GLMNET   0.01   0.5 0.1243415 0.02348304  0.1532418 0.001250907
  Train.MAE.EXP Train.MAE.RED Train.MSE.EXP Train.MSE.RED Train.RMSE.EXP
1      16.86324     0.9926265      440.4743     0.9999467       20.98748
  Train.RMSE.RED   Train.r Train.r.p Train.SSE Train.SSR Train.SST Train.Rsq
1      0.9926984 0.9999776         0  7.021428  130928.9  131701.8 0.9999467
  Train.stderr Test.MAE Test.MSE Test.RMSE Test.NRMSE Test.MAE.EXP Test.MAE.RED
1    0.1532418 8.200421 104.9249  10.24329  0.1125397     15.84375     0.482419
  Test.MSE.EXP Test.MSE.RED Test.RMSE.EXP Test.RMSE.RED    Test.r     Test.r.p
1     406.6002    0.7419457      20.16433     0.4920096 0.8713408 2.240629e-32
  Test.SSE Test.SSR Test.SST  Test.Rsq Test.stderr
1 10597.42 41196.62 41082.03 0.7420425    10.24329
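
Because each element of the list is a one-row table with the same columns, the elements can be stacked into a single table for side-by-side comparison. The following is a sketch in base R, assuming each element behaves like a data frame; tbl is rebuilt here with a handful of columns copied from the output above so that the example runs on its own.

```r
# Stand-in for the list returned by tabulate(), keeping only a few of the
# columns shown above. Values are copied from the printed output.
tbl <- list(
  GLMNET_1 = data.frame(ModelName = "GLMNET", lambda = 0.01, alpha = 0,
                        Train.Rsq = 0.9999773, Test.Rsq = 0.5522008),
  GLMNET_2 = data.frame(ModelName = "GLMNET", lambda = 0.01, alpha = 1,
                        Train.Rsq = 0.9998142, Test.Rsq = 0.7465513),
  GLMNET_3 = data.frame(ModelName = "GLMNET", lambda = 0.01, alpha = 0.5,
                        Train.Rsq = 0.9999467, Test.Rsq = 0.7420425)
)

# Stack the one-row tables and label each row with its list name
combined <- do.call(rbind, tbl)
rownames(combined) <- names(tbl)

# Order models from best to worst test-set R-squared
combined <- combined[order(combined$Test.Rsq, decreasing = TRUE), ]
combined
```

This only works directly when all models share the same hyperparameter columns, as the three GLMNET models do here; tables from different algorithms would need their columns aligned first.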