11  Predict

Figure 11.1: The Predict module makes advanced machine learning available in a few clicks

The Predict module allows training of cross-validated and optionally tuned regression and classification models.

11.1 Resampling

The “Outer Resampling” settings control the training-test splits. These control the arguments passed to the rtemis resample() function.

  • Resampling method
    • Stratified subsampling
    • Stratified bootstrap
    • K-fold
    • Bootstrap
    • Leave-one-out
  • Seed: for reproducibility, if you set a seed here, all train-test resamples will be the same between runs, e.g. this allows direct comparison of models trained with different algorithms

11.2 Algorithm

Available algorithms:

Algorithm-specific options appear once an Algorithm has been selected. Tooltips explain each hyperparameter.

11.3 Hyperparameter tuning

The Predict module uses the rtemis elevate() function to perform automatic nested resampling, which means:

  • Splitting full sample into multiple training & testing subsets
  • Splitting each training sample into training & validation subsets to perform hyperparameter tuning (model selection)

A musical note in front of an input box means the hyperparameter is tunable. Automatic hyperparameter tuning will be performed if more than one value is entered.

For example, if you have selected Gradient Boosting as the learning algorithm, you can input “2, 3” in “Max depth”. Internal 5-fold cross-validation of each training set will be performed, the best overall performing combination of hyperparameters will be chosen, and a model will be retrained on the full training set using the best hyperparameter combination.