import rtemis as rt
▄▄▄ ▄▄▄▄▄▄▄▄ .• ▌ ▄ ·. ▪ .▄▄ ·
▀▄ █·•██ ▀▄.▀··██ ▐███▪██ ▐█ ▀.
▐▀▀▄ ▐█.▪▐▀▀▪▄▐█ ▌▐▌▐█·▐█·▄▀▀▀█▄
▐█•█▌ ▐█▌·▐█▄▄▌██ ██▌▐█▌▐█▌▐█▄▪▐█
.▀ ▀ ▀▀▀ ▀▀▀ ▀▀ █▪▀▀▀▀▀▀ ▀▀▀▀ py
.:rtemispy v.0.2.0 🏝 macOS-13.4-arm64-arm-64bit
Load the sonar data set from the UCI repository (downloaded locally):
dat = rt.read("~/Data/Sonar.csv")
06-20-23 17:47:04 ▶ Reading Sonar.csv... [read]
06-20-23 17:47:04 Got 208 rows & 61 columns [read]
06-20-23 17:47:04 Read in 0.0133 seconds [read]
rt.check_data(dat)
DataFrame with 208 rows x 61 columns
Data types
60 float columns.
0 integer columns.
1 character column.
0 categorical columns.
Issues
0 constant columns.
0 duplicated rows.
0 missing values total.
Recommendations
Everything looks good.
There are 60 continuous features and 1 character column. We want to convert the character variable to a categorical. We can either re-load the data using the argument string2cat=True or use the preprocess function with the same argument.
dat = rt.preprocess(dat, string2cat=True)
06-20-23 17:47:04 Converting string columns to categorical [preprocess]
rt.check_data(dat)
DataFrame with 208 rows x 61 columns
Data types
60 float columns.
0 integer columns.
0 character columns.
1 categorical column.
Issues
0 constant columns.
0 duplicated rows.
0 missing values total.
Recommendations
Everything looks good.
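For readers who think in pandas terms, the string-to-categorical conversion can be approximated as below. This is a minimal sketch, not the implementation behind preprocess: the helper name strings_to_categorical and the toy data are made up for illustration, and only standard pandas calls are used.

```python
import pandas as pd
from pandas.api import types as pdt


def strings_to_categorical(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with text columns converted to categorical.

    Hypothetical helper for illustration; it is not part of rtemis.
    Note that any object-dtype column is converted, even if it holds
    values other than strings.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        # Text columns are stored as object dtype or as a pandas string dtype.
        if pdt.is_object_dtype(col) or isinstance(col.dtype, pd.StringDtype):
            out[name] = col.astype("category")
    return out


# Toy data (made up): two float features and one text outcome column.
toy = pd.DataFrame(
    {
        "V1": [0.02, 0.05, 0.03],
        "V2": [0.04, 0.01, 0.08],
        "Class": ["R", "M", "R"],
    }
)
converted = strings_to_categorical(toy)
print(converted.dtypes)
```

The float columns are left untouched, and the original data frame is not modified because the helper works on a copy.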
Create resamples using resample():
res = rt.resample(dat, seed=2023)
06-20-23 17:47:04 Created 10 stratified subsamples [resample]
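To make "stratified subsamples" concrete, here is a standard-library sketch of the general idea: for each resample, draw the same fraction of rows from every class, so that class proportions in the training set match those of the full data. The function name, the 0.75 training fraction, and the toy labels are assumptions for illustration; the resample function may differ in its defaults and details.

```python
import random
from collections import defaultdict


def stratified_subsamples(labels, n_resamples=10, train_p=0.75, seed=2023):
    """Return n_resamples lists of training row indices, stratified by label.

    Hypothetical helper for illustration; train_p=0.75 is an assumed value.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    resamples = []
    for _ in range(n_resamples):
        train_idx = []
        for indices in by_class.values():
            # Sample the same fraction of rows from each class, without replacement.
            n_train = round(train_p * len(indices))
            train_idx.extend(rng.sample(indices, n_train))
        resamples.append(sorted(train_idx))
    return resamples


# Toy labels (made up): 12 rows of class "M" and 8 rows of class "R".
labels = ["M"] * 12 + ["R"] * 8
resamples = stratified_subsamples(labels)

# The rows not selected for training form the matching test set.
train_idx = resamples[0]
train_set = set(train_idx)
test_idx = [i for i in range(len(labels)) if i not in train_set]
```

Each element of resamples holds the training indices for one resample, so taking the first element and its complement gives a single train/test split.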
Split the data into training and testing sets using split_train_test():
dat_train, dat_test = rt.split_train_test(dat, res[0])
Train a LightGBM classifier using s_LightGBM():
sonar_lgbm = rt.s_LightGBM(dat_train, dat_test)
06-20-23 17:47:04 Welcome, egenn 🌉 [s_LightGBM]
Input data summary:
│ Training: 155 x 61
└─ Testing: 53 x 61
Outcome: Class
06-20-23 17:47:04 Tuning LightGBM by grid search... [gridsearch]
06-20-23 17:47:04 Created 5 bootstraps [resample]
06-20-23 17:47:04 Grid search: Running 5 combinations [gridsearch]
06-20-23 17:47:05 Completed in 0.925 seconds [gridsearch]
06-20-23 17:47:05 Best LightGBM hyperparameters: [s_LightGBM]
{'max_nrounds': 231, 'num_leaves': 16, 'learning_rate': 0.01, 'lambda_l1': 0.0, 'lambda_l2': 0.0}
06-20-23 17:47:05 Training LightGBM with tuned hyperparameters [s_LightGBM]
[100] training's binary_logloss: 0.39129
[200] training's binary_logloss: 0.247466
Classification was performed using LightGBM.
│ Training Balanced Accuracy was 0.99.
└─ Testing Balanced Accuracy was 0.85.
06-20-23 17:47:05 Training complete. in 1.11 seconds [s_LightGBM]
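The summary above reports Balanced Accuracy. Under its standard definition, this is the mean of the per-class recalls, so each class counts equally regardless of how many rows it has. The sketch below assumes that standard definition; the function and the toy labels are made up for illustration and are not taken from rtemis.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (standard definition of balanced accuracy)."""
    total = Counter()    # number of true rows per class
    correct = Counter()  # number of correctly predicted rows per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy example (made up): recall is 3/4 for "M" and 2/2 for "R".
y_true = ["M", "M", "M", "M", "R", "R"]
y_pred = ["M", "M", "M", "R", "R", "R"]
score = balanced_accuracy(y_true, y_pred)
print(score)  # 0.875
```

Plain accuracy on the same toy labels is 5/6, which shows how the two metrics diverge when classes are imbalanced.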