install.packages("remotes")
remotes::install_github("egenn/rtemis")
1 Setup
1.1 Install the latest rtemis version from GitHub
You can run the install_github() command as often as you like: it will only reinstall rtemis if an update is available on GitHub. It installs rtemis with a minimal set of dependencies. A dependency check runs each time a function is called and informs you if a required package is missing. To begin with a reasonably lightweight setup, install the following packages:
packages <- c("data.table", "future", "gbm", "glmnet", "plyr", "ranger", "rpart")
install.packages(packages)
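If some of these packages are already installed, it can be convenient to install only the ones that are missing. The following is a minimal sketch using base R only; `find_missing()` is a hypothetical helper written for this example and is not part of rtemis. Note that `requireNamespace()` loads the namespace of each package it finds.

```r
# Hypothetical helper (not part of rtemis): return the packages in `pkgs`
# whose namespaces cannot be loaded from the current library paths.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

packages <- c("data.table", "future", "gbm", "glmnet", "plyr", "ranger", "rpart")
to_install <- find_missing(packages)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to install only the missing packages:
  # install.packages(to_install)
} else {
  message("All listed packages are already installed.")
}
```

The install call is left commented out so that the check can be run without side effects.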
1.2 R
For an introduction to R, see Programming for Data Science in R.
1.3 IDEs: VS Code, RStudio
You can run rtemis in the command line or using the IDE of your choice. VS Code and RStudio are probably the two best options right now.
1.4 macOS
1.4.1 Prerequisites
If you are installing on macOS, make sure you have installed:
Note on R + Java on macOS: In order to run some R packages that use rJava, like bartMachine, you may need to add a link to libjvm.dylib inside your R lib folder as explained here
1.5 External frameworks
The following are all optional - install as needed.
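To see which of these optional frameworks already have their R interface installed, a quick check along the following lines may help. This is a sketch that assumes the R interface packages are named `h2o`, `sparklyr`, and `keras`; it only tests whether the R package can be loaded, not whether the underlying backend (H2O, Spark, or TensorFlow) is installed and working.

```r
# R interface packages for the optional frameworks (names assumed as above).
optional <- c(H2O = "h2o", Spark = "sparklyr", Keras = "keras")

# TRUE where the R package namespace can be loaded, FALSE otherwise.
available <- vapply(optional, requireNamespace, logical(1), quietly = TRUE)
print(available)
```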
1.5.1 H2O
To use H2O (d.H2OGLRM(), s.H2ODL.R(), s.H2OGBM.R(), s.H2ORF(), u.H2OKMEANS()), you will need to install H2O first. Follow instructions on the H2O website.
1.5.2 Spark
To use Spark’s ML framework (currently s.MLRF()), installation can be performed within R:
install.packages("sparklyr")
sparklyr::spark_install()
1.5.3 Keras + TensorFlow
You can easily install Keras for R and the TensorFlow library:
remotes::install_github("rstudio/keras")
library(keras)
install_keras()
Learn more on the RStudio website
1.6 Load rtemis
library(rtemis)
1.7 Set up project directories
rtemis includes a function and RStudio addin to initialize a simple directory structure under the working directory for your data analysis projects with the following:
- ./R/: Directory to save your project .R code files
- ./Data/: Directory to save your project data files, e.g. .rds, .csv, etc.
- ./Results/: Directory to save your output, e.g. rtemis supervised learning output directories (define using outdir, e.g. outdir = "./Results/Dataset_Algorithm")
- ./rtInit.log: Log file with R session info
Call the function directly or use RStudio’s Addins drop-down menu:
rtInitProjectDir()
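For readers who want to see what such a layout looks like on disk, the structure described above can be approximated with base R. This is an illustrative sketch only and not the implementation of rtInitProjectDir(); the function `init_project_sketch()` and the `demo_project` folder are hypothetical names used for this example, and a temporary directory is used so nothing is written to your working directory.

```r
# Illustrative only: NOT the implementation of rtInitProjectDir().
# Creates R/, Data/, Results/ and an rtInit.log file under `root`.
init_project_sketch <- function(root = ".") {
  dirs <- file.path(root, c("R", "Data", "Results"))
  for (d in dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  # Save R session info to a log file.
  log_file <- file.path(root, "rtInit.log")
  writeLines(capture.output(sessionInfo()), log_file)
  invisible(c(dirs, log_file))
}

# Try it in a temporary location.
root <- file.path(tempdir(), "demo_project")
init_project_sketch(root)
list.files(root)

# Build an output path of the form used for the outdir argument.
outdir <- file.path(root, "Results", "Dataset_Algorithm")
```

Building paths with `file.path()` rather than pasting strings keeps the sketch portable across operating systems.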