Use select_decom() to get a listing of available decomposition algorithms:
select_decom()
.:select_decom
rtemis supports the following decomposition algorithms:
Name     Description
H2OAE    H2O Autoencoder
H2OGLRM  H2O Generalized Low-Rank Model
ICA      Independent Component Analysis
Isomap   Isomap
KPCA     Kernel Principal Component Analysis
LLE      Locally Linear Embedding
MDS      Multidimensional Scaling
NMF      Non-negative Matrix Factorization
PCA      Principal Component Analysis
SPCA     Sparse Principal Component Analysis
SVD      Singular Value Decomposition
TSNE     t-distributed Stochastic Neighbor Embedding
UMAP     Uniform Manifold Approximation and Projection
We can further divide decomposition algorithms into linear (e.g. PCA, ICA, NMF) and nonlinear dimensionality reduction (also called manifold learning, e.g. LLE and t-SNE).
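To make the "linear" part concrete, here is a minimal sketch in base R. It uses prcomp() as a stand-in for a linear decomposition (an assumption made for illustration; it is not an rtemis call) and shows that the projection is just a matrix product of the centered data with a loadings matrix. Nonlinear methods have no such single matrix.

```r
# Base R only: prcomp() stands in for a linear decomposition method.
m <- as.matrix(iris[, 1:4])
pca <- prcomp(m, center = TRUE, scale. = FALSE)

# A linear method projects by matrix multiplication:
# scores = (data - column means) %*% loadings
m_centered <- sweep(m, 2, pca$center)        # subtract the column means
scores_manual <- m_centered %*% pca$rotation

# The manual projection matches the scores that prcomp() returns
all.equal(unname(scores_manual), unname(pca$x))
```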
9.0.1 Linear Dimensionality Reduction
As a simple example, let’s look at the famous iris dataset. Note that we use it only to demonstrate usage; it is not a good example for assessing the effectiveness of decomposition algorithms, because the iris dataset consists of only 4 variables.
First, we select all variables from the iris dataset, excluding the group names, i.e. the labels. Since the iris dataset includes one duplicate observation, we remove it using preprocess(). This is required for t-SNE to work.
x <- preprocess(iris[, 1:4], removeDuplicates = TRUE)
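The duplicate can also be located with base R alone. The sketch below is independent of preprocess() and assumes that duplicate removal keeps the first occurrence of each repeated row, which is what duplicated() does. The names features, dup, labels_unique and features_unique are introduced here for illustration only. Building the label vector with the same filter keeps it aligned with the de-duplicated rows.

```r
features <- iris[, 1:4]

# TRUE for every row that repeats an earlier row
dup <- duplicated(features)
sum(dup)     # number of duplicated observations
which(dup)   # their row indices

# Apply the same filter to the features and to the labels,
# so that both have the same length and order
features_unique <- features[!dup, ]
labels_unique <- iris$Species[!dup]
nrow(features_unique) == length(labels_unique)
```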
Now, let’s try a few different algorithms, projecting to two dimensions and visualizing with mplot3_xy(). Notice that we use the real labels to color the points in these examples:
9.0.1.1 Principal Component Analysis (PCA)
iris.PCA <- d_PCA(x)
01-07-24 00:23:44 Hello, egenn [d_PCA]
01-07-24 00:23:44 ||| Input has dimensions 149 rows by 4 columns, [d_PCA]
01-07-24 00:23:44 interpreted as 149 cases with 4 features. [d_PCA]
01-07-24 00:23:44 Performing Principal Component Analysis... [d_PCA]
01-07-24 00:23:44 Completed in 5e-05 minutes (Real: 3e-03; User: 3e-03; System: 0.00) [d_PCA]
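# group: labels of the 149 de-duplicated rows (assumes preprocess() kept first occurrences)
mplot3_xy(iris.PCA$projections.train[, 1], iris.PCA$projections.train[, 2],
          group = iris$Species[!duplicated(iris[, 1:4])],
          main = "PCA on iris", xlab = "1st PCA component", ylab = "2nd PCA component")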
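Because iris has only 4 variables, two components already capture most of its variance, which is why it is a weak benchmark for decomposition. A minimal check with base R's prcomp() follows; whether d_PCA() centers or scales the data the same way is an assumption not verified here, so the exact proportions may differ from those of iris.PCA.

```r
# Base R sketch: proportion of variance captured by each principal component
pca_check <- prcomp(unique(iris[, 1:4]), center = TRUE, scale. = FALSE)

var_explained <- pca_check$sdev^2 / sum(pca_check$sdev^2)

# Cumulative proportion for 1, 2, 3 and 4 components
round(cumsum(var_explained), 3)
```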
9.0.1.2 Independent Component Analysis (ICA)
iris.ICA <- d_ICA(x, k = 2)
01-07-24 00:23:44 Hello, egenn [d_ICA]
01-07-24 00:23:44 ||| Input has dimensions 149 rows by 4 columns, [d_ICA]
01-07-24 00:23:44 interpreted as 149 cases with 4 features. [d_ICA]
01-07-24 00:23:44 Running Independent Component Analysis... [d_ICA]
01-07-24 00:23:44 Completed in 1.3e-04 minutes (Real: 0.01; User: 2e-03; System: 1e-03) [d_ICA]
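# group: labels of the 149 de-duplicated rows (assumes preprocess() kept first occurrences)
mplot3_xy(iris.ICA$projections.train[, 1], iris.ICA$projections.train[, 2],
          group = iris$Species[!duplicated(iris[, 1:4])],
          main = "ICA on iris", xlab = "1st ICA component", ylab = "2nd ICA component")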
9.0.1.3 Non-negative Matrix Factorization (NMF)
iris.NMF <- d_NMF(x, k = 2)
01-07-24 00:23:44 Hello, egenn [d_NMF]
01-07-24 00:23:45 ||| Input has dimensions 149 rows by 4 columns, [d_NMF]
01-07-24 00:23:45 interpreted as 149 cases with 4 features. [d_NMF]
01-07-24 00:23:45 Running Non-negative Matrix Factorization... [d_NMF]
01-07-24 00:23:45 Completed in 0.01 minutes (Real: 0.84; User: 0.78; System: 0.04) [d_NMF]
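# group: labels of the 149 de-duplicated rows (assumes preprocess() kept first occurrences)
mplot3_xy(iris.NMF$projections.train[, 1], iris.NMF$projections.train[, 2],
          group = iris$Species[!duplicated(iris[, 1:4])],
          main = "NMF on iris", xlab = "1st NMF component", ylab = "2nd NMF component")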
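For intuition about what a rank-2 NMF computes, here is a minimal base R sketch using Lee and Seung's multiplicative updates for the Frobenius objective. This is a swapped-in textbook algorithm and not necessarily what d_NMF() runs internally; the names V, W, H, eps and frob are introduced here for illustration. NMF requires non-negative input, which holds for iris because all measurements are positive lengths and widths.

```r
set.seed(1)                                  # random start, fixed for reproducibility
V <- as.matrix(unique(iris[, 1:4]))          # non-negative data, cases by features
stopifnot(all(V >= 0))

k <- 2                                       # number of components
W <- matrix(runif(nrow(V) * k), nrow(V), k)  # case-by-component weights (projections)
H <- matrix(runif(k * ncol(V)), k, ncol(V))  # component-by-feature basis
eps <- 1e-9                                  # guards against division by zero

# Reconstruction error ||V - WH|| in the Frobenius norm
frob <- function(W, H) sqrt(sum((V - W %*% H)^2))
err_start <- frob(W, H)

# Multiplicative updates keep W and H non-negative at every step
for (i in seq_len(200)) {
  H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
  W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
}
err_end <- frob(W, H)

c(start = err_start, end = err_end)          # error before and after fitting
```

The columns of W play the role that projections.train plays in the rtemis output above: one row per case, one column per component.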
9.0.2 Nonlinear Dimensionality Reduction
9.0.2.1 Isomap
iris.Isomap <- d_Isomap(x, k = 2)
01-07-24 00:23:45 Hello, egenn [d_Isomap]
01-07-24 00:23:46 ||| Input has dimensions 149 rows by 4 columns, [d_Isomap]
01-07-24 00:23:46 interpreted as 149 cases with 4 features. [d_Isomap]
01-07-24 00:23:46 Running Isomap... [d_Isomap]
01-07-24 00:23:46 Completed in 0.01 minutes (Real: 0.49; User: 0.45; System: 0.02) [d_Isomap]
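# group: labels of the 149 de-duplicated rows (assumes preprocess() kept first occurrences)
mplot3_xy(iris.Isomap$projections.train[, 1], iris.Isomap$projections.train[, 2],
          group = iris$Species[!duplicated(iris[, 1:4])],
          main = "Isomap on iris", xlab = "1st Isomap projection", ylab = "2nd Isomap projection")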