Many built-in R functions are vectorized, as are many functions from external packages.
A vectorized function operates on all elements of an object.
Vectorization is very efficient: it can save both human (your) time and machine time.
In many cases, applying a function to all elements simultaneously may seem like the obvious or expected behavior, but not all functions are vectorized, so check the documentation and/or test the function on a simple example.
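One quick way to test a function is to pass it a short vector and look at the length of the result. The sketch below uses the illustrative names `v` and `flags`, which are not part of the original text, and contrasts `sqrt()`, which is vectorized, with `isTRUE()`, which always returns a single value.

```r
# Illustrative inputs (names chosen for this example only)
v <- c(1, 4, 9)
flags <- c(TRUE, TRUE, FALSE)

# sqrt() is vectorized: one output element per input element
sqrt(v)
length(sqrt(v)) == length(v)

# isTRUE() is not vectorized: it returns a single value for the whole input
isTRUE(flags)
length(isTRUE(flags))

# A non-vectorized function can still be applied element by element
sapply(flags, isTRUE)
```

If the output length does not match the input length, the function is probably not operating element-wise, and the documentation is worth a closer look.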
Operations between vectors of equal length are applied between corresponding elements of each vector:
x <- 1:10
z <- 11:20
x + z
[1] 12 14 16 18 20 22 24 26 28 30
i.e. the above is equal to c(x[1] + z[1], x[2] + z[2], ..., x[n] + z[n]).
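To make the element-wise equivalence concrete, the following sketch writes the same addition as an explicit loop and compares it with the vectorized form. The names `a`, `b`, `big_a` and `big_b` are assumptions made for this example, and the timings at the end will vary from machine to machine.

```r
# Illustrative vectors (names chosen for this example only)
a <- 1:10
b <- 11:20

# Element-by-element addition with an explicit loop
loop_sum <- integer(length(a))
for (i in seq_along(a)) {
  loop_sum[i] <- a[i] + b[i]
}

# The vectorized form gives the same result in a single expression
vec_sum <- a + b
identical(loop_sum, vec_sum)

# Rough timing comparison on longer vectors; results depend on the machine
big_a <- runif(1e6)
big_b <- runif(1e6)
system.time({
  out <- numeric(length(big_a))
  for (i in seq_along(big_a)) out[i] <- big_a[i] + big_b[i]
})
system.time(big_a + big_b)
```

The vectorized expression is shorter to write and is typically faster as well, which is the sense in which vectorization saves both human and machine time.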
Weight <- rnorm(20, mean = 80, sd = 1.7)
Weight
[1] 79.27580 77.07372 80.00327 79.47291 81.42009 79.92871 77.84448 78.29229
[9] 78.71447 78.20023 79.53383 79.57139 80.67948 82.17516 78.79052 77.29051
[17] 81.08844 81.29285 81.42259 77.60911
Height <- rnorm(20, mean = 1.7, sd = 0.1)
Height
[1] 1.628971 1.658772 1.674722 1.693329 1.756331 1.699521 1.755047 1.509184
[9] 1.614873 1.624307 1.775038 1.745995 1.743432 1.761433 1.729579 1.750039
[17] 1.636869 1.642027 1.630898 1.716704
BMI <- Weight/Height^2
BMI
[1] 29.87542 28.01127 28.52479 27.71636 26.39482 27.67259 25.27263 34.37438
[9] 30.18408 29.63957 25.24275 26.10182 26.54319 26.48552 26.33862 25.23658
[17] 30.26435 30.15032 30.61196 26.33429
In operations between a vector and a scalar, the scalar is repeated to match the length of the vector, i.e. it is recycled:
x + 10
[1] 11 12 13 14 15 16 17 18 19 20
x * 2
[1] 2 4 6 8 10 12 14 16 18 20
x / 10
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x ^ 2
[1] 1 4 9 16 25 36 49 64 81 100
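R has no separate scalar type: a single number is simply a vector of length one. The short sketch below, using the illustrative names `v` and `s`, checks this and shows that adding the scalar gives the same result as adding an explicitly repeated copy of it.

```r
# Illustrative objects (names chosen for this example only)
v <- 1:10
s <- 10

# A "scalar" is a vector of length one
length(s)
is.vector(s)

# Recycling the scalar is equivalent to repeating it to the vector's length
identical(v + s, v + rep(s, length(v)))
```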
Operations between a vector and a scalar are a special case of operations between vectors of unequal length. Whenever you perform an operation between two objects of different lengths, the shorter object’s elements are recycled:
x + c(2:1)
[1] 3 3 5 5 7 7 9 9 11 11
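Recycling can be made explicit with `rep_len()`, which repeats a vector until it reaches a given length. In this sketch the names `v`, `short` and `recycled` are illustrative; the point is only that the implicit and explicit forms agree.

```r
# Illustrative objects (names chosen for this example only)
v <- 1:10
short <- c(2L, 1L)

# What the shorter vector looks like after recycling to length 10
recycled <- rep_len(short, length(v))
recycled

# Implicit recycling gives the same result as the explicit version
identical(v + short, v + recycled)
```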
Operations between objects of unequal length can occur by mistake. If the longer object’s length is a multiple of the shorter object’s length, there is no error or warning, as above. Otherwise, R issues a warning (which may be confusing at first), but recycling still happens and is highly unlikely to be intentional.
x + c(1, 3, 9)
Warning in x + c(1, 3, 9): longer object length is not a multiple of shorter
object length
[1] 2 5 12 5 8 15 8 11 18 11
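Because a silent or warning-only recycle is easy to miss, one option is to check lengths before operating. The sketch below first confirms that the warned-about result is ordinary recycling, then defines a hypothetical helper, `add_strict()`, which is not part of R or of the original text, that stops unless the lengths match or one input has length one.

```r
# Illustrative vector (name chosen for this example only)
v <- 1:10

# The result above is plain recycling of c(1, 3, 9), cut off at length 10
identical(
  suppressWarnings(v + c(1, 3, 9)),
  v + rep_len(c(1, 3, 9), length(v))
)

# Hypothetical helper: add only when lengths match or one input is a scalar
add_strict <- function(a, b) {
  stopifnot(
    length(a) == length(b) || length(a) == 1L || length(b) == 1L
  )
  a + b
}

add_strict(v, 10)

# Mismatched lengths now raise an error instead of a warning
result <- tryCatch(
  add_strict(v, c(1, 3, 9)),
  error = function(e) "lengths do not match"
)
result
```

Whether to allow scalars in such a check is a design choice; the stricter alternative is to require equal lengths and repeat scalars explicitly.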
Operations between matrices are similarly vectorized, i.e. performed between corresponding elements.
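As a minimal illustration with two small matrices of the same dimensions (the names `A` and `B` are assumptions for this example), note that `*` multiplies element by element and is not matrix multiplication, which uses `%*%`.

```r
# Two illustrative 2 x 3 matrices, filled column by column
A <- matrix(1:6, nrow = 2)
B <- matrix(7:12, nrow = 2)

# Element-wise addition: each entry of A is added to the matching entry of B
A + B

# Element-wise multiplication, not the matrix product
A * B
```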
Some examples of common mathematical operations that are vectorized:
log(x)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
[8] 2.0794415 2.1972246 2.3025851
sqrt(x)
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
[9] 3.000000 3.162278
sin(x)
[1] 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243 -0.2794155
[7] 0.6569866 0.9893582 0.4121185 -0.5440211
cos(x)
[1] 0.5403023 -0.4161468 -0.9899925 -0.6536436 0.2836622 0.9601703
[7] 0.7539023 -0.1455000 -0.9111303 -0.8390715