18  Control Flow

Code is often executed non-linearly (i.e. not line-by-line). Control flow (or flow of control) operations define the order in which code segments are executed.

Execution is often conditional (using if - else or switch()).

Segments of code may be repeated a defined number of times (for-loop) or as long as certain conditions are met (while-loop). Any loop can be cut short if needed.

Control flow operations form some of the fundamental building blocks of programs. Each operation is very simple - combine enough of them and you can build up to arbitrary complexity.

Tip

To access the documentation for if, for, while, or repeat, surround the command with backticks, e.g. ?`if`. Alternatively, you can use help("if").

18.1 Conditionals

18.1.1 if - else

Consider a systolic blood pressure measurement:

SBP <- 146 # mmHg
if (SBP <= 120) {
  cat("SBP is normal")
} else {
  cat("SBP is high")
}
SBP is high
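
Note that if expects a single TRUE or FALSE. When a measurement is missing, the comparison returns NA and if stops with an error, so one option is to test for NA first. A minimal sketch, assuming a hypothetical missing reading stored as NA_real_ and an illustrative object named status:

```r
SBP <- NA_real_ # hypothetical missing measurement, mmHg
if (is.na(SBP)) {
  # Guard first: SBP <= 120 would be NA here, which if cannot use
  status <- "SBP is missing"
} else if (SBP <= 120) {
  status <- "SBP is normal"
} else {
  status <- "SBP is high"
}
status
```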

18.1.2 if - else if - else

Consider a single blood sodium result:

Na <- 142 # mEq/L
Na
[1] 142
if (Na > 145) {
  result <- "Hypernatremia"
} else if (Na < 135) {
  result <- "Hyponatremia"
} else {
  result <- "Normal"
}
result
[1] "Normal"

18.1.3 Conditional assignment with if - else

You can directly assign the output of an if statement to an object.

Na <- 142 # mEq/L
result <- if (Na > 145) {
  "Hypernatremia"
} else if (Na < 135) {
  "Hyponatremia"
} else {
  "Normal"
}
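
As a quick check, the sketch below repeats the assignment and prints the result. It also illustrates one caveat using a hypothetical object named flag: when the condition is FALSE and there is no else branch, if returns NULL, so the assigned object is NULL.

```r
Na <- 142 # mEq/L
result <- if (Na > 145) {
  "Hypernatremia"
} else if (Na < 135) {
  "Hyponatremia"
} else {
  "Normal"
}
result # "Normal"

# Without an else branch, a FALSE condition yields NULL
flag <- if (Na > 145) "Hypernatremia"
is.null(flag) # TRUE
```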

18.1.4 Conditional assignment with ifelse()

ifelse() is vectorized and can be a great, compact method of conditional assignment.

Consider a vector of blood bilirubin levels:

conjBil <- sample(runif(100, min = 0, max = 0.5), size = 20)
conjBil
 [1] 0.2036966 0.1424590 0.3737957 0.2863451 0.4854572 0.2377842 0.4321054
 [8] 0.1965843 0.3863984 0.4469768 0.1191254 0.2951860 0.1062061 0.3220131
[15] 0.2723623 0.3367104 0.2804635 0.3797272 0.2835055 0.4595988
conjBil_bin <- ifelse(conjBil > 0.3, "Hyperbilirubinemia", "Normal")
conjBil_bin
 [1] "Normal"             "Normal"             "Hyperbilirubinemia"
 [4] "Normal"             "Hyperbilirubinemia" "Normal"            
 [7] "Hyperbilirubinemia" "Normal"             "Hyperbilirubinemia"
[10] "Hyperbilirubinemia" "Normal"             "Normal"            
[13] "Normal"             "Hyperbilirubinemia" "Normal"            
[16] "Hyperbilirubinemia" "Normal"             "Hyperbilirubinemia"
[19] "Normal"             "Hyperbilirubinemia"
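
Missing values in the test carry over to the output: where the test is NA, ifelse() returns NA rather than either of the two labels. A small sketch with made-up values (bil and bil_bin are illustrative names):

```r
bil <- c(0.12, NA, 0.41) # hypothetical conjugated bilirubin values
bil_bin <- ifelse(bil > 0.3, "Hyperbilirubinemia", "Normal")
bil_bin # "Normal" NA "Hyperbilirubinemia"
```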

The values passed as the “yes” and “no” arguments can also be vectors of the same length as the first argument. Each element of the output is then taken from the matching position of one or the other.

Consider an arbitrary numeric example:

a <- 1:10
y <- ifelse(a > 5, 11:20, 21:30)
y
 [1] 21 22 23 24 25 16 17 18 19 20

So what did this do?

It is equivalent to an if-else statement within a for-loop:

idl <- a > 5
yes <- 11:20
no <- 21:30
out <- vector("numeric", length = length(a))
for (i in seq_along(a)) {
  if (idl[i]) {
    out[i] <- yes[i]
  } else {
    out[i] <- no[i]
  }
}
out
 [1] 21 22 23 24 25 16 17 18 19 20

i.e.

  • Create a logical index by evaluating the test (here, a > 5)
  • For each element i of that index:
    • if element i is TRUE, return yes[i], else return no[i]
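
The same steps can also be written with logical indexing instead of a loop. This is a sketch of the idea, not a description of how ifelse() is implemented internally; test, yes, no, and out2 are illustrative names:

```r
a <- 1:10
test <- a > 5
yes <- 11:20
no <- 21:30

out2 <- no              # start from the "no" values
out2[test] <- yes[test] # overwrite positions where the test is TRUE
out2

# Compare with the ifelse() result
all(out2 == ifelse(a > 5, 11:20, 21:30))
```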

For another example, let's take the integers 1:11, square the odd ones, and cube the even ones. We use the modulo operator %% to test whether each element is odd or even:

x <- 1:11
xsc <- ifelse(x %% 2 == 0, x^3, x^2)
xsc
 [1]    1    8    9   64   25  216   49  512   81 1000  121

18.1.5 Conditional assignment with multiple options using switch()

Instead of using multiple if - else if statements, we can build a more compact call using switch, which is best suited for options that are of type character, rather than numeric.

Department <- sample(letters[seq(5)], size = 1)
Department
[1] "c"
output <- switch(Department, # 1. Some expression
  a = "Outpatient",          # 2. The possible values of the expression, unquoted
  b = "Emergency",           #    followed by the `=` and the conditional output
  c = "Cardiology",
  d = "Neurology",
  e = "Oncology",
  "Unknown Department"        # 3. An optional last argument is the default
                              #    value, if there is no match above
)
output
[1] "Cardiology"
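
Two details are worth a quick sketch. An option with an empty right-hand side falls through to the next non-empty value, which lets several inputs share one output. Also, switch() works on a single value, so one way to apply it to a vector is through a loop function such as vapply(). The function name dept_name and the mapping of codes to departments below are hypothetical:

```r
dept_name <- function(code) {
  switch(code,
    a = ,               # empty: falls through to the next value
    b = "Emergency",
    c = "Cardiology",
    "Unknown Department" # default when nothing matches
  )
}

dept_name("a") # "Emergency"
dept_name("z") # "Unknown Department"

# switch() is not vectorized; apply it element by element
codes <- c("a", "c", "z")
dept_vec <- vapply(codes, dept_name, character(1), USE.NAMES = FALSE)
dept_vec
```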

18.2 Loops

18.2.1 for loops

Tip

Use for loops to repeat execution of a block of code a certain number of times.

The for loop syntax is for (var in vector) expression.

The expression is usually surrounded by curly brackets and can include any number of lines, any amount of code:

for (i in 1:3) {
  cat("This is item", i, "\n")
}
This is item 1 
This is item 2 
This is item 3 

The loop executes length(vector) times.
At iteration i, var = vector[i].
You will often use the value of var inside the loop - but you don’t have to:

for (i in seq(10)) {
  cat(i^2, "\n")
}
1 
4 
9 
16 
25 
36 
49 
64 
81 
100 

letters is a built-in constant that includes all 26 lowercase letters of the Roman alphabet; LETTERS similarly includes all 26 uppercase letters.

for (letter in letters[1:5]) {
  cat(letter, "is a letter!\n")
}
a is a letter!
b is a letter!
c is a letter!
d is a letter!
e is a letter!
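
When both the position and the value are needed, one common pattern is to loop over indices with seq_along() and index into the vector. A small sketch (the vector names wards and empty are made up for illustration):

```r
wards <- c("Cardiology", "Neurology", "Oncology")
for (i in seq_along(wards)) {
  cat("Ward", i, "is", wards[i], "\n")
}

# seq_along() returns an empty sequence for an empty vector, so the loop
# body is skipped; 1:length(empty) would be c(1, 0) and run twice
empty <- character(0)
n_iter <- 0
for (i in seq_along(empty)) n_iter <- n_iter + 1
n_iter # 0
```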

18.2.1.1 Working on data within a for loop

A common scenario involves working on a data object, whether a vector, matrix, list, or data.frame, and performing an operation on each element, one at a time. While many of these operations are often performed using loop functions instead, for loops can also be used.

You can start by initializing an object of the appropriate class and dimensions to hold the output. Then, each iteration of the for loop will assign its output to the corresponding element/s of this object.
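
A minimal sketch of this initialize-then-fill pattern, computing the mean of each column of the built-in mtcars dataset. The name col_means is illustrative, and colMeans() would do the same in a single call:

```r
# Initialize a numeric vector with one slot per column
col_means <- vector("numeric", length = ncol(mtcars))
names(col_means) <- colnames(mtcars)

# Fill one element per iteration
for (i in seq_along(col_means)) {
  col_means[i] <- mean(mtcars[, i])
}
col_means[1:3]

# Check against the built-in equivalent
all.equal(col_means, colMeans(mtcars))
```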

In the following example, we transform the features of the built-in mtcars dataset to z-scores. The built-in command scale() will do this for you quickly and conveniently; the loop below is for demonstration purposes.

First, initialize the output to be the desired class and dimensions:

class(mtcars)
[1] "data.frame"
dim(mtcars)
[1] 32 11
mtcars_z <- data.frame(matrix(0, nrow = 32, ncol = 11))
colnames(mtcars_z) <- colnames(mtcars)

Alternatively, it is simpler to make a copy of mtcars, to be overwritten by the for loop later:

mtcars_z <- mtcars

Standardization involves subtracting the mean and dividing by the standard deviation.

Here is the for loop - we iterate through each column and assign the transformed data:

for (i in seq_len(ncol(mtcars))) {
  mtcars_z[, i] <- (mtcars[, i] - mean(mtcars[, i])) / sd(mtcars[, i])
}

Let’s compare to the output of the scale() command by printing the first 3 rows and columns of each:

mtcars_z2 <- as.data.frame(scale(mtcars))
mtcars_z[1:3, 1:3]
                    mpg        cyl       disp
Mazda RX4     0.1508848 -0.1049878 -0.5706198
Mazda RX4 Wag 0.1508848 -0.1049878 -0.5706198
Datsun 710    0.4495434 -1.2248578 -0.9901821
mtcars_z2[1:3, 1:3]
                    mpg        cyl       disp
Mazda RX4     0.1508848 -0.1049878 -0.5706198
Mazda RX4 Wag 0.1508848 -0.1049878 -0.5706198
Datsun 710    0.4495434 -1.2248578 -0.9901821

Note that we wrapped as.data.frame() around scale() because scale() outputs a matrix.

all.equal(mtcars_z, mtcars_z2)
[1] TRUE
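
The same transformation can also be written with a loop function instead of a for loop. A sketch using lapply(), where zscore and mtcars_z3 are illustrative names:

```r
# Standardize one numeric vector
zscore <- function(x) (x - mean(x)) / sd(x)

# Assigning into mtcars_z3[] keeps the data.frame structure and row names
mtcars_z3 <- mtcars
mtcars_z3[] <- lapply(mtcars, zscore)

mtcars_z3[1:3, 1:3]
all.equal(mtcars_z3, as.data.frame(scale(mtcars)))
```

Assigning into mtcars_z3[] rather than mtcars_z3 is a design choice: lapply() returns a list, and the empty brackets replace the columns in place instead of replacing the data.frame with that list.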

18.2.1.2 Nested for loops

a <- matrix(1:9, nrow = 3)
for (i in seq(3)) {
  for (j in seq(3)) {
    cat("  a[", i, ",", j, "] is ", a[i, j], "\n", sep = "")
  }
}
  a[1,1] is 1
  a[1,2] is 4
  a[1,3] is 7
  a[2,1] is 2
  a[2,2] is 5
  a[2,3] is 8
  a[3,1] is 3
  a[3,2] is 6
  a[3,3] is 9

18.2.1.3 Printing within a for loop

In the R console, objects are printed when you type their name:

a <- 4
a
[1] 4
# same as
print(a)
[1] 4

This “automatic printing” does not happen within a for loop, so you need to call print() (or cat(), if preferred) explicitly.

The following loop does not print out anything:

a <- 0
for (i in 1:4) {
  a <- a + i^2
  a
}

but this does:

a <- 0
for (i in 1:4) {
  a <- a + i^2
  print(a)
}
[1] 1
[1] 5
[1] 14
[1] 30

18.2.2 while loops

a <- 10
while (a > 0) {
  a <- a - 1
  cat("a is equal to", a, "\n")
}
a is equal to 9 
a is equal to 8 
a is equal to 7 
a is equal to 6 
a is equal to 5 
a is equal to 4 
a is equal to 3 
a is equal to 2 
a is equal to 1 
a is equal to 0 
cat("when all is said and done, a is", a)
when all is said and done, a is 0
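
A while loop is most useful when the number of iterations is not known in advance. As a sketch with made-up numbers, the following counts how many halvings it takes for a hypothetical concentration to fall below a threshold (conc, threshold, and n_halvings are illustrative names):

```r
conc <- 100     # hypothetical starting concentration
threshold <- 10 # hypothetical target
n_halvings <- 0
while (conc >= threshold) {
  conc <- conc / 2
  n_halvings <- n_halvings + 1
}
cat("Halved", n_halvings, "times; final value is", conc, "\n")
```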

18.2.3 break stops execution of a loop:

for (i in seq(10)) {
  if (i == 5) break
  cat(i, "squared is", i^2, "\n")
}
1 squared is 1 
2 squared is 4 
3 squared is 9 
4 squared is 16 

18.2.4 next skips the current iteration:

for (i in seq(7)) {
  if (i == 5) next
  cat(i, "squared is", i^2, "\n")
}
1 squared is 1 
2 squared is 4 
3 squared is 9 
4 squared is 16 
6 squared is 36 
7 squared is 49 

18.2.5 repeat loops

A repeat block initiates an infinite loop, and you must use break to exit it. repeat loops are used less often than for and while loops.

i <- 10
repeat {
  i <- i - 1
  if (i == 0) break
  cat("i is", i, "\n")
}
i is 9 
i is 8 
i is 7 
i is 6 
i is 5 
i is 4 
i is 3 
i is 2 
i is 1 

Note

Any number of control flow operations can be combined and nested as needed.
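
As a closing sketch, the following combines a for loop, next, and if - else if - else to label a vector of sodium results, skipping missing values. The values and the object names Na_values and Na_label are hypothetical:

```r
Na_values <- c(142, 131, NA, 149) # mEq/L
Na_label <- rep(NA_character_, length(Na_values))

for (i in seq_along(Na_values)) {
  # Leave the label as NA when the measurement is missing
  if (is.na(Na_values[i])) next
  if (Na_values[i] > 145) {
    Na_label[i] <- "Hypernatremia"
  } else if (Na_values[i] < 135) {
    Na_label[i] <- "Hyponatremia"
  } else {
    Na_label[i] <- "Normal"
  }
}
Na_label # "Normal" "Hyponatremia" NA "Hypernatremia"
```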