16 vroom Data I/O
vroom is a high-performance delimited data reader and writer with extended functionality.
16.1 Read delimited file
Important arguments:
- delim: Field delimiter. If NULL, vroom will guess it by looking at the data.
- na: Character vector of strings to interpret as missing values, i.e. NA.
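As a minimal sketch of these two arguments, the example below writes a small semicolon-delimited file to a temporary path and reads it back with an explicit delim and a custom na vector. The file contents, the "-999" missing-value code, and the names tmp and dat_small are made up for illustration.

```r
library(vroom)

# Write a small semicolon-delimited file to a temporary path (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;score", "1;10", "2;-999", "3;NA"), tmp)

# Set the delimiter explicitly and treat "-999" as missing, in addition to
# the empty string and "NA".
dat_small <- vroom(
  tmp,
  delim = ";",
  na = c("", "NA", "-999"),
  show_col_types = FALSE
)
dat_small
```

Setting delim explicitly avoids relying on the guess, which can be useful when the first rows of a file are not representative.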
dat <- vroom("https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv")
Rows: 299 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (13): age, anaemia, creatinine_phosphokinase, diabetes, ejection_fractio...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
vroom returns a tibble.
dat
# A tibble: 299 × 13
age anaemia creatinine_phosphokinase diabetes ejection_fraction
<dbl> <dbl> <dbl> <dbl> <dbl>
1 75 0 582 0 20
2 55 0 7861 0 38
3 65 0 146 0 20
4 50 1 111 0 20
5 65 1 160 1 20
6 90 1 47 0 40
7 75 1 246 0 15
8 60 1 315 1 60
9 65 0 157 0 65
10 80 1 123 0 35
# ℹ 289 more rows
# ℹ 8 more variables: high_blood_pressure <dbl>, platelets <dbl>,
# serum_creatinine <dbl>, serum_sodium <dbl>, sex <dbl>, smoking <dbl>,
# time <dbl>, DEATH_EVENT <dbl>
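The column specification message shown above suggests two ways to quiet it: specify the column types or set show_col_types = FALSE. The following is a hedged sketch of both options on a small temporary file; the file contents and the names tmp, dat_typed, and dat_quiet are hypothetical.

```r
library(vroom)

# Small comma-delimited file written to a temporary path (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10.5", "2,20.25", "3,30"), tmp)

# Option 1: declare the column types up front instead of letting vroom guess.
dat_typed <- vroom(
  tmp,
  delim = ",",
  col_types = cols(id = col_integer(), score = col_double())
)

# Option 2: keep type guessing, but suppress the specification message.
dat_quiet <- vroom(tmp, delim = ",", show_col_types = FALSE)

# Both results are tibbles; only the type of `id` differs.
class(dat_typed)
sapply(dat_typed, typeof)
sapply(dat_quiet, typeof)
```

Declaring types explicitly also guards against a column being guessed differently from one file to the next.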
16.2 Select columns to read
dat <- vroom(
"https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv",
col_select = c("age", "anaemia", "ejection_fraction"))
Rows: 299 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (3): age, anaemia, ejection_fraction
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dat
# A tibble: 299 × 3
age anaemia ejection_fraction
<dbl> <dbl> <dbl>
1 75 0 20
2 55 0 38
3 65 0 20
4 50 1 20
5 65 1 20
6 90 1 40
7 75 1 15
8 60 1 60
9 65 0 65
10 80 1 35
# ℹ 289 more rows
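Besides a character vector of names, col_select accepts the same selection mini-language as dplyr::select(), including helpers such as starts_with(). A short sketch on a made-up temporary file (the column names and the objects tmp and sel are illustrative only):

```r
library(vroom)

# Small comma-delimited file written to a temporary path (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("id,height_cm,weight_kg,notes",
    "1,170,65,a",
    "2,182,80,b"),
  tmp
)

# Select one column by bare name and another with a selection helper.
sel <- vroom(
  tmp,
  delim = ",",
  col_select = c(id, starts_with("height")),
  show_col_types = FALSE
)
names(sel)
```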
16.3 Resources
Read more about vroom in its official documentation at https://vroom.r-lib.org.