13 CategoricalArrays
To represent categorical variables in Julia, we can use the CategoricalArray
type from CategoricalArrays.jl.
using CategoricalArrays
13.1 Create CategoricalArray with categorical()
x = ["a", "c", "d", "b", "a", "a", "d", "c"]
8-element Vector{String}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
xc = categorical(x)
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
The same can be achieved by calling the CategoricalArray constructor directly:
xc = CategoricalArray(x)
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
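The level order can also be set when the array is created. The following is a minimal sketch, assuming a recent CategoricalArrays.jl release in which `levels`, `ordered`, and `compress` are keyword arguments of `categorical()` (older releases used a different signature). The names `xo` and `xs` are illustrative only.

```julia
using CategoricalArrays

x = ["a", "c", "d", "b", "a", "a", "d", "c"]

# Fix the level order at construction time and mark the array as ordered
xo = categorical(x; levels=["d", "c", "b", "a"], ordered=true)
levels(xo)      # levels in the order given above
isordered(xo)   # true

# compress=true asks for the smallest reference type that can hold the levels
xs = categorical(x; compress=true)
eltype(xs.refs)
```

With only four levels, the compressed array should need far fewer bits per element than the default UInt32 references.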
13.2 The underlying UInt32 vector
A CategoricalArray stores each element as an unsigned integer reference
(UInt32 by default) that indexes into a vector of levels.
You can access the underlying integers:
xc.refs
8-element Vector{UInt32}:
0x00000001
0x00000003
0x00000004
0x00000002
0x00000001
0x00000001
0x00000004
0x00000003
Convert them to Int32:
xc.refs .% Int32
8-element Vector{Int32}:
1
3
4
2
1
1
4
3
or using convert():
convert(Array{Int32}, xc.refs)
8-element Vector{Int32}:
1
3
4
2
1
1
4
3
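To make the reference-to-level mapping concrete, the sketch below rebuilds the original strings by indexing the levels with the references. It assumes the array contains no missing values (CategoricalArrays.jl stores those as reference 0, which this simple lookup would not handle). The names `restored` and `codes` are illustrative.

```julia
using CategoricalArrays

x = ["a", "c", "d", "b", "a", "a", "d", "c"]
xc = categorical(x)

# Each reference is an index into the vector of levels, so indexing
# the levels by the references recovers the original values.
restored = levels(xc)[xc.refs]
restored == x

# levelcode() is the exported accessor for the same integer codes,
# returned as signed integers
codes = levelcode.(xc)
```

Using `levelcode.()` avoids reaching into the `refs` field, which is an implementation detail of the type.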
13.3 Get levels of a CategoricalArray with levels()
levels(xc)
4-element Vector{String}:
"a"
"b"
"c"
"d"
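Note that the levels above are sorted, not listed in order of first appearance. The short sketch below contrasts `levels()` with `unique()` on the raw vector, and shows that a subset of a CategoricalArray keeps the full set of levels. The variable `sub` is illustrative.

```julia
using CategoricalArrays

x = ["a", "c", "d", "b", "a", "a", "d", "c"]
xc = categorical(x)

unique(x)     # order of first appearance: "a", "c", "d", "b"
levels(xc)    # sorted by default: "a", "b", "c", "d"

# Subsetting keeps all levels, including ones that no longer occur
sub = xc[1:2]
levels(sub)
```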
13.4 Set new level labels with recode() & recode!()
recode!(xc,
"a" => "alpha",
"b" => "beta",
"c" => "gamma",
"d" => "delta")
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
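The example above uses the in-place `recode!()`. As a sketch of the non-mutating counterpart, `recode()` returns a new array and leaves its input untouched. It also accepts a vector on the left-hand side of a pair to merge several values into one label. The names `yc` and `merged` are illustrative.

```julia
using CategoricalArrays

x = ["a", "c", "d", "b", "a", "a", "d", "c"]
xc = categorical(x)

# recode() returns a new array; xc keeps its original labels
yc = recode(xc,
    "a" => "alpha",
    "b" => "beta",
    "c" => "gamma",
    "d" => "delta")
levels(xc)   # still "a", "b", "c", "d"

# A vector of values on the left merges them into a single label
merged = recode(x, ["a", "b"] => "early", ["c", "d"] => "late")
```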
13.5 Reorder levels with levels() & levels!()
levels() in Julia vs. R
In Julia, levels() only returns the levels of a CategoricalArray, and levels!()
reorders them without changing any values. This is unlike R, where assigning
to levels() recodes / changes level labels.
xc
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
levels(xc)
4-element Vector{String}:
"alpha"
"beta"
"gamma"
"delta"
levels!(xc, ["delta", "gamma", "beta", "alpha"])
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
levels(xc)
4-element Vector{String}:
"delta"
"gamma"
"beta"
"alpha"
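The output above shows that the values are unchanged after `levels!()`. What does change is the integer code behind each value, and with it the sort order. A minimal sketch on a smaller vector (the names `before`, `after`, and `sorted` are illustrative):

```julia
using CategoricalArrays

xc = categorical(["a", "c", "d", "b"])
before = levelcode.(xc)          # codes under the default sorted levels

# Reverse the level order in place
levels!(xc, ["d", "c", "b", "a"])
after = levelcode.(xc)           # same values, new codes

# Sorting follows the level order, not alphabetical order
sorted = sort(xc)
```

Here "a" moves from code 1 to code 4, while the array still reads "a", "c", "d", "b".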