12  Strings

Julia includes a Char type for single characters and a String type for strings of characters.

Strings are created with double quotes, and characters with single quotes:

12.1 Character

a = 'c'
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
typeof(a)
Char
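
A Char is backed by a Unicode code point, so it converts to and from integers. The following is a minimal sketch of that relationship; the variable names (`code`, `back`, `next_char`, `offset`) are illustrative only.

```julia
a = 'c'

# A Char converts to its Unicode code point and back
code = Int(a)          # 99, i.e. U+0063
back = Char(code)      # 'c'

# Adding an integer to a Char moves along the code points
next_char = a + 1      # 'd'

# Subtracting two Chars gives the distance between them
offset = 'e' - 'a'     # 4
```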

12.1.1 String

b = "prettystring"
"prettystring"
typeof(b)
String
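
Because Char and String are distinct types, a one-character string is not equal to the corresponding character. A small sketch of the difference, with illustrative variable names:

```julia
c = 'c'      # Char
s = "c"      # String containing one character

same = (c == s)          # false: a Char never equals a String
as_string = string(c)    # "c"
first_char = s[1]        # 'c'
```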

12.2 Char Array

x = ['p', 'a', 's', 'h']
4-element Vector{Char}:
 'p': ASCII/Unicode U+0070 (category Ll: Letter, lowercase)
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)
 'h': ASCII/Unicode U+0068 (category Ll: Letter, lowercase)
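
A vector of characters and a string can be converted into each other. The sketch below uses join() and collect(), which is one way to do it; `word` and `chars` are illustrative names.

```julia
x = ['p', 'a', 's', 'h']

# join() concatenates the characters into a single String
word = join(x)            # "pash"

# collect() goes the other way: String to Vector{Char}
chars = collect(word)
```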

12.3 String Array

x = ["PID", "Age", "Sex", "Handedness"]
4-element Vector{String}:
 "PID"
 "Age"
 "Sex"
 "Handedness"

12.4 Integer to string

Use string() to convert integers to strings

string(10)
"10"

Use the broadcast operation (.) to operate on a vector or range:

string.([1, 3, 4, 7, 11])
5-element Vector{String}:
 "1"
 "3"
 "4"
 "7"
 "11"
string.(1:5)
5-element Vector{String}:
 "1"
 "2"
 "3"
 "4"
 "5"

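The reverse conversion, from string to integer, can be sketched with parse() and tryparse(). The example also shows that string() accepts several arguments of mixed types. Variable names are illustrative.

```julia
# String to integer
n = parse(Int, "10")                        # 10

# Broadcast parse over a vector of strings
ns = parse.(Int, ["1", "3", "4"])           # [1, 3, 4]

# tryparse returns nothing instead of throwing on invalid input
bad = tryparse(Int, "ten")                  # nothing

# string() concatenates the string forms of all its arguments
label = string("n = ", 10, ", x = ", 2.5)   # "n = 10, x = 2.5"
```
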
12.5 String concatenation with *

Use * to concatenate two strings:

"This is" * " important"
"This is important"

Use .* to broadcast concatenation:

"Feature_" .* string.(1:5)
5-element Vector{String}:
 "Feature_1"
 "Feature_2"
 "Feature_3"
 "Feature_4"
 "Feature_5"

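As an alternative to `*`, the same labels can be built with string interpolation or by broadcasting string() directly. This is a sketch of two equivalent approaches; `prefix`, `by_interp` and `by_string` are illustrative names.

```julia
prefix = "Feature_"

# Interpolation inside a comprehension
by_interp = ["$(prefix)$(i)" for i in 1:5]

# Broadcasting string() over the range; the String prefix is treated as a scalar
by_string = string.(prefix, 1:5)
```
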
12.6 String length

x = ["a", "bb", "ccc"]
3-element Vector{String}:
 "a"
 "bb"
 "ccc"

length() on an array returns the array length:

length(x)
3

Use length.() to get the length of each string within the array:

length.(x)
3-element Vector{Int64}:
 1
 2
 3
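
Note that length() counts characters, not bytes. For non-ASCII text the two differ, as this short sketch shows (the example string is an assumption chosen for illustration):

```julia
s = "na\u00efve"            # "naïve": the ï is a single character, two bytes in UTF-8

n_chars = length(s)         # 5 characters
n_bytes = ncodeunits(s)     # 6 code units (bytes)
```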

12.7 Repeat string or character with ^ or repeat()

With a = 'c' as defined in 12.1, both forms return a String:

a^3
"ccc"
repeat(a, 3)
"ccc"

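The same two forms work on strings. A brief sketch, with illustrative variable names:

```julia
# Repeat a String with ^
tripled = "ab"^3               # "ababab"

# Repeat a String with repeat()
ruler = repeat("=-", 4)        # "=-=-=-=-"

# Repeating a Char yields a String, not a Char
dashes = '-'^5                 # "-----"
```
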
12.8 Convert any value to string with repr()

repr(415)
"415"

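For integers, repr() and string() give the same result, but they differ for strings and characters: repr() returns the form you would type as Julia code, including the quotes. A short sketch of the difference:

```julia
# Same result for an integer
int_repr = repr(415)          # "415"
int_str  = string(415)        # "415"

# repr() keeps the quotes of a String; string() does not
str_repr = repr("hi")         # "\"hi\""
str_str  = string("hi")       # "hi"

# Likewise for a Char
chr_repr = repr('c')          # "'c'"
```
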
12.9 Get substring with SubString()

x = ["001Emergency", "010Cardiology", "018Neurology", 
    "020Anesthesia", "021Surgery", "051Psychiatry"]
6-element Vector{String}:
 "001Emergency"
 "010Cardiology"
 "018Neurology"
 "020Anesthesia"
 "021Surgery"
 "051Psychiatry"
SubString.(x, 1, 3)
6-element Vector{SubString{String}}:
 "001"
 "010"
 "018"
 "020"
 "021"
 "051"

If the end index is omitted, it defaults to lastindex() of the string:

SubString.(x, 4)
6-element Vector{SubString{String}}:
 "Emergency"
 "Cardiology"
 "Neurology"
 "Anesthesia"
 "Surgery"
 "Psychiatry"

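A SubString is a view into the original string. The sketch below shows one way to turn the extracted pieces into integers or independent String values; it assumes ASCII-only input, since the indices are byte positions. Variable names are illustrative.

```julia
x = ["001Emergency", "010Cardiology", "018Neurology"]

# Parse the numeric prefix; leading zeros are ignored
codes = parse.(Int, SubString.(x, 1, 3))     # [1, 10, 18]

# Convert the views to standalone Strings
depts = String.(SubString.(x, 4))

# Range indexing also works and returns a String copy directly
by_range = [s[1:3] for s in x]
```
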
12.10 Find

Check whether a string occurs in another string:

occursin("42", "This was a long time, ago, 42 months, to be exact")
true

Broadcast over a vector of strings:

names = ["Jon", "John", "James", "Jim", "Jimmy", "Jimmie"]
occursin.("Jimmy", names)
6-element BitVector:
 0
 0
 0
 0
 1
 0

The same with map, which returns a Vector{Bool} instead of a BitVector:

map(x -> occursin("Jimmy", x), names)
6-element Vector{Bool}:
 0
 0
 0
 0
 1
 0

Broadcast + findall to get the position of the match(es):

findall(occursin.("Jimmy", names))
1-element Vector{Int64}:
 5

findall with a predicate function to get the position of the match(es), without an intermediate Boolean vector:

findall(x -> occursin("Jimmy", x), names)
1-element Vector{Int64}:
 5
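
Related searches can be sketched with filter(), startswith(), endswith() and findfirst(). The vector is named `people` here (a hypothetical name) to avoid shadowing Julia's built-in names() function in a script.

```julia
people = ["Jon", "John", "James", "Jim", "Jimmy", "Jimmie"]

# Keep the elements that contain a substring
jims = filter(n -> occursin("Jim", n), people)      # ["Jim", "Jimmy", "Jimmie"]

# Match only at the start of each string
jos = filter(n -> startswith(n, "Jo"), people)      # ["Jon", "John"]

# Index of the first element ending in "es"
first_es = findfirst(n -> endswith(n, "es"), people) # 3

# Position of a substring within a single string
pos = findfirst("mm", "Jimmy")                       # 3:4
```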

12.11 Replace

Replace a dot with an underscore:

replace("First.Dose", "." => "_")
"First_Dose"

broadcast to apply on array of strings:

colnames = ["First.Dose", "Second.Dose"]
replace.(colnames, "." => "_")
2-element Vector{String}:
 "First_Dose"
 "Second_Dose"

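replace() also accepts a regular expression as the pattern and a `count` keyword that limits the number of replacements. A short sketch, with example strings chosen for illustration:

```julia
# A regex pattern collapses a run of dots into one underscore
collapsed = replace("First..Dose", r"\.+" => "_")      # "First_Dose"

# count limits how many matches are replaced, left to right
first_only = replace("a.b.c", "." => "_", count = 1)   # "a_b.c"

# Char patterns work as well
by_char = replace("First.Dose", '.' => '_')            # "First_Dose"
```
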
12.12 Regular Expressions

re = r"^Feat"
typeof(re)
Regex
m = match(re, "Feature_1")
RegexMatch("Feat")

Get the part of the input that matched the regex:

m.match
"Feat"

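Parentheses in a pattern create capture groups, which are available on the RegexMatch. The following is a sketch using a pattern and input chosen for illustration:

```julia
# Two capture groups: the alphabetic prefix and the numeric suffix
m = match(r"^([[:alpha:]]+)_(\d+)$", "Feature_12")

whole  = m.match              # "Feature_12"
groups = m.captures           # ["Feature", "12"]
prefix = m[1]                 # "Feature"
id     = parse(Int, m[2])     # 12
```
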
Get an iterable of all matches with eachmatch() and optionally collect it to a vector:

eachmatch(
    r"[[:alpha:]]*_1", 
    "There were three important variables: Small_1, Medium_2, and Large_1") |> collect
2-element Vector{RegexMatch}:
 RegexMatch("Small_1")
 RegexMatch("Large_1")

Get the integer indices of the matches using findall():

findall(
    r"[[:alpha:]]*_1", 
    "There were three important variables: Small_1, Medium_2, and Large_1")
2-element Vector{UnitRange{Int64}}:
 39:45
 62:68
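
The ranges returned by findall() can be used to index back into the string. A minimal sketch, with illustrative variable names:

```julia
s = "There were three important variables: Small_1, Medium_2, and Large_1"
ranges = findall(r"[[:alpha:]]*_1", s)

# Index the string with each range to recover the matched text
found = [s[r] for r in ranges]     # ["Small_1", "Large_1"]

# Start position of each match
starts = first.(ranges)            # [39, 62]
```
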
v = ["Alpha_1", "Alpha_2", "Beta_1", "Beta_2"]
4-element Vector{String}:
 "Alpha_1"
 "Alpha_2"
 "Beta_1"
 "Beta_2"

Broadcast to match across a vector; elements without a match return nothing:

match.(r"_1$", v)
4-element Vector{Union{Nothing, RegexMatch}}:
 RegexMatch("_1")
 nothing
 RegexMatch("_1")
 nothing
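
Since match() returns nothing when there is no match, the result usually needs to be checked before use. The sketch below shows two ways to keep only the matching elements; variable names are illustrative.

```julia
v = ["Alpha_1", "Alpha_2", "Beta_1", "Beta_2"]

# Check each match result against nothing
hits = [s for s in v if match(r"_1$", s) !== nothing]

# If only a yes/no answer is needed, occursin() avoids building RegexMatch objects
same = filter(s -> occursin(r"_1$", s), v)
```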

12.13 Hexadecimal to Integer

12.13.1 Integer to hex

string(255, base = 16)
"ff"

Alternatively, convert to an unsigned integer. The value is unchanged, but Julia displays unsigned integers in hexadecimal:

UInt(255)
0x00000000000000ff

12.13.2 Hex to integer

parse(Int, "ff", base = 16)
255

Alternatively, 0xff is itself an unsigned integer literal, so it can be converted directly:

Int(0xff)
255
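
A few variations on these conversions, sketched with the `base` and `pad` keywords of string() and with broadcasting over parse(). Variable names are illustrative.

```julia
# Zero-pad the hex string to a fixed width
padded = string(255, base = 16, pad = 4)        # "00ff"

# Upper-case hex digits
upper = uppercase(string(255, base = 16))       # "FF"

# Other bases work the same way
binary = string(255, base = 2)                  # "11111111"

# parse() accepts upper- or lower-case hex digits
from_upper = parse(Int, "FF", base = 16)        # 255

# Broadcast over a vector of hex strings
values = parse.(Int, ["0a", "ff"], base = 16)   # [10, 255]

# The literal 0xff is a UInt8
literal_type = typeof(0xff)                     # UInt8
```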

12.14 Resources