Elixir : Basics of control flow structures
Control flow structures in programming are constructs that control, alter or branch the sequential flow of execution based on conditions or loops. Elixir, being a functional programming language, uses features such as pattern matching, guard clauses, function clauses and recursion to control the flow of execution. In addition to these, Elixir also provides imperative-style control flow structures such as if/else, case, cond and the for comprehension. Anything achieved using the imperative-style structures can also be achieved using their functional equivalents, and the choice is largely a matter of preference. This article explains the different control flow structures available in Elixir along with their syntax and usage.
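To give a feel for the functional style mentioned above, here is a minimal sketch that branches using only function clauses and guard clauses. The module name Parity is hypothetical and chosen just for this illustration.

```elixir
defmodule Parity do
  # Function clauses are tried from top to bottom.
  # The guard on each clause decides whether that clause is selected.
  def even_or_odd(n) when is_integer(n) and rem(n, 2) == 0, do: :even
  def even_or_odd(n) when is_integer(n), do: :odd

  # Catch-all clause for anything that is not an integer.
  def even_or_odd(_other), do: :error
end

Parity.even_or_odd(2)      # => :even
Parity.even_or_odd(5)      # => :odd
Parity.even_or_odd("test") # => :error
```

No if or case is needed here; the branching is expressed entirely through which clause matches.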
If/else construct
The if/else construct in Elixir contains two branches of code flow, one of which is chosen based on the result of a conditional expression. The conditional expression can combine multiple expressions using the boolean operators &&/|| or their stricter equivalents and/or, which require a boolean as the left operand. The first branch is executed if the conditional expression evaluates to a truthy value (anything except nil and false) and the second branch is executed if it evaluates to a falsy value. The syntax is if conditional_expression do, followed by the first branch of expressions, then else, followed by the second branch of expressions, concluded by the end keyword. Like every other expression in Elixir, the if/else construct returns the value of the last expression of the executed branch.
defmodule Test do
def even_or_odd(n) do
if rem(n, 2) == 0 do
IO.puts("Even number")
:even
else
IO.puts("Odd number")
:odd
end
end
end
Test.even_or_odd(5)
Odd number
:odd
Test.even_or_odd(2)
Even number
:even
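The difference between the boolean operators mentioned earlier can be sketched as follows. The module BoolDemo and its function names are hypothetical; the point is only that &&/|| accept any term, while and/or insist on a boolean as the left operand.

```elixir
defmodule BoolDemo do
  # `&&` accepts any term: only nil and false count as falsy.
  def lenient(a, b) do
    if a && b, do: :both_truthy, else: :not_both
  end

  # `and` is strict: a non-boolean left operand raises BadBooleanError.
  def strict(a, b) do
    if a and b, do: :both_true, else: :not_both
  end
end

BoolDemo.lenient(1, "x")     # => :both_truthy
BoolDemo.lenient(nil, 1)     # => :not_both
BoolDemo.strict(true, false) # => :not_both
# BoolDemo.strict(nil, true) raises BadBooleanError
```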
If the if and else branches have only one expression each, a single-line short syntax can be used, which drops the end keyword from the construct.
defmodule Test do
def even_or_odd(n) do
if rem(n, 2) == 0, do: :even, else: :odd
end
end
Test.even_or_odd(5)
:odd
Test.even_or_odd(2)
:even
If only the if branch is used, without the else branch, nil is implicitly returned when the conditional expression evaluates to a falsy value.
defmodule Test do
def truthy(val) do
if val, do: :truthy # equivalent to - if val, do: :truthy, else: nil
end
end
Test.truthy(5)
:truthy
Test.truthy(nil)
nil
Test.truthy(false)
nil
Variables defined in the outer scope can be accessed from within the if and else blocks. But rebinding a variable from the outer scope only takes effect within the respective branch’s block. The original value bound to the variable in the outer scope is not altered.
defmodule Test do
def scope_test(n) do
if rem(n, 2) == 0 do
n = :even
IO.puts("If block - #{n}")
else
n = :odd
IO.puts("Else block - #{n}")
end
IO.puts("Function block - #{n}")
end
end
Test.scope_test(2)
If block - even
Function block - 2
:ok
Test.scope_test(5)
Else block - odd
Function block - 5
:ok
As you can see, once the inner if and else blocks go out of scope, printing the variable n reveals that the original value bound to it before entering the if/else construct is retained and is not affected.
If you have to rebind a variable defined in the outer scope using the if/else construct, then the variable must be bound with the value returned from the if/else construct as follows.
defmodule Test do
def scope_test(n) do
n = if rem(n, 2) == 0, do: :even, else: :odd
IO.puts("Function block - #{n}")
end
end
Test.scope_test(2)
Function block - even
:ok
Test.scope_test(5)
Function block - odd
:ok
Nesting of if/else structures is possible in elixir just like the other programming languages.
defmodule Test do
def even_or_odd(n) do
if is_integer(n) do
if rem(n, 2) == 0, do: :even, else: :odd
else
:error
end
end
end
Test.even_or_odd(5)
:odd
Test.even_or_odd(2)
:even
Test.even_or_odd("test")
:error
Unless/else construct
Unless is a less commonly used alternative to the if construct in which the conditional expression’s value is negated before choosing the branch to execute. In other words, an if construct executes the first branch if the conditional expression evaluates to a truthy value and the second branch if it evaluates to a falsy value, whereas the unless construct executes the first branch if the conditional expression evaluates to a falsy value and the second branch if it evaluates to a truthy value. Other than the negation, the unless construct behaves the same way as the if/else construct.
defmodule Test do
def non_empty_list?(list) do
unless list == [], do: true, else: false
end
end
Test.non_empty_list?([])
false
Test.non_empty_list?([1, 2, 3])
true
The above code can be written using if construct as follows.
defmodule Test do
def non_empty_list?(list) do
if list != [], do: true, else: false
end
end
Test.non_empty_list?([])
false
Test.non_empty_list?([1, 2, 3])
true
Cond construct
The cond construct is similar to the if/else if construct from other programming languages. It can have multiple branches of code flow based on multiple conditional expressions. The conditional expressions are evaluated from top to bottom and the first one to return a truthy value has its branch executed. The syntax is cond do, followed by multiple expression -> body clauses, one per line, terminated by the end keyword. A branch can have multiple expressions written across multiple lines.
defmodule Test do
def even_or_odd?(n) do
cond do
!is_integer(n) -> :error
rem(n, 2) == 0 -> :even
true -> :odd
end
end
end
Test.even_or_odd?(5)
:odd
Test.even_or_odd?("test")
:error
Test.even_or_odd?(2)
:even
Just like the if/else construct, the cond construct returns the value of the last expression of the executed branch and, in order to rebind a variable from the outer scope, the variable must be explicitly bound to the value returned from the cond construct.
defmodule Test do
def even_or_odd?(n) do
result = cond do
!is_integer(n) -> :error
rem(n, 2) == 0 -> :even
true -> :odd
end
IO.puts(result)
end
end
Test.even_or_odd?(2)
even
:ok
If none of the provided conditional expressions evaluates to a truthy value, a CondClauseError is raised. To avoid this, a catch-all branch can be created by using a truthy value, conventionally true, as the last conditional expression, so that its branch runs whenever none of the branches above it are executed. The following code omits the catch-all branch to show the error.
defmodule Test do
def even_or_odd?(n) do
cond do
!is_integer(n) -> :error
rem(n, 2) == 0 -> :even
end
end
end
Test.even_or_odd?(5)
** (CondClauseError) no cond clause evaluated to a truthy value
Case construct
The case construct pattern matches on an Elixir term and executes the branch of code related to the first successfully matched pattern. It behaves the same way as pattern matching in function clauses. The patterns allow guard clauses and are matched one by one from top to bottom. The syntax is case expression do, followed by multiple clauses in the format pattern -> body, one per line, terminated by the end keyword.
An error raised inside a guard clause does not propagate; it just makes the respective pattern a failed match, moving execution on to the next available pattern. If none of the patterns matches, a CaseClauseError is raised, which can be avoided by using a match-all pattern such as _ as the last pattern in the construct.
Just like the constructs above, the case construct can have multiple expressions in a branch, it returns the value of the last expression of the executed branch, and it requires you to bind its returned value to an outer scope variable in order to rebind that variable.
defmodule Test do
def decrement_val(map, key) do
case map[key] do
nil ->
IO.puts("Key not found")
map
x when x > 1 -> Map.put(map, key, x - 1)
_ -> Map.delete(map, key)
end
end
end
Test.decrement_val(%{1 => 1}, 2)
Key not found
%{1 => 1}
Test.decrement_val(%{1 => 1}, 1)
%{}
Test.decrement_val(%{1 => 2}, 1)
%{1 => 1}
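The two failure modes described above, a guard that errors and a case with no matching pattern, can be illustrated with a small sketch. The module CaseDemo and its functions are hypothetical names used only for this example.

```elixir
defmodule CaseDemo do
  def first_is_one(value) do
    case value do
      # hd/1 raises for anything that is not a non-empty list. Inside a
      # guard, that error simply makes this clause fail to match.
      x when hd(x) == 1 -> :starts_with_one
      _ -> :no_match
    end
  end

  def strict(value) do
    case value do
      # No match-all pattern: any other value raises CaseClauseError.
      :a -> :matched
    end
  end
end

CaseDemo.first_is_one([1, 2])  # => :starts_with_one
CaseDemo.first_is_one("text")  # => :no_match
# CaseDemo.strict(:b) raises CaseClauseError
```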
With construct
The with construct is a neat and concise alternative for nested case statements where you require chaining multiple expressions, with each of the expression’s execution depending on the previous expression’s result. Let us consider the following code.
defmodule Test do
  def cuboid_volume(cuboid) do
    case parse_num(cuboid[:length]) do
      {:ok, length} ->
        case parse_num(cuboid[:breadth]) do
          {:ok, breadth} ->
            case parse_num(cuboid[:height]) do
              {:ok, height} -> length * breadth * height
              error -> error
            end

          error -> error
        end

      error -> error
    end
  end

  defp parse_num(num) when is_integer(num) or is_float(num), do: {:ok, num}
  defp parse_num(_), do: {:error, "Invalid number"}
end
Test.cuboid_volume(%{})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10, breadth: 5, height: nil})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10, breadth: 5, height: 5})
250
In the above code, we are chaining three case constructs that extract the length, breadth and height values from the cuboid. Each nested construct’s execution depends on the previous case construct’s result, and the final expression length * breadth * height is executed only when all three measurements are valid numbers. Now let us see a much cleaner implementation of the above code using the with construct.
defmodule Test do
def cuboid_volume(cuboid) do
with {:ok, length} <- parse_num(cuboid[:length]),
{:ok, breadth} <- parse_num(cuboid[:breadth]),
{:ok, height} <- parse_num(cuboid[:height]) do
length * breadth * height
end
end
defp parse_num(num) when is_integer(num) or is_float(num), do: {:ok, num}
defp parse_num(_), do: {:error, "Invalid number"}
end
Test.cuboid_volume(%{})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10, breadth: 5, height: nil})
{:error, "Invalid number"}
Test.cuboid_volume(%{length: 10, breadth: 5, height: 5})
250
As you can see above, the with construct chains all of the success cases required in order to execute the final expression. Each pattern <- expression pair is separated by a comma, and the construct ends with a do block that gets executed only if every expression matches the pattern to its left. If any expression fails to match its pattern, the value of that expression is returned and the rest of the with construct is not executed. Since pattern matching is used in the with construct, all of the related rules, such as the use of guard clauses and the handling of errors inside guard clauses, apply here as well. All the variables created in the with construct are only accessible within it and go out of scope once execution gets past the with construct. The with construct can also be written in a single line as follows.
with x <- expression1, y <- expression2, do: x * y
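Since guard clauses are allowed in the patterns of a with construct, a short sketch may help. The module WithGuardDemo and the function ratio are hypothetical; note how the value that fails to match is returned as-is when there is no else block.

```elixir
defmodule WithGuardDemo do
  def ratio(a, b) do
    # Each pattern carries a guard. The first value that fails its
    # pattern or guard becomes the result of the whole with construct.
    with x when is_number(x) <- a,
         y when is_number(y) and y != 0 <- b do
      x / y
    end
  end
end

WithGuardDemo.ratio(10, 4)   # => 2.5
WithGuardDemo.ratio(10, 0)   # => 0 (second guard fails, b is returned)
WithGuardDemo.ratio("a", 2)  # => "a" (first guard fails, a is returned)
```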
The with construct also provides an else block, which is executed if any of the expressions fails to match its pattern. The else block can be used to process the result of the failed expression and perform error handling. It behaves like a case construct: the value returned by the failed expression is matched against the patterns one by one and the matching pattern’s body is executed. If none of the patterns in the else block matches that value, a WithClauseError is raised, which can be avoided by using a match-all pattern as the last pattern.
defmodule Test do
def cuboid_volume(cuboid) do
with {:ok, length} <- parse_measurement(cuboid, :length),
{:ok, breadth} <- parse_measurement(cuboid, :breadth),
{:ok, height} <- parse_measurement(cuboid, :height) do
length * breadth * height
else
{:error, msg} -> IO.puts("Error - #{msg}")
_ -> IO.puts("Unknown error")
end
end
defp parse_measurement(cuboid, key) do
case parse_num(cuboid[key]) do
{:error, msg} -> {:error, msg <> ": #{key}"}
val -> val
end
end
defp parse_num(num) when not(is_integer(num)) and not(is_float(num)) do
{:error, "Input not a number"}
end
defp parse_num(num) when num <= 0, do: {:error, "Input not a positive number"}
defp parse_num(num), do: {:ok, num}
end
--------------------------------------------------------------------------------
Test.cuboid_volume(%{})
Error - Input not a number: length
:ok
Test.cuboid_volume(%{length: 10, breadth: -5})
Error - Input not a positive number: breadth
:ok
Test.cuboid_volume(%{length: 10, breadth: 5, height: 5})
250
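To see what happens when the else block has no match-all pattern, consider this minimal sketch. WithElseDemo and fetch_positive are hypothetical names; the else block deliberately handles only :error.

```elixir
defmodule WithElseDemo do
  def fetch_positive(map, key) do
    with {:ok, val} when val > 0 <- Map.fetch(map, key) do
      {:ok, val}
    else
      # Only a missing key is handled. Any other failed value,
      # such as {:ok, -1}, has no matching pattern here.
      :error -> {:error, :missing}
    end
  end
end

WithElseDemo.fetch_positive(%{a: 1}, :a)  # => {:ok, 1}
WithElseDemo.fetch_positive(%{}, :a)      # => {:error, :missing}
# WithElseDemo.fetch_positive(%{a: -1}, :a) raises WithClauseError
```

Adding a final `_ -> ...` pattern to the else block would turn the last call into a handled case instead of an exception.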
The with construct also allows other intermediate expressions between the pattern <- expression pairs. These can be used for intermediate processing of the variables bound in previous expressions, for creating intermediate variables, for printing details and so on.
map = %{1 => 1, 2 => 2}
with {:ok, x} <- Map.fetch(map, 1),
x_double = 2 * x,
{:ok, y} <- Map.fetch(map, 2),
y_double = 2 * y,
IO.inspect("x, y : #{x}, #{y}") do
x_double * y_double * (x + y)
end
"x, y : 1, 2"
24
The with construct can also just contain only the normal expressions that bind values to different variables without any patterns, which can be used to write a concise single line version of code that would usually take multiple lines in the normal syntax.
test = fn a, b ->
sum = a + b
diff = abs(a - b)
div(sum * diff, sum + diff)
end
test.(5, 10)
3
___________________________________________________________________________
test = fn a,b ->
with sum = a + b, diff = abs(a - b), do: div(sum * diff, sum + diff)
end
test.(5, 10)
3
One more difference between the above two versions is that, in the with syntax, the variables created within it, such as sum and diff, are temporary and go out of scope as soon as the with construct ends. In the normal syntax, these variables remain accessible anywhere within the function after they are created.
For comprehension
For comprehensions are powerful looping constructs in Elixir that operate on bitstrings and on terms that implement the Enumerable protocol, such as lists, maps and keyword lists. They provide a neat and concise syntax for tasks that involve complex enumeration over multiple terms and expressions. A for comprehension consists of generators, filters, normal expressions and options, separated by commas. Based on different combinations of these four kinds of expressions, the for construct can be made to behave in different ways, mimicking and combining the functionality of built-in Enum functions like Enum.map/2, Enum.filter/2, Enum.reduce/3 and more. This is what makes the for comprehension a powerful and versatile feature. These expressions are followed by the do block, where the final evaluation is done on the enumerated data. Similar to the other constructs above, variables bound inside the for comprehension do not affect variables in the outer scope and go out of scope once the comprehension’s do block ends.
Generators
Every for comprehension must start with a generator. A generator has the syntax pattern <- enumerable. It generates data that matches the pattern on the left side, one by one, from the enumerable on the right side. Any data from the enumerable that does not match the pattern on the left is skipped. These patterns also support guard clauses, just like any other construct that uses pattern matching.
for x <- [1, 2, 3], do: x # enumerable generator syntax; mimics Enum.map
[1, 2, 3]
for x when rem(x, 2) == 0 <- [1, 2, 3, 4], do: x
[2, 4]
for {_key, val} = x when val > 1 <- %{a: 1, b: 2, c: 3}, do: x
[b: 2, c: 3]
for <<x::8 <- "hello">>, do: <<x>> # bitstring generator syntax
["h", "e", "l", "l", "o"]
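The skipping of non-matching data is worth a separate illustration, since the examples above rely on guards. In this sketch, the hypothetical function ok_values keeps only the elements that match the {:ok, val} pattern.

```elixir
defmodule GeneratorDemo do
  def ok_values(results) do
    # Elements that do not match {:ok, val} are silently skipped;
    # no error is raised for them.
    for {:ok, val} <- results, do: val
  end
end

GeneratorDemo.ok_values([{:ok, 1}, {:error, :oops}, {:ok, 3}])
# => [1, 3]
```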
Multiple generators can be combined together to mimic a traditional nested loop construct.
for x <- [1, 2, 3],
{_, y} <- %{1 => "a", 2 => "b", 3 => "c"},
<<z::8 <- <<4, 5, 6>> >>, do: "#{x}#{y}#{z}"
["1a4", "1a5", "1a6", "1b4", "1b5", "1b6", "1c4", "1c5", "1c6", "2a4", "2a5",
"2a6", "2b4", "2b5", "2b6", "2c4", "2c5", "2c6", "3a4", "3a5", "3a6", "3b4",
"3b5", "3b6", "3c4", "3c5", "3c6"]
Filters
Filters are expressions, used after a generator, whose result decides whether a generated value is kept. They are similar to guard clauses used in the patterns, but they are not restricted to the expressions and operators allowed in guard clauses; filters can use any function or expression. If the filter returns a truthy value for a generated value, the value flows to the next expression in the for construct. If the filter returns a falsy value, that value is skipped and does not flow into the successive expressions. Multiple filters can be combined or used individually, and they act on all data flowing in from the previous expression.
for x <- [1, 2, 3, 4], :math.pow(2, x) > 4,
y <- [1, 2, 3, 4, 5, 6], :math.pow(y, 2) > 4, rem(y, 2) == 0,
do: [x, y]
[[3, 4], [3, 6], [4, 4], [4, 6]]
After the first generator, there is a filter that checks if 2 raised to the power of x is greater than 4 and only allows the data that satisfies this condition to pass through. After the second generator, there are two filters separated by a comma, and data from the second generator must satisfy both, similar to using the && operator to combine the two filters into one. Thus, the values 3 and 4 pass from the first generator and the values 4 and 6 pass from the second generator, and each combination is collected as a list in the do block.
The filters can even use data from multiple generators if required.
for x <- [1, 2, 3, 4],
y <- [1, 2, 3, 4, 5, 6], abs(x - y) > 3,
do: [x, y]
[[1, 5], [1, 6], [2, 6]]
In the code above, the filter works on data from both generators and only allows the data combination from the first and second generator to pass into the do block if they satisfy the given filter.
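To relate generators and filters back to the Enum functions they mimic, the following sketch writes the same computation both ways. The module SquaresDemo and its function names are hypothetical.

```elixir
defmodule SquaresDemo do
  # Generator plus filter plus do block.
  def with_for(list) do
    for x <- list, rem(x, 2) == 0, do: x * x
  end

  # The same result using Enum.filter/2 followed by Enum.map/2.
  def with_enum(list) do
    list
    |> Enum.filter(fn x -> rem(x, 2) == 0 end)
    |> Enum.map(fn x -> x * x end)
  end
end

SquaresDemo.with_for([1, 2, 3, 4])   # => [4, 16]
SquaresDemo.with_enum([1, 2, 3, 4])  # => [4, 16]
```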
Normal expressions
Similar to the with construct, normal expressions can be used in for constructs to perform intermediate calculations on the available data in the pipeline. They can also be used to create new variables that can be used further down the pipeline.
for x <- [1, 2, 3, 4], squared = x**2, squared > 4, do: [x, squared]
[[3, 9], [4, 16]]
In the code above, the normal expression evaluates the square of each value from the generator and binds it to a variable, squared, which is accessed further down the pipeline.
In elixir, every expression returns a value and so when you are binding a value to a variable using the match operator, the bound value will be returned as the result of the pattern matching expression. Similarly, when using normal expressions in the for construct that bind a falsy value to a variable, then the falsy value will be returned as the result of the pattern matching expression and this in turn would behave like a failing filter that would lead to skipping the current data in the pipeline. This is because, internally, all expressions in the for construct other than the generators behave like a filter. Whether it is an explicit filter or a normal expression, if the result of a non-generator expression is falsy, then the current data in the pipeline will be rejected or skipped. In order to avoid this, the normal expression with a possibility of binding a falsy value to a variable can be wrapped into a generator with the value wrapped into a list.
for x <- [1, 2, 3, 4], y = false, do: [x, y]
[]
---------------------------------------------------------------------------
for x <- [1, 2, 3, 4], y <- [false], do: [x, y]
[[1, false], [2, false], [3, false], [4, false]]
In the code above, the first for comprehension uses a normal expression that binds false to the variable y. This returns false as the result of the match expression and behaves like a failing filter that skips every value from the generator. Hence, in the second for comprehension, the expression is wrapped as a generator to successfully include falsy values in the pipeline.
Options
The for comprehension supports three options: :into, :uniq and :reduce. The :uniq and :into options can be used together if required, while the :reduce option can only be used alone.
:uniq
The :uniq option, when set to true, makes sure that the final accumulated result of the for construct does not have any duplicates in it. A generated term is skipped if it is already present in the accumulated result.
for x <- [1, 2, 3], y <- [1, 2], do: x + y
[2, 3, 3, 4, 4, 5]
for x <- [1, 2, 3], y <- [1, 2], uniq: true, do: x + y
[2, 3, 4, 5]
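The :uniq option can also be combined with the :into option, which is described in the next section. A minimal sketch, assuming the input contains only lowercase ASCII letters (the function upcase_unique is a hypothetical name):

```elixir
defmodule UniqDemo do
  def upcase_unique(string) do
    # Bitstring generator over the bytes of the string. Subtracting 32
    # turns a lowercase ASCII letter into its uppercase counterpart.
    # Duplicate results are dropped and the rest are collected
    # into a binary.
    for <<char <- string>>, uniq: true, into: "", do: <<char - 32>>
  end
end

UniqDemo.upcase_unique("abcabc")
# => "ABC"
```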
:into
The :into option can be used to accumulate the result into a data structure other than the default, a list. This option takes any term that implements the Collectable protocol. Thus, the result can be accumulated into types such as lists, maps, MapSets and bitstrings. The value returned from the do block must be compatible with the type used in the :into option. For a map, the returned value must be in the format {key, value}, and for a bitstring, the returned value must itself be a bitstring, such as <<val>>.
for x <- [1, 2, 3, 4], into: [], do: x
[1, 2, 3, 4]
for x <- [1, 2, 3, 4], into: %{}, do: {x, x * x}
%{1 => 1, 2 => 4, 3 => 9, 4 => 16}
for x <- [1, 2, 3, 4, 1, 1, 2], into: MapSet.new, do: x
MapSet.new([1, 2, 3, 4])
for x <- [1, 2, 3, 4], into: <<>>, do: <<x>>
<<1, 2, 3, 4>>
In addition to empty structures, existing structures with values can also be provided to the :into option. In this case, the existing structure is updated with the new values. Please note that using a non-empty list as the :into value has been deprecated in recent versions. For a non-empty map, it behaves like a merge operation, updating the values of existing keys and combining the keys and values from both old and new data.
for x <- [1, 2, 3, 4], into: [5, 6], do: x
warning: the Collectable protocol is deprecated for non-empty lists.......
[5, 6, 1, 2, 3, 4]
for x <- [1, 2, 3, 4], into: %{1 => nil, 2 => nil, 5 => 5}, do: {x, x * x}
%{1 => 1, 2 => 4, 3 => 9, 4 => 16, 5 => 5}
for x <- [1, 2, 3, 4, 1, 1, 2], into: MapSet.new([1, 2, 5, 6]), do: x
MapSet.new([1, 2, 3, 4, 5, 6])
for x <- [1, 2, 3, 4], into: <<5, 6>>, do: <<x>>
<<5, 6, 1, 2, 3, 4>>
:reduce
The :reduce option behaves like the Enum.reduce/3 function, where an accumulator is passed around, updated in each iteration and returned as the final result of the expression. The :reduce option can take any valid Elixir term as the initial value. With it, the do block has a different syntax: the accumulator is matched against patterns one by one, just like in a case construct, and the matching pattern’s body is executed to produce the updated accumulator for the next iteration. Again, similar to a case construct, if none of the patterns matches the accumulator, a CaseClauseError is raised.
for x <- [1, 2, 3, 4], reduce: 0 do # generates sum
acc -> acc + x
end
10
---------------------------------------------------------------------------
for x <- [1, 2, 3, 4, 1, 2], reduce: %{} do # creates a frequency map
%{^x => val} = acc -> Map.put(acc, x, val + 1)
acc -> Map.put(acc, x, 1)
end
%{1 => 2, 2 => 2, 3 => 1, 4 => 1}
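For comparison with Enum.reduce/3, the frequency map can be sketched both ways. The module FrequencyDemo is hypothetical, and Map.update/4 is used here in place of the two-pattern approach above simply to keep both versions short.

```elixir
defmodule FrequencyDemo do
  # for comprehension with the :reduce option
  def with_for(list) do
    for x <- list, reduce: %{} do
      acc -> Map.update(acc, x, 1, &(&1 + 1))
    end
  end

  # Equivalent Enum.reduce/3 call
  def with_enum(list) do
    Enum.reduce(list, %{}, fn x, acc -> Map.update(acc, x, 1, &(&1 + 1)) end)
  end
end

FrequencyDemo.with_for([1, 2, 3, 4, 1, 2])
# => %{1 => 2, 2 => 2, 3 => 1, 4 => 1}
```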
Recursion
Looping in functional programming languages like Elixir is ultimately performed using recursion, and the built-in Enum functions rely on it internally to iterate over data structures. Recursion is a technique where a function processes some data and calls itself with the remaining data. A tail-recursive function is one where the call to itself is the last expression evaluated in the body. Since Elixir performs tail call optimisation, tail-recursive functions run in a constant number of stack frames, making them very efficient for enumerating large datasets. Unlike traditional for loops in other languages, where data in the outer scope is accessed and mutated inside the loop, recursion involves creating new versions of data and continuously passing them as arguments into successive calls of the same function.
defmodule Test do
def recursive_sum([hd | tl]), do: hd + recursive_sum(tl)
def recursive_sum([]), do: 0
def tail_recursive_sum([hd | tl], sum), do: tail_recursive_sum(tl, sum + hd)
def tail_recursive_sum(_, sum), do: sum
end
Test.recursive_sum([1, 2, 3, 4])
10
Test.tail_recursive_sum([1, 2, 3, 4], 0)
10
Every time a function is executed, a separate frame is created in the process’s stack that holds information such as local variables in the function. This frame takes up memory in the stack and is deleted once the function is executed and a value is returned from it to its caller.
In the above example, the function recursive_sum is not tail-recursive, since the last thing that happens is the addition of hd and the result of the recursive call recursive_sum(tl). In this case, the function needs to keep track of the variable hd until the recursive call returns, so the current function’s stack frame must be retained. Thus n stack frames are created for n recursive calls and all of them are present on the stack at the same time, taking up memory. Hence this is less efficient and can consume a large amount of memory for very large datasets.
But in the function tail_recursive_sum, the last thing that happens is a call to itself with the remaining elements of the list and the updated sum as arguments. Since the updated sum is passed on to the next call as an argument, there is no need to keep track of it in the current function’s stack frame, and hence the frame can be discarded before the next recursive call begins. This is exactly what tail call optimisation does, keeping the number of stack frames constant at any point and making tail recursion efficient for large datasets.
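A common way to package a tail-recursive function is to hide the accumulator behind a public wrapper, so that callers do not have to pass the initial value themselves. The sketch below assumes hypothetical names SumDemo and do_sum.

```elixir
defmodule SumDemo do
  # Public entry point: callers pass only the list.
  def sum(list), do: do_sum(list, 0)

  # The recursive call is the last expression, so this is tail-recursive.
  defp do_sum([head | tail], acc), do: do_sum(tail, acc + head)
  defp do_sum([], acc), do: acc
end

SumDemo.sum([1, 2, 3, 4])
# => 10
SumDemo.sum(Enum.to_list(1..1_000_000))
# => 500000500000
```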