Elixir : Basics of List datatype
List is a built-in collection datatype in elixir that can contain ordered heterogeneous elements. Lists in elixir are built on top of the erlang type system’s List datatype of the same name. Lists in elixir internally behave as singly linked lists.
Syntax
Lists can be defined using the square braces syntax []
, containing the elements as comma separated values within the square braces.
a = [] #empty list
b = [1, 2.0, :atom_1, "ello", [2,3,4]] #list containing five elements
Since they are internally linked lists, every list can be represented as a head-tail pair. The head is the first element of the list and the tail is a list containing the rest of the elements. This pattern continues recursively until the end of the list, where the last tail is implicitly an empty list. The head and tail representation syntax of a list uses the cons |
operator to separate the head and tail of the list. Let’s take a list containing 5 elements such as [1, 2, 3, 4, 5]
. In this list, the element 1 is the head and the tail is a list containing the rest of the elements. So this list can also be written according to the head and tail notation as [1 | [2, 3, 4, 5]]
. The tail of the list, also being another list, can further be written using another head and tail notation. This notation is expressive and is useful in recursion, pattern matching and prepending elements to the list. Please note that while using the head and tail notation, if the last element in the list is not denoted as a list, the result would be an improper list.
[1,2,3,4,5]
[1 | [2,3,4,5]]
[1 | [2 | [3,4,5]]]
[1 | [2 | [3 | [4,5]]]]
[1 | [2 | [3 | [4 | [5]]]]]
[1 | [2 | [3 | [4 | [5 | []]]]]]
[1 | [2 | [3]] #valid list [1,2,3]
[1 | [2 | 3]] #improper list [1,2|3]
Appending and prepending
Elements can be appended to the end of the lists using the ++
list concatenation operator provided in the Kernel module. The second operand must also be a list containing the elements to be appended. Using the ++
operator on a single element without enclosing it with square braces would again result in an improper list. Similar to the list concatenation operator there is also a list subtraction operator, --
which removes the first occurrence of all the elements in the second operand from the first operand.
[1, 2, 3] ++ [4, 5]
[1, 2, 3, 4, 5]
[1, 2, 3, 4] ++ [5]
[1, 2, 3, 4, 5]
[1, 2, 3, 4] ++ 5
[1, 2, 3, 4 | 5]
[1, 2, 3, 1, 2] -- [1, 2, 4]
[3, 1, 2]
Prepending an element to the list is a constant time operation and is much more efficient compared to appending elements to the list. Prepending can be done using the head and tail notation by mentioning the new element to be appended as head and the original list as the tail. Multiple elements can also be prepended to the list.
a = [2, 3, 4]
[1 | a]
[1, 2, 3, 4]
[-1, 0, 1 | a]
[-1, 0, 1, 2, 3, 4]
Reference modules
The Kernel module has functions such as hd/1
which returns head of the list, tl/1
that returns the tail of the list, length/1
that returns the length of the list and operators such as in/2
that returns true if an element is present in the list. The List module provides numerous functions that operates and manipulates the list data type. Please note that all data types are immutable in elixir and hence all functions that operate on lists return a new list and never modifies the original list. Since the List data type implements the Enumerable protocol, the Enum and the Stream modules containing numerous useful functions, can operate on the lists.
hd([1, 2, 3])
1
tl([1, 2, 3])
[2, 3]
length([1, 2, 3])
3
1 in [1, 2, 3]
true
5 in [1, 2, 3]
false
Enum.at([1, 2, 3], 0)
1
Charlists
Please note that when lists contain integers that are all valid ASCII characters, then the list is printed as a charlist. In fact, a charlist is just a list of integers that denote printable characters. A charlist can be directly defined using the 'content'
syntax and it has been a part of elixir in order to be backward compatible with erlang. The default string representation in erlang is a charlist and strings are represented as binaries in elixir. Hence the binary string representation in elixir should be converted into a charlist before using an erlang function that accepts a string as an argument and vice versa.
IO.inspect([97, 98, 99])
~c"abc" # prints the equivalent ascii characters - abc
IO.inspect([97, 98, 1000])
[97, 98, 1000] # prints [97,98,1000] as 1000 is not a valid ascii char
'hello' == [104, 101, 108, 108, 111]
true
'😘🖤💖'
[128536, 128420, 128150]
Keyword lists
Keyword lists are just lists with a specific structure. Every element in a keyword list is a two element tuple where the first element is an atom used as a key and the second element could be any term denoting the value associated with the atom key. They are mainly used for sending a list of options to function calls and other constructs.
keyword_list = [{:one, 1}, {:two, 2}]
[one: 1, two: 2]
Since keyword lists are commonly used in Elixir, a shorter syntax without the curly braces and with the colon present after the atom key, can be used to denote a keyword list.
[{:one, 1}, {:two, 2}] == [one: 1, two: 2]
true
Since keyword lists are internally lists, the main advantages of using them for key based value access over maps, is that their keys are ordered and a key can be present multiple times, being associated with multiple values. But most of the operations will take linear time since they are internally lists.
k_list = [key1: 1, key2: 2, key1: :one]
[key1: 1, key2: 2, key1: :one]
Since passing them as the last argument as options to function calls is a very common occurrence, the square braces surrounding the keyword list can be removed. This is only valid if the keyword list being passed as an argument is the last of all the arguments.
Module.function(arg1, [one: 1, two: 2]) #valid
Module.function(arg1, one: 1, two: 2) #valid
Module.function([one: 1, two: 2], arg2) #valid
Module.function(one: 1, two: 2, arg2) # invalid
** (SyntaxError) invalid syntax found on iex:67:31:
error: unexpected expression after keyword list. Keyword lists must always
come as the last argument. Therefore, this is not allowed:
function_call(1, some: :option, 2)
Instead, wrap the keyword in brackets:
function_call(1, [some: :option], 2)
The main functionality of keyword lists are the key based value access. They allow using the access syntax list[:key]
, similar to maps for accessing values using keys. If a key is present multiple times, then the value associated with the first occurrence of the key will be returned.
kw_list = [one: 1, two: 2, two: "2"]
[one: 1, two: 2]
kw_list[:one]
1
kw_list[:three]
nil
kw_list[:two]
2
Keyword lists have their own dedicated module called Keyword which contains a lot of functions that can be used to read and manipulate keyword lists.
If you are curious about the internal data representation of the lists, check out this article.