Elixir: Basics of the Map data type
The map data type in Elixir is an associative collection that stores key-value pairs. Any term can be used as a key or a value, and duplicate keys are not allowed. A variable can also appear in the key position, in which case its value becomes the key. Internally, maps are implemented as two different structures depending on their size. Small maps with 32 or fewer keys are implemented as a sorted tuple structure, and large maps with more than 32 keys are implemented as persistent hash array mapped tries. The keys in small maps are sorted by type and value, whereas large maps do not follow any sorting order.
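As a small illustration of the "no duplicate keys" rule, the sketch below builds a map from a list that repeats a key and then checks how keys are compared. The variable names are made up for this example.

```elixir
# When the same key appears twice, the later pair wins.
map = Map.new([{"k", 1}, {"k", 2}])
map_size(map) # 1
map["k"]      # 2

# Keys are matched with strict equality, so the integer 1 and
# the float 1.0 are two different keys.
numbers = %{1 => :integer, 1.0 => :float}
map_size(numbers) # 2
numbers[1]        # :integer
numbers[1.0]      # :float
```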
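Data type ordering in Elixir
number < atom < reference < function < port < pid < tuple < map < list < bitstring
In ordinary comparisons, integers and floats are compared by value. Only when map keys are ordered is an integer placed before a float.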
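The ordering can be observed by sorting a list that mixes several types. This is only a sketch using Enum.sort/1 on a plain list. It does not inspect how a map stores its keys.

```elixir
mixed = [[1], "str", %{}, {1}, :atom, 2.5, 1]

# Enum.sort/1 uses the term ordering described above.
sorted = Enum.sort(mixed)
# sorted is [1, 2.5, :atom, {1}, %{}, [1], "str"]

# Outside of map keys, numbers are compared by value.
1 < 2.5  # true
1 == 1.0 # true
```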
Syntax
Maps in Elixir are defined using the %{} syntax with comma-separated key => value pairs. If the key is an atom, the pair can also be written with the shorter atom_content: value syntax. When the two syntaxes are mixed in one map, the atom_content: value pairs must always come at the end.
%{} # an empty map
%{"one" => 1, :two => 2, [3] => 3} # map with three key-value pairs
%{"one" => 1, two: 2} # map with `key: value` syntax for atom keys
%{two: 2, "one" => 1} # invalid as `atom: value` syntax is not at the end
** (SyntaxError) iex:48:9: unexpected expression after keyword list....
a = "key"
%{a => "value"} # map that uses a variable as key -> %{"key" => "value"}
%{"two" => 2, 4 => 4, 3 => 3, one: 1}
# Since the above is a small map, the keys will be sorted and ordered as
# %{3 => 3, 4 => 4, :one => 1, "two" => 2}
Reading data
Data can be read from a map in several ways. The access syntax map[key] returns the value associated with a key. If the key is an atom, the map.key syntax can also be used. For nested maps, the keys can be chained in both syntaxes. In addition, the Map module provides functions such as get/3 and fetch/2 that return the value associated with a key. It is useful to know how these options differ, in particular what happens when the accessed key is not present in the map. The Kernel module also provides get_in/2, which reads data from nested maps given the list of keys that form the path.
map = %{"one" => 1, "key_2" => %{3 => 3}, two: 2, key_3: %{four: 4}}
map["one"] # 1
map["key_2"][3] # 3
map["not a key"] # nil
map.two # 2
map.key_3.four # 4
map.no_key
** (KeyError) key :no_key not found in: %{:two => 2, :key_3 => %{four: 4}, "key_2" => %{3 => 3}, "one" => 1}
Map.fetch(map, "one") # {:ok, 1}
Map.fetch(map, "nokey") # :error
Map.get(map, "one") # 1
Map.get(map, "nokey") # nil
Map.get(map, "nokey", "default value") # "default value"
get_in(map, ["key_2", 3]) # 3
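One practical difference between these lookups shows up when a key is present but its value is nil. The sketch below compares Map.get/2 and Map.fetch/2 in that case. The describe function is a hypothetical helper written for this example, not part of the Map module.

```elixir
map = %{"present" => nil}

# Both calls return nil, so a stored nil and a missing key look the same.
Map.get(map, "present") # nil
Map.get(map, "missing") # nil

# Map.fetch/2 tells the two cases apart.
Map.fetch(map, "present") # {:ok, nil}
Map.fetch(map, "missing") # :error

# Hypothetical helper that relies on the difference.
describe = fn m, key ->
  case Map.fetch(m, key) do
    {:ok, value} -> "found #{inspect(value)}"
    :error -> "no such key"
  end
end

describe.(map, "present") # "found nil"
describe.(map, "missing") # "no such key"
```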
Updating data
Data in maps can be updated using functions from the Map module such as put/3 and put_new/3. Elixir also provides a dedicated update syntax, %{map | key => value}, which uses the | operator and can update the values of several keys at once. This syntax only works for keys that are already present in the map. Similar to get_in/2, the Kernel module provides put_in/3 and update_in/3 to update values in a nested map structure. Note that data in Elixir is immutable, so all of these functions return a new version of the map and do not modify the existing map.
map = %{1 => 1}
Map.put(map, 1, :one) # %{1 => :one}
Map.put(map, 2, 2) # %{1 => 1, 2 => 2}
Map.put_new(map, 1, :one) # %{1 => 1}
Map.put_new(map, 2, 2) # %{1 => 1, 2 => 2}
%{map | 1 => :one} # %{1 => :one}
%{map | 2 => :two}
** (KeyError) key 2 not found in: %{1 => 1}
map = Map.put(map, 2, 2) # %{1 => 1, 2 => 2}
%{map | 1 => :one, 2 => :two} # %{1 => :one, 2 => :two}
map = %{1 => %{2 => 2}}
put_in(map, [1, 2], :two) # %{1 => %{2 => :two}}
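The examples above do not show update_in/3, so here is a minimal sketch of it next to put_in/3. It also checks that the original map is left untouched. The map contents and variable names are invented for illustration.

```elixir
nested = %{"counts" => %{"a" => 1}}

# update_in/3 applies a function to the value found at the path.
updated = update_in(nested, ["counts", "a"], fn n -> n + 1 end)
# updated is %{"counts" => %{"a" => 2}}

# put_in/3 can also add a new key at the last step of the path,
# as long as every earlier key in the path already exists.
extended = put_in(nested, ["counts", "b"], 5)
# extended is %{"counts" => %{"a" => 1, "b" => 5}}

# Both calls returned new maps; the original is unchanged.
nested == %{"counts" => %{"a" => 1}} # true
```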
Reference modules
As mentioned above, the Map module contains numerous functions that operate on maps. In addition, the Kernel module provides functions that are allowed in guards, such as is_map/1, is_map_key/2 and map_size/1. Maps also implement the Enumerable protocol, which lets the functions in the Enum and Stream modules operate on maps.
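A short sketch of both ideas follows: guards built from is_map/1 and map_size/1, and Enum functions applied to a map. MapDemo and its describe/1 function are hypothetical names used only for this example.

```elixir
defmodule MapDemo do
  # Hypothetical function: is_map/1 and map_size/1 are allowed in guards.
  def describe(term) when is_map(term) and map_size(term) == 0, do: :empty_map
  def describe(term) when is_map(term), do: {:map, map_size(term)}
  def describe(_term), do: :not_a_map
end

map = %{a: 1, b: 2}

MapDemo.describe(map)    # {:map, 2}
MapDemo.describe(%{})    # :empty_map
MapDemo.describe([1, 2]) # :not_a_map

# Enumerating a map yields {key, value} tuples.
total = Enum.reduce(map, 0, fn {_key, value}, acc -> acc + value end)
# total is 3

filtered = Enum.filter(map, fn {_key, value} -> value > 1 end)
# filtered is [b: 2], a list of tuples rather than a map

# Map.new/2 builds a map again from the transformed pairs.
doubled = Map.new(map, fn {key, value} -> {key, value * 2} end)
# doubled is %{a: 2, b: 4}
```

Enum functions return lists, so a step such as Map.new/2 or Enum.into/2 is needed whenever the result should be a map again.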
If you are curious about how maps are represented internally, please check out these articles.
Internal data representation of small maps.
Internal data representation of large maps.