Elixir: Basics of the Atom Datatype

Arunmuthuram M
4 min read · Sep 9, 2023

Atoms in Elixir are constants whose names are their own values. They are similar to enums and symbols in other programming languages. Atoms are used widely in Elixir and correspond directly to the datatype of the same name in the underlying Erlang type system. They are an efficient alternative to defining string constants like the following.

ok = "ok"
error = "error"

Syntax

An atom in Elixir is written with a leading : symbol followed by the atom's content. There are two main syntaxes, and which one applies depends on that content. Simple atoms use the :content syntax. They must start with a letter or an underscore, may contain alphanumeric characters, underscores and the @ symbol, and must end with an alphanumeric character, ? or !.

Atoms whose content violates these rules can be defined by wrapping the content in double quotes, using the :"content" syntax. The maximum length of an atom is 255 characters. Note that non-ASCII characters can count for more than one toward this limit, since they take up more space when encoded: á occupies 2 bytes in UTF-8 and 🙂 occupies 4, so they can be counted as 2 and 4 characters respectively.

:atom_1 # starts with a letter, contains alphanumerics and an underscore

:_atom@one? # starts with an underscore, contains alphanumerics and @, ends with ?

:áéíó_123úüñ! # starts with a letter, contains alphanumerics and an underscore, ends with !

:"123 go crazy á, é, í, ó, ú, ü, ñ, ¿, ¡ !@#$%^&*()-_ 🙂" # uses the :"" syntax, can contain any character within the quotes

:"🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂🙂"
** (SyntaxError) iex:18:1: atom length must be less than system limit # 64 🙂 * 4 = 256 which is greater than the allowed 255 limit
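As a quick check of the two syntaxes, here is a minimal sketch (not from the original article) that converts between strings and atoms with String.to_atom/1 and Atom.to_string/1. It only uses hard-coded strings, so it does not create atoms from arbitrary input.

```elixir
# An atom built from a string is the very same atom as the matching literal.
simple = String.to_atom("atom_1")
IO.inspect(simple == :atom_1)            # true

# Content that breaks the simple-atom rules is printed with the quoted syntax.
quoted = String.to_atom("123 go")
IO.inspect(quoted)                       # :"123 go"

# Converting back to a string drops the leading colon.
IO.inspect(Atom.to_string(:_atom@one?))  # "_atom@one?"
```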

Besides the two syntaxes above, atoms can also be written as identifiers that start with an uppercase letter, without the leading colon.

Atom

Another_atom

This syntax is an alias for :"Elixir.Content". Internally, atoms written this way are expanded with an Elixir. prefix. The alias syntax is most commonly used for module names.

Atom == :"Elixir.Atom"
true
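The expansion can be observed directly. The sketch below reuses the Another_atom alias from above, which does not need to refer to an existing module, and uses Module.concat/1 to build the same atom from a string.

```elixir
# An alias is an ordinary atom whose content carries the "Elixir." prefix.
IO.inspect(is_atom(Another_atom))                            # true
IO.inspect(Atom.to_string(Another_atom))                     # "Elixir.Another_atom"

# Module.concat/1 produces the same prefixed atom from its parts.
IO.inspect(Module.concat(["Another_atom"]) == Another_atom)  # true
```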

Efficiency

An atom is unique, and two atoms with the same content are always equal. Atoms are more efficient than strings in terms of memory usage and comparison because they are interned in a global atom table. An atom is created only once in the atom table, and any later atom with the same content refers to that existing entry. Comparison is also very fast, since only the indexes into the atom table are compared and not the actual content of the atoms. This is why atoms are used widely as keys or tags that aid in efficient pattern matching of structures.

:atom_1 == :atom_1 # true
{:ok, resp_data} = http_resp # atoms used as tags for tuples that aid in pattern matching
%{a: a_val} = %{a: 3, b: 5} # atoms used as keys in maps that enable efficient pattern matching
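To illustrate the tagging idea, here is a small sketch in which function clauses are selected by the atom at the head of a tuple. The ResponseHandler module and its handle/1 function are hypothetical names used only for this example.

```elixir
defmodule ResponseHandler do
  # Hypothetical helper: the atom tag decides which clause matches.
  def handle({:ok, data}), do: "received: #{data}"
  def handle({:error, reason}), do: "failed: #{reason}"
end

IO.puts(ResponseHandler.handle({:ok, "payload"}))    # received: payload
IO.puts(ResponseHandler.handle({:error, :timeout}))  # failed: timeout
```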

1,048,576

One important thing to note about atoms is that the number of atoms that can be created in a VM instance is finite. The default limit is 1,048,576, which can be changed with the +t flag when the VM is started. Once the limit is set at start-up, it cannot be changed at runtime. The limit also covers all the internal atoms created by both Elixir and Erlang. Atoms in the atom table are never garbage collected, and if the number of atoms reaches the limit, the VM crashes. To avoid this, atoms must not be created dynamically at runtime from user input.
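The current usage and the limit can be read with :erlang.system_info/1. The sketch below also shows one common way to guard against unbounded atom creation, using String.to_existing_atom/1, which raises an ArgumentError instead of creating a new atom. The SafeAtom module and its from_input/1 function are hypothetical names for this example, and the second input string is assumed not to exist as an atom anywhere in the running system.

```elixir
# Inspect the atom table of the running VM.
IO.inspect(:erlang.system_info(:atom_count), label: "atoms in use")
IO.inspect(:erlang.system_info(:atom_limit), label: "atom limit")

defmodule SafeAtom do
  # Hypothetical helper: converts only strings that already exist as atoms,
  # so untrusted input can never grow the atom table.
  def from_input(input) when is_binary(input) do
    {:ok, String.to_existing_atom(input)}
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(SafeAtom.from_input("ok"))                                  # {:ok, :ok}
IO.inspect(SafeAtom.from_input("surely_not_an_existing_atom_8f3a1c"))  # :error
```

When starting a session, the flag can be passed through to the Erlang VM, for example with iex --erl "+t 2097152".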

System generated atoms

Some of the internally created atoms are :true and :false, which Elixir uses as its boolean values and which can be written without the colon as true and false. Another notable internal atom is :nil, which can likewise be written as nil. When you fire up an iex session, around 20k atoms have already been created internally by various processes, Elixir and Erlang.

true == :true # true
false == :false # true
nil == :nil # true
Kernel.is_atom(true) # true
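A few more checks along the same lines, as a small sketch using the Kernel guards is_boolean/1, is_nil/1 and is_atom/1:

```elixir
# The boolean and nil guards accept the colon-prefixed forms as well.
IO.inspect(is_boolean(:true))       # true
IO.inspect(is_boolean(:false))      # true
IO.inspect(is_nil(:nil))            # true
IO.inspect(is_atom(nil))            # true

# Any other atom is not a boolean.
IO.inspect(is_boolean(:ok))         # false

# The content of the atom true is simply the string "true".
IO.inspect(Atom.to_string(true))    # "true"
```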

Elixir also represents all loaded modules internally as atoms. This includes the built-in modules of Elixir and Erlang as well as user-defined modules. Erlang modules are referenced with the plain atom syntax, such as :math, while Elixir modules are atoms with an Elixir. prefix, such as :"Elixir.String", which as seen above has the alias String. So whenever you reference an Elixir module, you are just using an alias for the atom :"Elixir.ModuleName".

String == :"Elixir.String" # true
String.length("123") # 3
:"Elixir.String".length("123") # 3
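Because a module name is just an atom, it can be stored in a variable or passed to apply/3 like any other value. A minimal sketch:

```elixir
# Erlang modules are plain atoms; Elixir modules are atoms with the "Elixir." prefix.
IO.inspect(is_atom(:math))                             # true
IO.inspect(is_atom(String))                            # true
IO.inspect(Atom.to_string(String))                     # "Elixir.String"

# A module held in a variable can be called dynamically.
mod = String
IO.inspect(mod.length("123"))                          # 3
IO.inspect(apply(:"Elixir.String", :length, ["123"]))  # 3

# Calling an Erlang module uses the same dot syntax on the plain atom.
IO.inspect(:math.pow(2, 3))                            # 8.0
```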

If you are curious about the internal data representation of atoms, check out this article.
