28 Days - Intro to Elixir Enumerable
My co-worker Ben recently sent me a message with a blog post idea: looking at how Enumerable is used by Elixir’s Enum module. Specifically, he told me that all of Enum’s behavior can be implemented just by writing a single reduce function. If you’re like me, you might be thinking: “what, how is this?” Let’s dive in.
The goal of today’s post isn’t to show how to write your own Enumerable implementation, but rather to look at how Enum is implemented and how it can build all of its functionality on top of such a simple Enumerable interface.
Enumerable Protocol
The Elixir Enumerable docs seem like a good place to start this investigation. Sure enough, here we have it:
This protocol requires four functions to be implemented, reduce/3, count/1, member?/2, and slice/1. The core of the protocol is the reduce/3 function. All other functions exist as optimizations paths for data structures that can implement certain properties in better than linear time.
Okay, so reduce/3 is the only function that has to do real work; the other three can return {:error, __MODULE__} and let Enum fall back to a reduce-based default. However, reduce/3 has a very particular type signature to make this all possible. A picture is worth a thousand words, so here is the reduce implementation for List:
def reduce(_, {:halt, acc}, _fun), do: {:halted, acc}
def reduce(list, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
def reduce([], {:cont, acc}, _fun), do: {:done, acc}
def reduce([h | t], {:cont, acc}, fun), do: reduce(t, fun.(h, acc), fun)
The final accumulated value is returned as {:done, term}. Two alternate states should be handled: halt and suspend. The suspend state in particular is a special case, noted as not being needed by most regular enumerables.
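To make the three tagged accumulators concrete, here is a small sketch that calls Enumerable.reduce/3 directly. The ReduceDemo module and its function names are made up for illustration; only Enumerable.reduce/3 and the {:cont | :halt | :suspend, acc} tuples come from the protocol.

```elixir
defmodule ReduceDemo do
  # Hypothetical helper module, for illustration only.

  # Runs to completion: the reducer always answers {:cont, acc}.
  def sum(enumerable) do
    Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
  end

  # Stops early: answers {:halt, acc} once the running sum exceeds `limit`.
  def sum_until(enumerable, limit) do
    Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
      total = acc + x
      if total > limit, do: {:halt, total}, else: {:cont, total}
    end)
  end

  # Pauses after every element: {:suspend, acc} hands back a continuation.
  def step(enumerable) do
    Enumerable.reduce(enumerable, {:cont, []}, fn x, acc -> {:suspend, [x | acc]} end)
  end
end

ReduceDemo.sum([1, 2, 3])          # {:done, 6}
ReduceDemo.sum_until([1, 2, 3], 2) # {:halted, 3}

# The continuation takes a new tagged accumulator and resumes where it left off.
{:suspended, [1], continuation} = ReduceDemo.step([1, 2, 3])
continuation.({:cont, [1]})        # {:suspended, [2, 1], next_continuation}
continuation.({:halt, [1]})        # {:halted, [1]}
```

The caller decides what to pass to the continuation, which is what lets one enumeration be driven step by step from the outside.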
How is reduce/3 used?
It took me a second to realize how this singular reduce/3 function signature can be used to implement all Enum functions. Let’s look at a few examples:
def any?(enumerable, fun) do
  Enumerable.reduce(enumerable, {:cont, false}, fn entry, _ ->
    if fun.(entry), do: {:halt, true}, else: {:cont, false}
  end)
  |> elem(1)
end
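In the any? implementation, the reducer halts with true as soon as the function returns a truthy value for an entry; otherwise it continues the enumeration with false. The trailing elem(1) pulls the result out of the {:halted, acc} or {:done, acc} tuple.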
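The same pattern extends to other functions. The sketch below is not how Enum itself is written; MiniEnum is a hypothetical module that rebuilds three familiar functions on nothing but Enumerable.reduce/3, to show how the choice between {:cont, acc} and {:halt, acc} shapes each one.

```elixir
defmodule MiniEnum do
  # Hypothetical module for illustration; not the real Enum source.

  # Halts with false on the first entry for which `fun` is falsy.
  def all?(enumerable, fun) do
    enumerable
    |> Enumerable.reduce({:cont, true}, fn entry, _acc ->
      if fun.(entry), do: {:cont, true}, else: {:halt, false}
    end)
    |> elem(1)
  end

  # Halts with the first matching entry; returns nil if nothing matches.
  def find(enumerable, fun) do
    enumerable
    |> Enumerable.reduce({:cont, nil}, fn entry, acc ->
      if fun.(entry), do: {:halt, entry}, else: {:cont, acc}
    end)
    |> elem(1)
  end

  # Never halts: prepends each mapped entry, then reverses the result.
  def map(enumerable, fun) do
    enumerable
    |> Enumerable.reduce({:cont, []}, fn entry, acc -> {:cont, [fun.(entry) | acc]} end)
    |> elem(1)
    |> :lists.reverse()
  end
end

MiniEnum.all?([2, 4, 6], &(rem(&1, 2) == 0)) # true
MiniEnum.find([1, 2, 3, 4], &(&1 > 2))       # 3
MiniEnum.map(1..3, &(&1 * 2))                # [2, 4, 6]

# Halting is what makes infinite enumerables usable:
MiniEnum.find(Stream.iterate(1, &(&1 * 2)), &(&1 > 100)) # 128
```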
Let’s look at a function that utilizes the optional count Enumerable function:
def count(enumerable) do
  case Enumerable.count(enumerable) do
    {:ok, value} when is_integer(value) ->
      value

    {:error, module} ->
      enumerable |> module.reduce({:cont, 0}, fn _, acc -> {:cont, acc + 1} end) |> elem(1)
  end
end
In the above Enum.count implementation, the Enumerable.count/1 function is invoked for the enumerable. If it returns {:ok, value} with an integer value, that value is returned. Otherwise, it returns {:error, module} and that module’s reduce function is used for a linear time count implementation. It’s possible to see how the reduce function is used for a count, keeping an accumulator that starts at 0 and becomes acc + 1 on each enumeration.
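Both branches can be observed from a script or IEx session. This is a small sketch using a range (which knows its size) and a lazy stream (which does not); the variable names are arbitrary.

```elixir
# A range knows its size, so its count/1 answers directly.
{:ok, 10} = Enumerable.count(1..10)

# A lazy stream cannot know its size up front, so its count/1
# returns {:error, module} to request the reduce-based fallback.
stream = Stream.filter(1..10, &(rem(&1, 2) == 0))
{:error, module} = Enumerable.count(stream)

# The fallback, written out by hand as in the Enum.count source above:
by_hand =
  stream
  |> module.reduce({:cont, 0}, fn _, acc -> {:cont, acc + 1} end)
  |> elem(1)

# Enum.count/1 hides the difference between the two paths.
by_hand            # 5
Enum.count(stream) # 5
Enum.count(1..10)  # 10
```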
Finally, certain Enum functions rely on a set of macros defined in Stream.Reducers. These macros appear as R.fn in the Enum code, because the module is required and aliased as R.
The Reducers module is implemented using some interesting metaprogramming. If we look at the map/2 macro definition, it utilizes a next/3 function which isn’t defined in the macro module itself. It is actually defined in the Enum module. This works because the code a macro returns is not evaluated where the macro is defined; it is injected into the calling module, and the call to next/3 is resolved there. I talked about this trait of macros in the Beauty of Macros post.
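Here is a stripped-down sketch of that trick. DemoReducers, ListCollector, and SumCollector are hypothetical modules, and the simplified next/2 is an assumption made for illustration; the real Stream.Reducers and Enum code differ in detail. The point is only that the macro calls next/2 without defining it, and each calling module supplies its own.

```elixir
defmodule DemoReducers do
  # Hypothetical stand-in for Stream.Reducers.
  # Note that next/2 is NOT defined anywhere in this module.
  defmacro map(callback) do
    quote do
      fn entry, acc -> next(unquote(callback).(entry), acc) end
    end
  end
end

defmodule ListCollector do
  require DemoReducers

  def map(enumerable, fun) do
    enumerable
    |> Enumerable.reduce({:cont, []}, DemoReducers.map(fun))
    |> elem(1)
    |> :lists.reverse()
  end

  # After expansion, the quoted next/2 call resolves to this function.
  defp next(entry, acc), do: {:cont, [entry | acc]}
end

defmodule SumCollector do
  require DemoReducers

  def sum_by(enumerable, fun) do
    enumerable
    |> Enumerable.reduce({:cont, 0}, DemoReducers.map(fun))
    |> elem(1)
  end

  # Same macro, different next/2: mapped entries are summed instead of collected.
  defp next(entry, acc), do: {:cont, entry + acc}
end

ListCollector.map([1, 2, 3], &(&1 * 2)) # [2, 4, 6]
SumCollector.sum_by(1..3, &(&1 * 10))   # 60
```

Because the mapping logic lives in one macro and the "what to do with each result" logic lives in the caller, the same reducer body can be shared by modules that accumulate in different ways.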
Writing Enumerable Implementations
I’m purposely keeping this post fairly light on how to implement the Enumerable protocol in practice; there are several good resources on how to do that. Understanding the underlying implementation is more of my focus here. However, there is benefit in looking at the Enumerable implementations for some core types like List and Map. They live in the same source file as the Enum module and can serve as a good starting point when writing your own.
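As a rough illustration of how far the List clauses carry over, here is a minimal sketch for a made-up Countdown struct. The struct and its from field are hypothetical. Only reduce/3 does real work; the other three callbacks return {:error, __MODULE__} so that Enum falls back to reduce.

```elixir
defmodule Countdown do
  # Hypothetical struct: enumerates from, from - 1, ..., 1.
  defstruct [:from]
end

defimpl Enumerable, for: Countdown do
  # Same four-clause shape as the List implementation shown earlier.
  def reduce(_countdown, {:halt, acc}, _fun), do: {:halted, acc}

  def reduce(countdown, {:suspend, acc}, fun),
    do: {:suspended, acc, &reduce(countdown, &1, fun)}

  def reduce(%Countdown{from: 0}, {:cont, acc}, _fun), do: {:done, acc}

  def reduce(%Countdown{from: n}, {:cont, acc}, fun) when is_integer(n) and n > 0,
    do: reduce(%Countdown{from: n - 1}, fun.(n, acc), fun)

  # No faster-than-linear answers available: defer to reduce/3.
  def count(_countdown), do: {:error, __MODULE__}
  def member?(_countdown, _value), do: {:error, __MODULE__}
  def slice(_countdown), do: {:error, __MODULE__}
end

countdown = %Countdown{from: 3}

Enum.to_list(countdown)        # [3, 2, 1]
Enum.map(countdown, &(&1 * 2)) # [6, 4, 2]
Enum.count(countdown)          # 3
Enum.member?(countdown, 2)     # true
Enum.take(countdown, 2)        # [3, 2]
```

A negative or non-integer from matches no clause and raises, which seems acceptable for a sketch but would need handling in real code.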
As you find yourself wondering how Elixir is built internally, keep in mind that the majority of Elixir is simply Elixir code. You most likely won’t need to read Erlang or C code, as you might in other languages. The best answers to your questions and curiosities are found in the code itself.
Thanks for reading the 14th post in my 28 days of Elixir. Keep up through the month of February to see if I can stand subjecting myself to 28 days of straight writing. I am looking for new topics to write about, so please reach out if there’s anything you really want to see!