28 Days - Intro to Elixir Enumerable

My co-worker Ben recently sent me a message for a blog post idea: looking at how Enumerable is utilized by Elixir Enum module. Specifically, he told me that all of Enum behavior can be implemented just by writing a single reduce function. If you’re like me, you might be thinking: “what, how is this?” Let’s dive in.

The goal of today’s post isn’t around how to use Enumerable to write your own, but rather about how Enum is implemented and how it can use such a simple Enumerable interface to achieve all of the Enum functionality.

Enumerable Protocol

The Elixir Enumerable docs seem like a good place to start this investigation. Sure enough, here we have it:

This protocol requires four functions to be implemented, reduce/3, count/1, member?/2, and slice/1. The core of the protocol is the reduce/3 function. All other functions exist as optimizations paths for data structures that can implement certain properties in better than linear time.

Okay, so we know that only reduce/3 is required to be implemented. However, this function has a very particular type signature to make this all possible. A picture is worth a thousand words, so here is the reduce implementation for List:

def reduce(_,       {:halt, acc}, _fun),   do: {:halted, acc}
def reduce(list,    {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
def reduce([],      {:cont, acc}, _fun),   do: {:done, acc}
def reduce([h | t], {:cont, acc}, fun),    do: reduce(t, fun.(h, acc), fun)

The final accumulation value is returned as {:done, term}. Two alternate states should be handled: halt and suspend. suspend in particular is a special case which is noted as not being needed by most regular enumerables.

How is reduce/3 used?

It took me a second to realize how this singular reduce/3 function signature can be used to implement all Enum functions. Let’s look at a few examples:

def any?(enumerable, fun) do
  Enumerable.reduce(enumerable, {:cont, false}, fn entry, _ ->
    if fun.(entry), do: {:halt, true}, else: {:cont, false}
  end)
  |> elem(1)
end

In the any? implementation, the reducer is used in such a way that it halts immediately if the function is truthy, otherwise continuing the enumeration.

Let’s look at a function that utilizes the optional count Enumerable function:

def count(enumerable) do
  case Enumerable.count(enumerable) do
    {:ok, value} when is_integer(value) ->
      value

    {:error, module} ->
      enumerable |> module.reduce({:cont, 0}, fn _, acc -> {:cont, acc + 1} end) |> elem(1)
  end
end

In the above Enum.count implementation, the Enumerable.count/1 function is invoked for the enumerable. If it’s available and is an integer, then it’s returned. Otherwise, the reduce function is used for a linear time count implementation. It’s possible to see how the reduce function is used for a count, keeping an accumulator that starts at 0 and becomes acc + 1 on each enumeration.

Finally, certain Enum functions rely on a set of macros defined in Stream.Reducers. These functions appear as R.fn in the Enum code, as the required module is aliased as such. The Reducers module is implemented using some interesting meta programming. If we look at the map/2 macro definition, it utilizes a next/3 function which isn’t defined in this macro module. This is actually defined in the Enum module, and works due to macros not evaluating code at generation time, but rather at execution time. I talked about this trait of macros in the Beauty of Macros post.

Writing Enumerable Implementations

I’m purposely keeping this post fairly light on how to implement the Enumerable protocol in practice, there are several good resources on how to do that. Understanding the underlying implementation is more of my focus here. However, there is benefit in looking at the Enumerable implementation for some core types like List and Map. The Enum module defines these implementations and they can serve as a good starting point when writing your own.

As you find yourself wondering how Elixir is internally built, keep in mind that the majority of Elixir is simply Elixir code. You most likely won’t need to read erlang or C code like you may need to in other languages. The best answers to your questions and curiosities are found in the code itself.

Thanks for reading the 14th post in my 28 days of Elixir. Keep up through the month of February to see if I can stand subjecting myself to 28 days of straight writing. I am looking for new topics to write about, so please reach out if there’s anything you really want to see!

View other posts tagged: engineering elixir 28 days of elixir