Elixir Probes - Replacing Elixometer

It’s been a while since my last post. Fortunately, I’ve still been chipping away at Elixir, focusing on making sure that other teams at SalesLoft are well equipped to build Elixir microservices. One of the most common items that needs to be implemented on every Elixir project is a set of Datadog metrics. These metrics provide both VM health and application-specific info.

Elixometer

Until now, all apps at SalesLoft have shipped with Elixometer as the instrumentation service in our Elixir apps. Elixometer offers a variety of ways to collect stats and can report to a StatsD server. It also has a variety of problems:

1. The mix.exs entry for Elixometer is quite involved.

There seem to be some incompatibilities between the various libraries that have been released over time. Here is my current Elixometer mix.exs entry:

# start exometer; force "correct" modules due to elixometer not compiling properly
{:elixometer, "~> 1.2"},
{:lager, ">= 3.2.1", override: true},
{:exometer, github: "Feuerlabs/exometer"},
{:exometer_core, "~>1.4.0", override: true},
{:amqp_client, git: "https://github.com/dsrosario/amqp_client.git", branch: "erlang_otp_19", override: true},
# end exometer

I haven’t even tried updating these because it took me several days to get a working config.

2. Elixometer includes libraries that are outside of instrumentation.

For instance, lager is a listed dependency. This may not manifest as a problem in your project, but a problem could also be hiding from you. I discovered that including lager’s error logger module would clear out all of the other SASL error loggers, which is how Bugsnag was hooked in. This meant that Bugsnag was being forcibly removed without my knowledge. The solution here was to disable lager’s error logger.

3. Elixometer is fairly complex to set up and use.

This has been a complaint both across the team and also on my own projects.

I once wanted to report a simple count to Datadog and plot a rate of change. After 1-2 days of trying to figure out why it was not working, I discovered that all stats report as gauges unless a complex setup is used to specify the reporting type. I ended up using a different StatsD library (Statix) at that point.
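
To make the counter versus gauge distinction concrete, here is a minimal sketch of the plain StatsD line format for each type. The module name StatsdLineSketch and the metric name okr_app.signups are made up for illustration; this is not code from Elixometer or Statix.

```elixir
defmodule StatsdLineSketch do
  @moduledoc "Hypothetical helper that formats plain StatsD datagram payloads."

  # Counters use the "c" type: the server adds up the increments it
  # receives during each flush interval, so a rate can be derived.
  def counter(name, value \\ 1), do: "#{name}:#{value}|c"

  # Gauges use the "g" type: the server keeps only the latest value.
  def gauge(name, value), do: "#{name}:#{value}|g"
end

IO.puts(StatsdLineSketch.counter("okr_app.signups"))
IO.puts(StatsdLineSketch.gauge("okr_app.signups", 42))
```

If every stat goes out with the gauge type, each report simply overwrites the previous value, which would explain why a count reported that way never produced a usable rate.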

Instruments - Elixometer’s Replacement

Going forward, I will be using the Instruments library to report probed metrics to Datadog. I will be using Statix to report non-probed application metrics to Datadog.

A probe is a bit of code that runs on a defined interval and reports the statistics on each run. An example of this is asking for VM memory utilization every 1s and sending that to StatsD. Instruments makes defining probes very easy, and I’m going to share my standard configuration.
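
To illustrate the idea, here is a rough sketch of a probe built from a plain GenServer and Process.send_after/3. The ProbeSketch module and its options are hypothetical and only mirror the concept; this is not how Instruments is implemented, and the interval is shortened to 50ms so the example finishes quickly.

```elixir
defmodule ProbeSketch do
  @moduledoc "Hypothetical interval probe, for illustration only."
  use GenServer

  # opts: :name (metric name), :mfa ({module, function, args}),
  # :interval (milliseconds), :report (1-arity function given {name, value}).
  def start_link(opts), do: GenServer.start_link(__MODULE__, Map.new(opts))

  @impl true
  def init(state) do
    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:sample, state) do
    # Call the configured function, hand the value to the reporter,
    # then schedule the next run.
    {mod, fun, args} = state.mfa
    state.report.({state.name, apply(mod, fun, args)})
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :sample, interval)
end

parent = self()

{:ok, _pid} =
  ProbeSketch.start_link(
    name: "erlang.process_count",
    mfa: {:erlang, :system_info, [:process_count]},
    interval: 50,
    report: fn sample -> send(parent, sample) end
  )

# Wait for the first sample and print it.
receive do
  {name, value} -> IO.puts("#{name}: #{value}")
after
  1_000 -> IO.puts("no sample received")
end
```

In a real setup the report function would send the value to StatsD instead of messaging the calling process.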

Instruments Setup

I followed the guide on GitHub to set up Instruments in my application. Outside of the recommended config, I did find that reporting to the standard Logger module during testing makes a ton of sense. To do that, I placed the following in config/test.exs:

config :instruments, reporter_module: Instruments.StatsReporter.Logger

My other config looks like:

config :statix,
  prefix: "okr_app.#{Mix.env}",
  host: System.get_env("STATSD_HOST"),
  port: String.to_integer(System.get_env("STATSD_PORT") || "8125"),
  disabled: System.get_env("STATSD_HOST") == nil

config :instruments,
  fast_counter_report_interval: 100,
  probe_prefix: "probes"

Instruments Probe Definition

Probes are defined in Instruments a bit differently than in Elixometer. Elixometer utilizes a static configuration, but I cannot find such an option for Instruments. I defined probes in my application.ex file:

def setup_probes() do
  # I allow instruments to be disabled as this is an open source application and StatsD isn't required
  if Application.get_env(:instruments, :disabled) != true do
    {:ok, _} = Application.ensure_all_started(:instruments)
    interval = 1_000

    Instruments.Probe.define(
      "erlang.process_count",
      :gauge,
      mfa: {:erlang, :system_info, [:process_count]},
      report_interval: interval
    )

    Instruments.Probe.define(
      "erlang.memory",
      :gauge,
      mfa: {:erlang, :memory, []},
      keys: [:total, :atom, :processes],
      report_interval: interval
    )

    Instruments.Probe.define(
      "erlang.statistics.run_queue",
      :gauge,
      mfa: {:erlang, :statistics, [:run_queue]},
      report_interval: interval
    )

    Instruments.Probe.define(
      "erlang.system_info.process_count",
      :gauge,
      mfa: {:erlang, :system_info, [:process_count]},
      report_interval: interval
    )
  end
end
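
To see what these probes sample, the same calls can be run by hand in iex. This is only a sketch of the values behind the mfa and keys options; using Keyword.take/2 to mimic the keys option is my assumption about the behavior, not a look at the Instruments internals, and the exact metric names it reports per key should be checked against its documentation.

```elixir
# Scalar probes: each call returns a single integer.
process_count = apply(:erlang, :system_info, [:process_count])
run_queue = apply(:erlang, :statistics, [:run_queue])

# :erlang.memory/0 returns a keyword list with many entries, so keep
# only the keys named in the probe definition.
memory =
  :erlang
  |> apply(:memory, [])
  |> Keyword.take([:total, :atom, :processes])

IO.inspect(process_count, label: "erlang.process_count")
IO.inspect(run_queue, label: "erlang.statistics.run_queue")
IO.inspect(memory, label: "erlang.memory")
```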

Instruments’ documentation discusses how custom probes can be designed specifically for your application. In addition to these probes, you can use the Statix library to send StatsD metrics. That is outside the scope of this post, but it is useful to note that Instruments defines various functions for sending data to the underlying Statix module. You can also use Instruments.Statix directly, if that suits your needs better.

Final Thoughts

I haven’t seen any downside with Instruments yet that would make me use Elixometer again. It seems much easier to set up, is more obvious to read, and doesn’t involve interfacing with an Erlang module in a sometimes confusing fashion. Due to all of this, it seems like a great package for setting up StatsD probes.

I hinted at an open source implementation that uses Instruments. I’m putting the final touches on the repo and will open-source it under the SalesLoft GitHub org. There will be a blog post when that happens, as I’ve been trying to use best practices (of the current moment) when building it.
