Verifying Queries with Ecto's prepare_query Callback

You can use many different techniques to build and scale Software as a Service applications. One technique that is very popular is to use a single database for multiple paying customers. This multi-tenant approach to SaaS works well for many people, but there are a few dangers to look out for. The biggest danger is the risk of cross-tenant data leaking. I consider this the worst possible scenario for a multi-tenant SaaS application, even beyond a full system outage.

In this post, we’re going to look at a technique to guarantee that cross-tenant leaks don’t happen in an Elixir application. We’ll be looking at Ecto’s new(ish) prepare_query callback and how it can be used to inspect (almost) every query that goes through your application. I’ll discuss how I test drove a query inspector to inspect every query for tenancy.

The ultimate goal of this post is to serve as a light reference for how I navigated the Ecto.Query struct to implement the tenancy enforcer. The biggest challenge that I faced was understanding what went into the structure and how to walk it.

Let’s first look at what prepare_query is.

Ecto’s prepare_query Callback

The prepare_query callback was introduced in September of 2019. You can define a function (prepare_query) in your Application.Repo module. The function is invoked before a query is executed, and you are provided the full query structure as well as some metadata.

With prepare_query, you can inspect the query or even modify it. The Ecto documentation gives an example where a Repo filters out “soft deleted” records unless the user is an admin. It looks like this:

# From https://hexdocs.pm/ecto/Ecto.Repo.html#c:prepare_query/3

@impl true
def prepare_query(_operation, query, opts) do
  if opts[:admin] do
    {query, opts}
  else
    query = from(x in query, where: is_nil(x.deleted_at))
    {query, opts}
  end
end

We can use this to detect whether a query has tenancy set correctly. We won’t actually modify the query in this post, due to concerns I have with how that approach removes multi-tenancy awareness.

In order to get started, we’ll need to define a function that shows us the Ecto.Query structure. Let’s do that next.

Understanding Ecto Query Structure from Inspect Protocol

In order to get started with a runnable example, I modified test/support/tenancy/repo.ex by replacing prepare_query with the following empty function.

  def prepare_query(_operation, query, opts) do
    IO.inspect query
    {query, opts}
  end

I can then run the test "valid tenancy is only condition" with mix test test/integration/prepare_test.exs:26, and see the following output:

➜  ecto_tenancy_enforcer git:(master) ✗ mix test test/integration/prepare_test.exs:26
#Ecto.Query<from s0 in "schema_migrations", lock: "FOR UPDATE",
 select: type(s0.version, :integer)>
Including tags: [line: "26"]
Excluding tags: [:test]

#Ecto.Query<from c0 in Tenancy.Company, where: c0.tenant_id == 1, select: c0>
.

Finished in 0.2 seconds
43 tests, 0 failures, 42 excluded

That’s not quite helpful, because the Ecto.Query is printed out in text form. We can add structs: false to the IO.inspect call and we get a different result.

%{
  __struct__: Ecto.Query,
  aliases: %{},
  assocs: [],
  combinations: [],
  distinct: nil,
  from: %{
    __struct__: Ecto.Query.FromExpr,
    as: nil,
    hints: [],
    prefix: nil,
    source: {"companies", Tenancy.Company}
  },
  group_bys: [],
  havings: [],
  joins: [],
  limit: nil,
  lock: nil,
  offset: nil,
  order_bys: [],
  prefix: nil,
  preloads: [],
  select: %{
    __struct__: Ecto.Query.SelectExpr,
    expr: {:&, [], [0]},
    fields: nil,
    file: "/Users/stephenbussey/src/ecto_tenancy_enforcer/deps/ecto/lib/ecto/query/planner.ex",
    line: 814,
    params: [],
    take: %{}
  },
  sources: nil,
  updates: [],
  wheres: [
    %{
      __struct__: Ecto.Query.BooleanExpr,
      expr: {:==, [],
       [
         {{:., [], [{:&, [], [0]}, :tenant_id]}, [], []},
         %{
           __struct__: Ecto.Query.Tagged,
           tag: nil,
           type: {0, :tenant_id},
           value: 1
         }
       ]},
      file: "/Users/stephenbussey/src/ecto_tenancy_enforcer/test/integration/prepare_test.exs",
      line: 27,
      op: :and,
      params: []
    }
  ],
  windows: [],
  with_ctes: nil
}

Alright, now we have the actual struct that we can work with. We are capable of writing a query, inspecting the struct, and then figuring out how to walk / enforce that query. If we continued with TDD at this point, we’d eventually hit a snag. Ecto.Query represents referenced tables (joins, some wheres) with a positional index system. However, the structure doesn’t include the list of positional references—you have to build it yourself.

This led to a bit of a pickle, because it’s not documented anywhere and is considered an internal query structure. However, we know of at least one place that knows how to resolve a positional index into a table / module name (Ecto.Query inspect). Let’s find that and see what it’s doing.

A search for defimpl in Ecto brings us to this line, which is the implementation of Inspect for Ecto.Query. In particular, the following code is of interest to us:

  defp to_list(query) do
    names =
      query
      |> collect_sources
      |> generate_letters
      |> generate_names
      |> List.to_tuple()

    ...

With this lead, it’s possible to use these same functions to create a positional index lookup. You can see a finished example of this in EctoTenancyEnforcer.SourceCollector. I modified this to return the schema module name, rather than a letter or table name, so that I could use my enforced_schemas option to check if a referenced table needs checked or not.

We have about as much as we’re going to get out of Ecto. At this point, it’s possible to jump in and start coding a query verifier to do whatever you want. I found myself still a bit lost at this point, but I did know what I wanted to pass / fail. I’ll walk through the steps I took to test drive a query enforcer.

Test Driving EctoTenancyEnforcer

I don’t practice test driven development often, but I find it valuable when I have no clue what the solution is going to be, but I know what I want it to look like. That is the case in a query enforcer, because it’s easy to write out queries that should pass and fail into a test suite. I can then go one-by-one to make the incorrect ones fail, while still having the valid ones pass.

I started with a set of very simple tests, like so:

test "no filters at all" do
  assert_raise(TenancyViolation, fn ->
    Repo.all(Company)
  end)
end

test "valid tenancy is only condition" do
  valid = from c in Company, where: c.tenant_id == 1
  assert Repo.all(valid) |> length == 1
end

I started with these to build some simple confidence. Once they were passing, I listed out about 40 queries that I knew should work or not work. I chipped my way through these and eventually found the patterns in the Ecto.Query struct that led to fairly clean code to walk them.

The final result can be seen in the EctoTenancyEnforcer.QueryVerifier module. I’m sure that there are cases I missed in my TDD, but I’m happy enough with this to use it in production applications. I’ll add new tests as cases are encountered in the wild.

Wrapping Up

Ecto’s prepare_query callback is incredibly powerful for query inspection and modification. It’s a bit dense to get started with, due to the Ecto.Query structure being undocumented, but TDD helped me out significantly. The Ecto.Query walking is undocumented, but I’m anticipating stability in the Query structure going into the future. That may or may not pan out, but I think it’s a decent bet.

If you’re looking for tenancy enforcement in Ecto queries, try out EctoTenancyEnforcer. You can refer to this repo as a complete example of query enforcement.

The Book Plug

My book “Real-Time Phoenix: Build Highly Scalable Systems with Channels” is now in beta through The Pragmatic Bookshelf. This book explores using Phoenix Channels, GenStage, and more to build real-time applications in Elixir. The first draft has been completed for a little bit and the book should be in production by February, with print coming at the end of the production process.

Real-Time Phoenix by The Pragmatic Bookshelf
View other posts tagged: elixir engineering