Verifying Queries with Ecto's prepare_query Callback
You can use many different techniques to build and scale Software as a Service applications. One technique that is very popular is to use a single database for multiple paying customers. This multi-tenant approach to SaaS works well for many people, but there are a few dangers to look out for. The biggest danger is the risk of cross-tenant data leaking. I consider this the worst possible scenario for a multi-tenant SaaS application, even beyond a full system outage.
In this post, we’re going to look at a technique to guarantee that cross-tenant leaks don’t happen in an Elixir application. We’ll be
looking at Ecto’s new(ish) prepare_query
callback and how it can be used to inspect (almost) every query that goes through your
application. I’ll discuss how I test drove a query inspector to inspect every query for tenancy.
The ultimate goal of this post is to serve as a light reference for how I navigated the Ecto.Query
struct to implement
the tenancy enforcer. The biggest challenge that I faced was understanding what went into the structure and how to walk it.
Let’s first look at what prepare_query
is.
Ecto’s prepare_query
Callback
The prepare_query
callback was introduced in September of 2019. You
can define a function (prepare_query
) in your Application.Repo
module. The function is invoked before a query is executed, and you
are provided the full query structure as well as some metadata.
With prepare_query
, you can inspect the query or even modify it. The Ecto documentation gives an example where a Repo filters out
“soft deleted” records unless the user is an admin. It looks like this:
# From https://hexdocs.pm/ecto/Ecto.Repo.html#c:prepare_query/3
@impl true
def prepare_query(_operation, query, opts) do
if opts[:admin] do
{query, opts}
else
query = from(x in query, where: is_nil(x.deleted_at))
{query, opts}
end
end
We can use this to detect whether a query has tenancy set correctly. We won’t actually modify the query in this post, due to concerns I have with how that approach removes multi-tenancy awareness.
In order to get started, we’ll need to define a function that shows us the Ecto.Query
structure. Let’s do that next.
Understanding Ecto Query Structure from Inspect Protocol
In order to get started with a runnable example, I modified
test/support/tenancy/repo.ex
by replacing prepare_query
with the following empty function.
def prepare_query(_operation, query, opts) do
IO.inspect query
{query, opts}
end
I can then run the test "valid tenancy is only condition"
with mix test test/integration/prepare_test.exs:26
, and see the following
output:
➜ ecto_tenancy_enforcer git:(master) ✗ mix test test/integration/prepare_test.exs:26
#Ecto.Query<from s0 in "schema_migrations", lock: "FOR UPDATE",
select: type(s0.version, :integer)>
Including tags: [line: "26"]
Excluding tags: [:test]
#Ecto.Query<from c0 in Tenancy.Company, where: c0.tenant_id == 1, select: c0>
.
Finished in 0.2 seconds
43 tests, 0 failures, 42 excluded
That’s not quite helpful, because the Ecto.Query
is printed out in text form. We can add structs: false
to the IO.inspect
call
and we get a different result.
%{
__struct__: Ecto.Query,
aliases: %{},
assocs: [],
combinations: [],
distinct: nil,
from: %{
__struct__: Ecto.Query.FromExpr,
as: nil,
hints: [],
prefix: nil,
source: {"companies", Tenancy.Company}
},
group_bys: [],
havings: [],
joins: [],
limit: nil,
lock: nil,
offset: nil,
order_bys: [],
prefix: nil,
preloads: [],
select: %{
__struct__: Ecto.Query.SelectExpr,
expr: {:&, [], [0]},
fields: nil,
file: "/Users/stephenbussey/src/ecto_tenancy_enforcer/deps/ecto/lib/ecto/query/planner.ex",
line: 814,
params: [],
take: %{}
},
sources: nil,
updates: [],
wheres: [
%{
__struct__: Ecto.Query.BooleanExpr,
expr: {:==, [],
[
{{:., [], [{:&, [], [0]}, :tenant_id]}, [], []},
%{
__struct__: Ecto.Query.Tagged,
tag: nil,
type: {0, :tenant_id},
value: 1
}
]},
file: "/Users/stephenbussey/src/ecto_tenancy_enforcer/test/integration/prepare_test.exs",
line: 27,
op: :and,
params: []
}
],
windows: [],
with_ctes: nil
}
Alright, now we have the actual struct that we can work with. We are capable of writing a query, inspecting the struct, and then figuring
out how to walk / enforce that query. If we continued with TDD at this point, we’d eventually hit a snag. Ecto.Query
represents
referenced tables (joins, some wheres) with a positional index system. However, the structure doesn’t include
the list of positional references—you have to build it yourself.
This led to a bit of a pickle, because it’s not documented anywhere and is considered an internal query structure. However, we know of at least one place that knows how to resolve a positional index into a table / module name (Ecto.Query inspect). Let’s find that and see what it’s doing.
A search for defimpl
in Ecto brings us to this line,
which is the implementation of Inspect
for Ecto.Query
. In particular, the following code is of interest to us:
defp to_list(query) do
names =
query
|> collect_sources
|> generate_letters
|> generate_names
|> List.to_tuple()
...
With this lead, it’s possible to use these same functions to create a positional index lookup. You can see a finished example of this
in EctoTenancyEnforcer.SourceCollector
. I modified this to return the schema module name, rather than a letter or table name, so that I could use my
enforced_schemas
option to check if a referenced table needs checked or not.
We have about as much as we’re going to get out of Ecto. At this point, it’s possible to jump in and start coding a query verifier to do whatever you want. I found myself still a bit lost at this point, but I did know what I wanted to pass / fail. I’ll walk through the steps I took to test drive a query enforcer.
Test Driving EctoTenancyEnforcer
I don’t practice test driven development often, but I find it valuable when I have no clue what the solution is going to be, but I know what I want it to look like. That is the case in a query enforcer, because it’s easy to write out queries that should pass and fail into a test suite. I can then go one-by-one to make the incorrect ones fail, while still having the valid ones pass.
I started with a set of very simple tests, like so:
test "no filters at all" do
assert_raise(TenancyViolation, fn ->
Repo.all(Company)
end)
end
test "valid tenancy is only condition" do
valid = from c in Company, where: c.tenant_id == 1
assert Repo.all(valid) |> length == 1
end
I started with these to build some simple confidence. Once they were passing, I listed out about 40 queries that I knew should work
or not work. I chipped my way through these and eventually found the patterns in the Ecto.Query
struct that led to fairly clean code to
walk them.
The final result can be seen in the EctoTenancyEnforcer.QueryVerifier
module. I’m sure that there are cases I missed in my TDD, but I’m happy enough with this to
use it in production applications. I’ll add new tests as cases are encountered in the wild.
Wrapping Up
Ecto’s prepare_query
callback is incredibly powerful for query inspection and modification. It’s a bit dense to get started with,
due to the Ecto.Query
structure being undocumented, but TDD helped me out significantly. The Ecto.Query
walking is undocumented,
but I’m anticipating stability in the Query structure going into the future. That may or may not pan out, but I think it’s a decent
bet.
If you’re looking for tenancy enforcement in Ecto queries, try out EctoTenancyEnforcer. You can refer to this repo as a complete example of query enforcement.
The Book Plug
My book “Real-Time Phoenix: Build Highly Scalable Systems with Channels” is now in beta through The Pragmatic Bookshelf. This book explores using Phoenix Channels, GenStage, and more to build real-time applications in Elixir. The first draft has been completed for a little bit and the book should be in production by February, with print coming at the end of the production process.