Ouch! Killing PG Performance with Row-Level Security

Ouch! Killing PG Performance with Row-Level Security
Photo by Hannah Lim / Unsplash

Postgres 11 and higher has a feature called Row Security Policies, or Row-Level Security, which is a great feature for restricting access to rows in a tabel on a per-user basis.

However, if you’re not careful, you can drag down performance with a bad policy.

The post below comes from a recent project. Now that I’ve researched for this post, I want to benchmark these policies against each other and compare their explain plans. Unfortunately, I am writing on a laptop that doesn’t have much installed, and the environment we were working in when we encountered this issue didn’t have adequate explain plans.

The post below has examples, but no benchmarks. Stay tuned for a followup post!

What is RLS in Postgres?

Let’s start with a policy on a single table. This table is called pets and we’re going to let the current_user who is logged in to the database only view pets that belong to them:

CREATE TABLE pets (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    human TEXT NOT NULL
);

ALTER TABLE pets ENABLE ROW LEVEL SECURITY;

CREATE POLICY read_pets ON pets
    FOR SELECT
    USING (human = current_user);

When I run SELECT * FROM PETS I will only get records back that map onto my user. If I try SELECT a pet that does not belong to me, I will get nothing back—as if the record didn’t exist at all.

Performance Foot-Gun

Okay, now let’s shoot ourselvs in the foot with Row-Level Security. Fun!

Let’s create a table called humans that can have many pets through a join table called humans_pets. We’ll also enable Row-Level Security on pets and add in index to the join table:

CREATE TABLE humans (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE pets (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
);

CREATE TABLE humans_pets (
    human_id INTEGER REFERENCES humans(id),
    pet_id INTEGER REFERENCES pets(id),
    PRIMARY KEY (human_id, pet_id)
)

ALTER TABLE pets ENABLE ROW LEVEL SECURITY;

CREATE index ON humans_pets(human_id, pet_id);

Now we'll add a policy that allows humans to only read pets that belong to them. For this example, we’ll assume that current_user_id() is a function that returns the ID of the user we are selecting for.

In order to write this policy, we need to look at the humans_pets table. And here is where we shoot ourselves in the foot!

CREATE POLICY read_pets ON pets
    FOR SELECT
    USING (EXISTS (
        SELECT 1 FROM human_pets
        WHERE human_pets.pet_id = pets.id AND
        human_pets.human_id = current_user_id()
    ));

This policy looks correct because it is ensuring that there is a record that associates the human to the pet. This is all fine and dandy when we run SELECT * FROM pets WHERE id = 1 because we’re only selecting one pet.

But what happens when we run SELECT * FROM pets;? Postgres will compare the policy statement to every row in the table. That means the the USING statement will be executed once per row.

Oops! This is an n + 1! We now query + 1 time for n rows. Not good!

How to Write a Performant RLS Policy

Okay, lets fix this. We can rely a bit on the query planner here and write a policy that uses IN to see if the row exists in the table:

CREATE POLICY pets_self ON pets
    FOR SELECT
    USING (
        id IN (
            SELECT pet_id FROM human_pets
            WHERE human_id = current_user_id()
        )
    );

This approach is much better. It will still call the policy once per row, but at least the query planner will have cached the plan for the SELECT. Here we’re comparing the id to an array, rather than having to select each human_pet record individually.

Can We Make it Faster?

Maybe! The following code is untested, but I want to give it a try. This blog post indicates that using an array of pet_ids is fast. That’s great, we’re already doing that. There’s another trick, though. Under the hood Postgres will treat functions as prepared statements. Which should improve performance when we need to call the function again.

Here’s an example of what that might look like. First we create a function that returns pet IDs.

CREATE OR REPLACE FUNCTION humans_pets_ids(human_id)
AS $$
  SELECT pet_id
  FROM humans_pets
  WHERE (
    humans_id = humans_pets.human_id
  )
$$
LANGUAGE sql;

Then we use that function in our policy.

CREATE POLICY pets_self ON pets
    FOR SELECT
    USING (
        id IN(
            SELECT pet_id FROM humans_pets_ids(
                current_user_id()
            )
        )
    );

In this policy, we’re using the function humans_pets_ids, which might give us a small performance boost thanks to PL/pgSQL.

Conclusion

I’m hoping to have time this week to benchmark gese policies and get some explain plans. I’m really curious to know if the function approach is any faster and I’d like to see explain plans for the first two policies. I’ll write a followup post soon!