Multi-client capability with PostgreSQL
In an application with several clients, it must be ensured that the data of one client is not displayed for others or can be changed by the actions of others. This article describes a solution for PostgreSQL databases.
Anyone who has ever had to develop a multi-client capability (or “multi-tenancy”) in a project will know that there are many different solutions for this.
Usually the following three approaches are spoken of:
1. one database per client
There is a separate database for each client. The application must ensure communication with the correct database for a query.
2. a shared database with one schema per client
There is only one database for all clients, but each client has its own database schema. Each client also has its own database user for database queries. The application has to take care of establishing or using a database connection with the right user.
3. a shared database with a shared schema for all clients
There is only one database with only one schema containing the data of all clients. To be able to assign the data to the clients, the data records are extended with an identifier.
The third variant has two major disadvantages. The data is not isolated from each other and database queries must always be extended with the client identifier by the backend developer so that only the data of the desired client is read during a query.
Row-Level-Security
Since version 9.5 PostgreSQL offers the function Row-Level-Security (short: RLS). This avoids the two disadvantages mentioned above. RLS can be used to create so-called policies on tables. With these policies, access to the table can be restricted both during reading and writing.
Here is an example:
In an employee administration, the table “employees” contains the employees of each client.
The following policy is now created on this table:
CREATE POLICY tenant_policy ON employees
USING (tenant_id = 1)
WITH CHECK (tenant_id = 1)
This means that only employees with client ID 1 can be read and inserted. Such a policy would of course not apply in practice. Here it is merely a matter of clarifying the functionality of the policies.
The “USING” part determines which data records are taken into account when reading and deleting and the “WITH CHECK” part determines which condition must be fulfilled in order to create or change a new data record.
With this policy, the following query:
SELECT * FROM employees
show this result:
However, row-level security must first be activated for each table individually. The following command activates RLS on a table:
ALTER TABLE employees ENABLE ROW LEVEL SECURITY;
Database user
For this procedure to work, there must be at least two database users. One admin user and one application user. The admin user must create the entire database schema, policies, and functions. The application always connects to the application user.
Assigning a Client ID to a Database Connection
As already mentioned, the example policy is not practical because it only works for the client with ID 1. The policy must work for each client.
First of all, there must be a way to determine the client ID for the current database connection. This requires a table in which the client ID for the session ID can be stored.
The application user must not have read or write access to this table.
To enter the client ID for the current session, the following function is created by the admin user:
CREATE FUNCTION set_tenant_id(p_tenant_id INTEGER, pwd TEXT) RETURNS text AS $$
DECLARE
v_pwd TEXT;
BEGIN
SELECT set_tenant_id_password INTO v_pwd FROM secrets WHERE set_tenant_id_password = crypt(pwd, set_tenant_id_password);
IF v_pwd IS NULL THEN
RAISE EXCEPTION ‘invalid password’;
END IF;
INSERT INTO public.pid_tenant_id VALUES (pg_backend_pid(), p_tenant_id) ON CONFLICT (pid) DO UPDATE SET tenant_id = p_tenant_id;
RETURN p_tenant_id;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER VOLATILE;
The function sets the transferred client ID to the table “pid_tenant_id”. This function is also password protected. This ensures that the client ID cannot be changed by SQL injection. The password is stored here in a table called “secrets” as a hash value.
This function is called in a secure area in the backend. Ideally at the point where the database connection is taken from a connection pool for the current request.
To reset the client ID of the session after use, the function is called with “-1” shortly before the database connection is released.
Using the Client ID of the Session in the Policy
Now another function is required to read the client ID of the current session again:
CREATE FUNCTION get_tenant_id() RETURNS integer AS $$
DECLARE
v_tenant_id INT;
BEGIN
SELECT tenant_id INTO v_tenant_id FROM public.pid_tenant_id WHERE pid = pg_backend_pid();
RETURN v_tenant_id;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STABLE;
The policy can now be extended with the „get_tenant_id“ function.
CREATE POLICY tenant_policy ON employees
USING (tenant_id = get_tenant_id())
WITH CHECK (tenant_id = get_tenant_id())
This means that a session can only read, change, create and delete the data records with the assigned client ID.
Conclusion
RLS offers an elegant solution to securely separate the data of multiple clients on a database without having to extend the database queries in the backend. RLS ensures that the data of one client can’t be read or changed by another client via SQL injection or programming errors.
If it is therefore not possible to set up a separate database for each client, RLS is the lean alternative.
Comments
No Comments