Open Policy Agent access control#
The Open Policy Agent access control plugin enables the use of Open Policy Agent (OPA) as authorization engine for fine-grained access control to catalogs, schemas, tables, and more in Trino. Policies are defined in OPA, and Trino checks access control privileges in OPA.
Requirements#
A running OPA deployment
Network connectivity from the Trino cluster to the OPA server
With the requirements fulfilled, you can proceed to set up Trino and OPA with your desired access control configuration.
Trino configuration#
To use only OPA for access control, create the file etc/access-control.properties
with the following minimal configuration:
access-control.name=opa
opa.policy.uri=https://opa.example.com/v1/data/trino/allow
To combine OPA access control with file-based or other access control systems,
configure multiple access control configuration file paths in
etc/config.properties
:
access-control.config-files=etc/trino/file-based.properties,etc/trino/opa.properties
Order the configuration files list in the desired order of the different systems for overall access control. Configure each access-control system in the specified files.
The following table lists the configuration properties for the OPA access control:
Name |
Description |
---|---|
|
The required URI for the OPA endpoint, for example,
|
|
The optional URI for fetching row filters - if not set no row filtering
is applied. For example, |
|
The optional URI for fetching column masks - if not set no masking
is applied. For example, |
|
The optional URI for fetching columns masks in batches; must not
be used with |
|
The optional URI for activating batch mode for certain authorization
queries where batching is applicable, for example
|
|
Configure if request details, including URI, headers and the entire body, are
logged prior to sending them to OPA. Defaults to |
|
Configure if OPA response details, including URI, status code, headers and
the entire body, are logged. Defaults to |
|
Configure if permission management operations are allowed. Find more details in
Permission management. Defaults to |
|
Optional HTTP client configurations for the connection from Trino to OPA,
for example |
Logging#
When request or response logging is enabled, details are logged at the DEBUG
level under the io.trino.plugin.opa.OpaHttpClient
logger. The Trino logging
configuration must be updated to include this class, to ensure log entries are
created.
Note that enabling these options produces very large amounts of log data.
Permission management#
The following operations are allowed or denied based on the setting of
opa.allow-permission-management
If set to true
, these operations are
allowed. If set to false
, they are denied. In both cases, no request is sent
to OPA.
GrantSchemaPrivilege
DenySchemaPrivilege
RevokeSchemaPrivilege
GrantTablePrivilege
DenyTablePrivilege
RevokeTablePrivilege
CreateRole
DropRole
GrantRoles
RevokeRoles
The setting defaults to false
due to the complexity and potential unexpected
consequences of having SQL-style grants and roles together with OPA.
You must enable permission management if another custom security system in Trino is capable of grant management and used together with OPA access control.
Additionally, users are always allowed to show information about roles (SHOW ROLES
), regardless of this setting. The following operations are always
allowed:
ShowRoles
ShowCurrentRoles
ShowRoleGrants
OPA configuration#
The OPA access control in Trino contacts OPA for each query and issues an
authorization request. OPA must return a response containing a boolean allow
field, which determines whether the operation is permitted or not.
Policies in OPA are defined with the purpose built policy language Rego. Find more information in the detailed documentation. After the initial installation and configuration in Trino, these policies are the main configuration aspect for your access control setup.
A query from the OPA access control in Trino to OPA contains a context
and an
action
as its top level fields.
The context
object contains all other contextual information about the query:
identity
: The identity of the user performing the operation, containing the following two fields:user
: usernamegroups
: list of groups this user belongs to
softwareStack
: Information about the software stack issuing the request to OPA. The following information is included:trinoVersion
: Version of Trino used
The action
object contains information about what action is performed on what
resources. The following fields are provided:
operation
: the performed operation, for exampleSelectFromColumns
.resource
: information about the accessed objectstargetResource
: information about any newly created object, if applicablegrantee
: grantee of a grant operation.
Fields that are not applicable for a specific operation are set to null.
Examples are an empty targetResource
if not modifying a table or schema or
catalog is modified, or an empty grantee
if not granting permissions is set.
Any null field is omitted altogether from the action
object.
Example requests to OPA#
Accessing a table results in a query similar to the following example:
{
"context": {
"identity": {
"user": "foo",
"groups": ["some-group"]
},
"softwareStack": {
"trinoVersion": "434"
}
},
"action": {
"operation": "SelectFromColumns",
"resource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "example_table",
"columns": [
"column1",
"column2",
"column3"
]
}
}
}
}
The targetResource
is used in cases where a new resource, distinct from the one in
resource
is created. For example, when renaming a table.
{
"context": {
"identity": {
"user": "foo",
"groups": ["some-group"]
},
"softwareStack": {
"trinoVersion": "434"
}
},
"action": {
"operation": "RenameTable",
"resource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "example_table"
}
},
"targetResource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "new_table_name"
}
}
}
}
Row filtering#
Row filtering allows Trino to remove some rows from the result before returning
it to the caller, controlling what data different users can see. The plugin
supports retrieving filter definitions from OPA by configuring the OPA endpoint
for row filter processing with opa.policy.row-filters-uri
.
For example, an OPA policy for row filtering may be defined by the following rego script:
package trino
import future.keywords.in
import future.keywords.if
import future.keywords.contains
default allow := true
table_resource := input.action.resource.table
is_admin {
input.context.identity.user == "admin"
}
rowFilters contains {"expression": "user_type <> 'customer'"} if {
not is_admin
table_resource.catalogName == "sample_catalog"
table_resource.schemaName == "sample_schema"
table_resource.tableName == "restricted_table"
}
The response expected by the plugin is an array of objects, each of them in the
format {"expression":"clause"}
. Each expression essentially behaves like an
additional WHERE
clause. The script can also return multiple row filters for a
single OPA request, and all filters are subsequently applied.
Each object may contain an identity field. The identity field allows Trino to evaluate these row filters under a different identity - such that a filter can target a column the requesting user cannot see.
Column masking#
Column masking allows Trino to obscure data in one or more columns of the result
set for specific users, without outright denying access. The plugin supports
fetching column masks from OPA by configuring the OPA endpoint for columns mask
processing with opa.policy.column-masking-uri
in the opa-plugin configuration.
For example, a policy configuring column masking may be defined by the following rego script:
package trino
import future.keywords.in
import future.keywords.if
import future.keywords.contains
default allow := true
column_resource := input.action.resource.column
is_admin {
input.context.identity.user == "admin"
}
columnMask := {"expression": "NULL"} if {
not is_admin
column_resource.catalogName == "sample_catalog"
column_resource.schemaName == "sample_schema"
column_resource.tableName == "restricted_table"
column_resource.columnName == "user_phone"
}
columnMask := {"expression": "'****' || substring(user_name, -3)"} if {
not is_admin
column_resource.catalogName == "sample_catalog"
column_resource.schemaName == "sample_schema"
column_resource.tableName == "restricted_table"
column_resource.columnName == "user_name"
}
Unlike row filtering, only a single column mask may be returned for a given column.
The same “identity” field may be returned to evaluate column masks under a different identity.
Batch column masking#
If column masking is enabled, by default, the plugin will fetch each column mask individually from OPA. When working with very wide tables this can result in a performance degradation.
Configuring opa.policy.batch-column-masking-uri
allows Trino to fetch the masks
for multiple columns in a single request. The list of requested columns is included
in the request under action.filterResources
.
If opa.policy.batch-column-masking-uri
is set it overrides the value of
opa.policy.column-masking-uri
so that the plugin uses batch column
masking.
An OPA policy supporting batch column masking must return a list of objects, each containing the following data:
viewExpression
:expression
: the expression to apply to the column, as a stringidentity
(optional): the identity to evaluate the expression as, as a string
index
: a reference the index of the column in the request to which this mask applies
For example, a policy configuring batch column masking may be defined by the following rego script:
package trino
import future.keywords.in
import future.keywords.if
import future.keywords.contains
default allow := true
batchColumnMasks contains {
"index": i,
"viewExpression": {
"expression": "NULL"
}
} if {
some i
column_resource := input.action.filterResources[i]
column_resource.catalogName == "sample_catalog"
column_resource.schemaName == "sample_schema"
column_resource.tableName == "restricted_table"
column_resource.columnName == "user_phone"
}
batchColumnMasks contains {
"index": i,
"viewExpression": {
"expression": "'****' || substring(user_name, -3)",
"identity": "admin"
}
} if {
some i
column_resource := input.action.filterResources[i]
column_resource.catalogName == "sample_catalog"
column_resource.schemaName == "sample_schema"
column_resource.tableName == "restricted_table"
column_resource.columnName == "user_name"
}
A batch column masking request is similar to the following example:
{
"context": {
"identity": {
"user": "foo",
"groups": ["some-group"]
},
"softwareStack": {
"trinoVersion": "434"
}
},
"action": {
"operation": "GetColumnMask",
"filterResources": [
{
"column": {
"catalogName": "sample_catalog",
"schemaName": "sample_schema",
"tableName": "restricted_table",
"columnName": "user_phone",
"columnType": "VARCHAR"
}
},
{
"column": {
"catalogName": "sample_catalog",
"schemaName": "sample_schema",
"tableName": "restricted_table",
"columnName": "user_name",
"columnType": "VARCHAR"
}
}
]
}
}
The related OPA response is displayed in the following snippet:
[
{
"index": 0,
"viewExpression": {
"expression": "NULL"
}
},
{
"index": 1,
"viewExpression": {
"expression": "'****' || substring(user_name, -3)",
"identity": "admin"
}
}
]
Batch mode#
A very powerful feature provided by OPA is its ability to respond to
authorization queries with more complex answers than a true
or false
boolean
value.
Many features in Trino require filtering to determine to which resources a user is granted access. These resources are catalogs, schema, queries, views, and others objects.
If opa.policy.batched-uri
is not configured, Trino sends one request to OPA
for each object, and then creates a filtered list of permitted objects.
Configuring opa.policy.batched-uri
allows Trino to send a request to
the batch endpoint, with a list of resources in one request using the
under action.filterResources
node.
All other fields in the request are identical to the non-batch endpoint.
An OPA policy supporting batch operations must return a list containing the
indices of the items for which authorization is granted. Returning a null
value or an empty list is equivalent and denies any access.
You can add batching support for policies that do not support it:
package foo
import future.keywords.contains
# ... rest of the policy ...
# this assumes the non-batch response field is called "allow"
batch contains i {
some i
raw_resource := input.action.filterResources[i]
allow with input.action.resource as raw_resource
}
# Corner case: filtering columns is done with a single table item, and many columns inside
# We cannot use our normal logic in other parts of the policy as they are based on sets
# and we need to retain order
batch contains i {
some i
input.action.operation == "FilterColumns"
count(input.action.filterResources) == 1
raw_resource := input.action.filterResources[0]
count(raw_resource["table"]["columns"]) > 0
new_resources := [
object.union(raw_resource, {"table": {"column": column_name}})
| column_name := raw_resource["table"]["columns"][_]
]
allow with input.action.resource as new_resources[i]
}