Routing rules¶
Trino Gateway includes a routing rules engine.
By default, Trino Gateway reads the X-Trino-Routing-Group
request header to
route requests. If this header is not specified, requests are sent to the
default routing group called adhoc
.
The routing rules engine feature enables you to either write custom logic to route requests based on the request info such as any of the request headers, or set a URL address to make an HTTP POST request and route based on the returned result.
Routing rules are defined in a configuration file or implemented in separate, custom service application. The connection to the separate service is configured as a URL. It can implement any dynamic rule changes or other behavior.
Enabling the routing rules engine¶
To enable the routing rules engine, find the following lines in
gateway-ha-config.yml
:
- Set
rulesEngineEnabled
totrue
, thenrulesType
asFILE
orEXTERNAL
. - If you set
rulesType: FILE
, then setrulesConfigPath
to the path to your rules config file. - The rules file will be re-read every minute by default. You may change this by setting
rulesRefreshPeriod: Duration
, where duration is an airlift style Duration such as30s
. - If you set
rulesType: EXTERNAL
, setrulesExternalConfiguration
to the URL of an external service for routing rules processing. rulesType
is by defaultFILE
unless specified.
routingRules:
rulesEngineEnabled: true
rulesType: FILE
rulesConfigPath: "app/config/routing_rules.yml" # replace with actual path to your rules config file
rulesExternalConfiguration:
urlPath: https://router.example.com/gateway-rules # replace with your own API path
excludeHeaders:
- 'Authorization'
- 'Accept-Encoding'
- Redirect URLs are not supported.
- Optionally, add headers to the
excludeHeaders
list to exclude requests with corresponding header values from being sent in the POST request. - Check headers to exclude when making API requests, specifics depend on the network configuration.
If there is error parsing the routing rules configuration file, an error is
logged, and requests are routed using the routing group header
X-Trino-Routing-Group
as default.
Configuring API requests with HTTP client config¶
You can configure the HTTP client by adding the following configuration to
the serverConfig:
section with the router
prefix.
Please refer to the Trino HTTP client properties documentation for more.
Use an external service for routing rules¶
You can use an external service for processing your routing by setting the
rulesType
to EXTERNAL
and configuring the rulesExternalConfiguration
.
Trino Gateway then sends all headers, other than those specified in
excludeHeaders
, as a map in the body of a POST request to the external
service. If requestAnalyzerConfig.analyzeRequest
is set to true
,
TrinoRequestUser
and TrinoQueryProperties
are also included.
Additionally, the following HTTP information is included:
remoteUser
method
requestURI
queryString
session
remoteAddr
remoteHost
parameterMap
The external service can process the information in any way desired and must return a result with the following criteria:
- Response status code of OK (200)
- Message in JSON format
- Only one group can be returned
- If errors is not null, then query would route to default routing group adhoc
Configure routing rules with a file¶
Rules consist of a name, description, condition, and list
of actions. If the condition of a particular rule evaluates to true
, its
actions are fired. Rules are stored as a
multi-document YAML file.
---
name: "airflow"
description: "if query from airflow, route to etl group"
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
- 'result.put("routingGroup", "etl")'
---
name: "airflow special"
description: "if query from airflow with special label, route to etl-special group"
condition: 'request.getHeader("X-Trino-Source") == "airflow" && request.getHeader("X-Trino-Client-Tags") contains "label=special"'
actions:
- 'result.put("routingGroup", "etl-special")'
Three objects are available by default. They are
* request
, the incoming request as an HttpServletRequest
* state
, a HashMap<String, Object>
that allows passing arbitrary state from one rule evaluation to the next
* result
, a HashMap<String, String>
that is used to return the result of rule evaluation to the engine
In addition to the default objects, rules may optionally utilize
trinoRequestUser and
trinoQueryProperties
, which provide information about the user and query respectively.
You must include an action of the form result.put(\"routingGroup\", \"foo\")
to trigger routing of a request that satisfies the condition to the specific
routing group. Without this action, the default adhoc group is used and the
whole routing rule is redundant.
The condition and actions are written in MVEL,
an expression language with Java-like syntax. Classes from java.util
, data-type
classes from java.lang
such as Integer
and String
, as well as java.lang.Math
and java.lang.StrictMath
are available in rules. Rules may not use java.lang.System
and other classes that allow access the Java runtime and operating system.
In most cases, you can write
conditions and actions in Java syntax and expect it to work. One exception is
parametrized types. Exclude type parameters, for example to add a HashSet
to the
state
variable, use an action such as:
new HashSet<Object>()
.
There are some
MVEL-specific operators. For example, instead of doing a null-check before
accessing the String.contains
method like this:
condition: 'request.getHeader("X-Trino-Client-Tags") != null && request.getHeader("X-Trino-Client-Tags").contains("label=foo")'
You can use the contains
operator
If no rules match, then the request is routed to the default adhoc
routing
group.
TrinoStatus¶
The TrinoStatus
class attempts to track the current state of the configured
Trino clusters. The three possible states of these cluster are updated with
every healthcheck:
PENDING
: A Trino cluster shows this state when it is still starting up. It is treated as unhealthy byRoutingManager
, and therefore requests are not be routed to these clusters.HEALTHY
: A Trino cluster shows this state when healthchecks report the cluster as healthy and ready.RoutingManager
only routes requests to healthy clusters.UNHEALTHY
: A Trino cluster shows this state when healthchecks report the cluster as unhealthy.RoutingManager
does not route requests to unhealthy clusters.
TrinoRequestUser¶
The TrinoRequestUser
class attempts to extract user information from a
request, in the following order:
X-Trino-User
header.Authorization: Basic
header.Authorization: Bearer
header. Requires configuring an OAuth2 User Info URL.Trino-UI-Token
or__Secure-Trino-ID-Token
cookie.
Kerberos and Certificate authentication are not currently supported. If the
request contains the Authorization: Bearer
header, an attempt is made to treat
the token as a JWT and deserialize it. If this is successful, the value of the
claim named in requestAnalyzerConfig.tokenUserField
is used as the username.
By default, this is the email
claim. If the token is not a valid JWT, and
requestAnalyzerConfig.oauthTokenInfoUrl
is configured, then the token is
exchanged with the Info URL. Responses are cached for 10 minutes to avoid
triggering rate limits.
You may call trinoRequestUser.getUser()
and trinoRequestUser.getUserInfo()
in your routing rules. If user information was not successfully extracted,
trinoRequestUser.getUser()
returns an empty Optional
.
trinoRequestUser.getUserInfo()
returns an Optional<UserInfo>
, with an
OpenID Connect UserInfo
if a token is successfully exchanged with the oauthTokenInfoUrl
, and an empty
Optional
otherwise.
trinoRequestUser.userExistsAndEquals("usernameToTest")
can be used to check a
username against the extracted user. It returns false
if a user has not been
extracted.
User extraction is only available if enabled by configuring
requestAnalyzerConfig.analyzeRequest = True
TrinoQueryProperties¶
The TrinoQueryProperties
class attempts to parse the body of a request to
determine the SQL statement and other information. Note that only a
syntactic analysis is performed.
If a query references a view, then that view is not expanded, and tables
referenced by the view are not recognized. Views and materialized views are
treated as tables and added to the list of tables in all contexts, including
statements such as CREATE VIEW
.
A routing rule can call the following methods on the trinoQueryProperties
object:
String errorMessage()
: the error message, only if there was any error while creating thetrinoQueryProperties
object.boolean isNewQuerySubmission()
: boolean flag to indicate if the request is a POST to thev1/statement
query endpoint.String getQueryType()
: the class name of theStatement
, e.g.ShowCreate
. Note that these are not mapped to theResourceGroup
query types. For a full list of potential query types, see the classes in STATEMENT_QUERY_TYPESString getResourceGroupQueryType()
: the Resource Group query type, for exampleSELECT
,DATA_DEFINITION
. For a full list see queryType in the Trino documentationString getDefaultCatalog()
: the default catalog, if set. It may or may not be referenced in the actual SQLString getDefaultSchema()
: the default schema, if set. It may or may not be referenced in the actual SQLSet<String> getCatalogs()
: the set of catalogs used in the query. Includes the default catalog if used by a non-fully qualified table referenceSet<String> getSchemas()
: the set of schemas used in the query. Includes the default schema if used by a non-fully qualified table referenceSet<String> getCatalogSchemas()
the set of qualified schemas used in the query, in the formcatalog.schema
boolean tablesContains(String testName)
returns true if the query contains a reference to the tabletestName
.testName
should be fully qualified, for exampletestcat.testschema.testtable
Set<QualifiedName> getTables()
: the set of tables used in the query. These are fully qualified, any partially qualified table reference in the SQL will be qualified by the default catalog and schema.String getBody()
: the raw request body
Configuration¶
The trinoQueryProperties
are configured under the requestAnalyzerConfig
configuration node.
analyzeRequest
:
Set to True
to make trinoQueryProperties
and trinoRequestUser
available.
maxBodySize
:
By default, the max body size is 1,000,000 characters. This can be modified by
configuring maxBodySize
. If the request body is greater or equal to this
limit, Trino Gateway does not process the query. A buffer of length
maxBodySize
is allocated per query. Reduce this value if you observe
excessive garbage collection at runtime. maxBodySize
cannot be set to values
larger than 2**31-1, the maximum size of a Java String.
isClientsUseV2Format
:
Some commercial extensions to Trino use the V2 Request Structure
V2 style request structure.
Support for V2-style requests can be enabled by setting this property to true
.
If you use a commercial version of Trino, ask your vendor how to set this
configuration.
tokenUserField
:
When extracting the user from a JWT token, this field is used as the username.
By default, the email
claim is used.
oauthTokenInfoUrl
:
If configured, Trino Gateway attempts to retrieve the user info by exchanging potential authorization tokens with this URL. Responses are cached for 10 minutes to avoid triggering rate limits.
Execution of rules¶
All rules whose conditions are satisfied fire. For example, in the "airflow"
and "airflow special" example rules from the following rule priority section, a
query with source airflow
and label special
satisfies both rules. The
routingGroup
is set to etl
and then to etl-special
because of the order in
which the rules of defined. If you swap the order of the rules, then you get
etl
instead.
You can avoid this ordering issue by writing atomic rules, so any query matches exactly one rule. For example you can change the first rule to the following:
---
name: "airflow"
description: "if query from airflow, route to etl group"
condition: 'request.getHeader("X-Trino-Source") == "airflow" && request.getHeader("X-Trino-Client-Tags") == null'
actions:
- 'result.put("routingGroup", "etl")'
---
This can difficult to maintain with more rules. To have better control over the execution of rules, we can use rule priorities. Overall, priorities and other constructs that MVEL support allows you to express your routing logic.
Rule priority¶
You can assign an integer value priority
to a rule. The lower this integer is,
the earlier it fires. If the priority is not specified, the priority defaults to
INT_MAX
. Following is an example with priorities:
---
name: "airflow"
description: "if query from airflow, route to etl group"
priority: 0
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
- 'result.put("routingGroup", "etl")'
---
name: "airflow special"
description: "if query from airflow with special label, route to etl-special group"
priority: 1
condition: 'request.getHeader("X-Trino-Source") == "airflow" && request.getHeader("X-Trino-Client-Tags") contains "label=special"'
actions:
- 'result.put("routingGroup", "etl-special")'
Note that both rules still fire. The difference is that you are guaranteed
that the first rule (priority 0) is fired before the second rule (priority 1).
Thus routingGroup
is set to etl
and then to etl-special
, so the
routingGroup
is always etl-special
in the end.
More specific rules must be set to a higher priority so they are evaluated last
to set a routingGroup
.
Passing State¶
The state
object may be used to pass information from one rule evaluation to
the next. This allows an author to avoid duplicating logic in multiple rules.
Priority should be used to ensure that state
is updated before being used
in downstream rules. For example, the atomic rules may be re-implemented as
---
name: "initialize state"
description: "Add a set to the state map to track rules that have evaluated to true"
priority: 0
condition: "true"
actions:
- |
state.put("triggeredRules",new HashSet())
# MVEL does not support type parameters! HashSet<String>() would result in an error.
---
name: "airflow"
description: "if query from airflow, route to etl group"
priority: 1
condition: |
request.getHeader("X-Trino-Source") == "airflow"
actions:
- |
result.put("routingGroup", "etl")
- |
state.get("triggeredRules").add("airflow")
---
name: "airflow special"
description: "if query from airflow with special label, route to etl-special group"
priority: 2
condition: |
state.get("triggeredRules").contains("airflow") && request.getHeader("X-Trino-Client-Tags") contains "label=special"
actions:
- |
result.put("routingGroup", "etl-special")
If statements (MVEL Flow Control)¶
You can use MVEL support for if
statements and other flow control. The following logic
in pseudocode:
if source == "airflow":
if clientTags["label"] == "foo":
return "etl-foo"
else if clientTags["label"] = "bar":
return "etl-bar"
else
return "etl"
This logic Can be implemented with the following rules:
---
name: "airflow rules"
description: "if query from airflow"
condition: "request.getHeader(\"X-Trino-Source\") == \"airflow\""
actions:
- "if (request.getHeader(\"X-Trino-Client-Tags\") contains \"label=foo\") {
result.put(\"routingGroup\", \"etl-foo\")
}
else if (request.getHeader(\"X-Trino-Client-Tags\") contains \"label=bar\") {
result.put(\"routingGroup\", \"etl-bar\")
}
else {
result.put(\"routingGroup\", \"etl\")
}"