Druid connector#

The Druid connector allows querying an Apache Druid database from Trino.
Requirements#
To connect to Druid, you need:
Druid version 0.18.0 or higher.
Network access from the Trino coordinator and workers to your Druid broker. Port 8082 is the default port.
Configuration#
Create a catalog properties file that specifies the Druid connector by setting
the connector.name
to druid
and configuring the connection-url
with
the JDBC string to connect to Druid.
For example, to access a database as druid
, create the file
etc/catalog/druid.properties
. Replace BROKER:8082
with the correct
host and port of your Druid broker.
connector.name=druid
connection-url=jdbc:avatica:remote:url=http://BROKER:8082/druid/v2/sql/avatica/
You can add authentication details to connect to a Druid deployment that is secured by basic authentication by updating the URL and adding credentials:
connection-url=jdbc:avatica:remote:url=http://BROKER:port/druid/v2/sql/avatica/;authentication=BASIC
connection-user=root
connection-password=secret
Now you can access your Druid database in Trino with the druiddb
catalog
name from the properties file.
General configuration properties#
The following table describes general catalog configuration properties for the connector:
Property name |
Description |
Default value |
---|---|---|
|
Support case insensitive schema and table names. |
|
|
|
|
|
Path to a name mapping configuration file in JSON format that allows Trino to disambiguate between schemas and tables with similar names in different cases. |
|
|
Frequency with which Trino checks the name matching configuration file for changes. |
|
|
Duration for which metadata, including table and column statistics, is cached. |
|
|
Cache the fact that metadata, including table and column statistics, is not available |
|
|
Maximum number of objects stored in the metadata cache |
|
|
Maximum number of statements in a batched execution. Do not change this setting from the default. Non-default values may negatively impact performance. |
|
|
Push down dynamic filters into JDBC queries |
|
|
Maximum duration for which Trino will wait for dynamic filters to be collected from the build side of joins before starting a JDBC query. Using a large timeout can potentially result in more detailed dynamic filters. However, it can also increase latency for some queries. |
|
|
Minimum size of query predicates above which Trino compacts the predicates. Pushing down a large list of predicates to the data source can compromise performance. For optimization in that situation, Trino can compact the large predicates into a simpler range predicate. If necessary, adjust the threshold to ensure a balance between performance and predicate pushdown. |
|
Procedures#
system.flush_metadata_cache()
Flush JDBC metadata caches. For example, the following system call flushes the metadata caches for all schemas in the
example
catalogUSE example.myschema; CALL system.flush_metadata_cache();
Case insensitive matching#
When case-insensitive-name-matching
is set to true
, Trino
is able to query non-lowercase schemas and tables by maintaining a mapping of
the lowercase name to the actual name in the remote system. However, if two
schemas and/or tables have names that differ only in case (such as “customers”
and “Customers”) then Trino fails to query them due to ambiguity.
In these cases, use the case-insensitive-name-matching.config-file
catalog
configuration property to specify a configuration file that maps these remote
schemas/tables to their respective Trino schemas/tables:
{
"schemas": [
{
"remoteSchema": "CaseSensitiveName",
"mapping": "case_insensitive_1"
},
{
"remoteSchema": "cASEsENSITIVEnAME",
"mapping": "case_insensitive_2"
}],
"tables": [
{
"remoteSchema": "CaseSensitiveName",
"remoteTable": "tablex",
"mapping": "table_1"
},
{
"remoteSchema": "CaseSensitiveName",
"remoteTable": "TABLEX",
"mapping": "table_2"
}]
}
Queries against one of the tables or schemes defined in the mapping
attributes are run against the corresponding remote entity. For example, a query
against tables in the case_insensitive_1
schema is forwarded to the
CaseSensitiveName schema and a query against case_insensitive_2
is forwarded
to the cASEsENSITIVEnAME
schema.
At the table mapping level, a query on case_insensitive_1.table_1
as
configured above is forwarded to CaseSensitiveName.tablex
, and a query on
case_insensitive_1.table_2
is forwarded to CaseSensitiveName.TABLEX
.
By default, when a change is made to the mapping configuration file, Trino must
be restarted to load the changes. Optionally, you can set the
case-insensitive-name-mapping.refresh-period
to have Trino refresh the
properties without requiring a restart:
case-insensitive-name-mapping.refresh-period=30s
Type mapping#
Type mapping configuration properties#
The following properties can be used to configure how data types from the connected data source are mapped to Trino data types and how the metadata is cached in Trino.
Property name |
Description |
Default value |
---|---|---|
|
Configure how unsupported column data types are handled:
The respective catalog session property is |
|
|
Allow forced mapping of comma separated lists of data types to convert to
unbounded |
SQL support#
The connector provides globally available and read operation statements to access data and metadata in the Druid database.
Table functions#
The connector provides specific table functions to access Druid.
query(varchar) -> table
#
The query
function allows you to query the underlying database directly. It
requires syntax native to Druid, because the full query is pushed down and
processed in Druid. This can be useful for accessing native features which are
not available in Trino or for improving query performance in situations where
running a query natively may be faster.
As an example, use STRING_TO_MV
and MV_LENGTH
from
Druid SQL’s multi-value string functions
to split and then count the number of comma-separated values in a column:
SELECT
num_reports
FROM
TABLE(
druid.system.query(
query => 'SELECT
MV_LENGTH(
STRING_TO_MV(direct_reports, ",")
) AS num_reports
FROM company.managers'
)
);