Clickhouse Dictionaries - Sources of External Dictionaries
Clickhouse允许从不同的源构造外部字典,配置文件通常像这样:
1 | <yandex> |
数据源则是通过source
项进行配置
其中支持的数据源的类型有(source_type):
- Local file
- Executable file
- HTTP(s)
- DBMS
- MySQL
- ClickHouse
- MongoDB
- ODBC
Local file 本地文件
配置
1 | <source> |
- path - 本地文件的绝对路径
- format - The file format. All the formats described in “Formats” are supported
Executable File
ClickHouse通过向可执行文件的STDIN发送命令来获得所需的内容
1 | <source> |
- command: 可执行文件的绝对路径,或者文件名(可执行命令的目录已添加到PATH)
- format - The file format. All the formats described in “Formats” are supported
HTTP(s)
ClickHouse通过发送POST
请求来获得所需的内容
配置样例:
1 | <source> |
如果需要支持https,需要在服务端配置文件中配置好ssl 相关设置,具体参考
- url – The source URL.
- format – The file format. All the formats described in “Formats” are supported.
ODBC
通过ODBC驱动连接任意数据库表
1 | <odbc> |
- db – 数据库名称. 如果在
<connection_string>
中,则可以忽略. - table – Name of the table and schema if exists.
- connection_string – Connection string.
- invalidate_query – 用于检测字典状态的QuerySQL. 可选参数. Read more in the section Updating dictionaries.
ClickHouse从ODBC驱动程序接收引用符号并将查询中的所有设置引用到驱动程序,因此必须根据数据库中的表名称大小写相应地设置表名。
注意:已知ODBC字典功能的漏洞
通过ODBC驱动程序连接参数连接到数据库时,可以替换Servername。 在这种情况下,来自odbc.ini的USERNAME和PASSWORD值将被发送到远程服务器并可能被泄露。
/etc/odbc.ini 示例(PostgreSQL)
1 | [gregtest] |
If you then make a query such as
1 | SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db'); |
ODBC driver will send values of USERNAME
and PASSWORD
from odbc.ini
to some-server.com
.
Example of Connecting PostgreSQL
一. 安装 unixODBC 和 the ODBC driver for PostgreSQL:
1 | sudo apt-get install -y unixodbc odbcinst odbc-postgresql |
二. 配置 /etc/odbc.ini (or ~/.odbc.ini):
1 | [DEFAULT] |
You may need to edit odbc.ini
to specify the full path to the library with the driver DRIVER=/usr/local/lib/psqlodbcw.so
.
三. 在clickhouse中配置字典项
1 | <yandex> |
Example of Connecting MS SQL Server
- 安装驱动
- 配置驱动
- 设置字典配置
DBMS
MySQL
1 | <source> |
参数项:
- port – The port on the MySQL server. You can specify it for all replicas, or for each one individually (inside
). - user – Name of the MySQL user. You can specify it for all replicas, or for each one individually (inside
). - password – Password of the MySQL user. You can specify it for all replicas, or for each one individually (inside
). - replica – Section of replica configurations. There can be multiple sections.
- replica/host – The MySQL host.
- replica/priority – The replica priority. When attempting to connect, ClickHouse traverses the replicas in order of priority. The lower the number, the higher the priority.
- db – Name of the database.
- table – Name of the table.
- where – The selection criteria. Optional parameter.
- invalidate_query – Query for checking the dictionary status. Optional parameter. Read more in the section Updating dictionaries.
MySQL can be connected on a local host via sockets. To do this, set host and socket
1 | <source> |
ClickHouse
1 | <source> |
- host – The ClickHouse host. If it is a local host, the query is processed without any network activity. To improve fault tolerance, you can create a Distributed table and enter it in subsequent configurations.
- port – The port on the ClickHouse server.
- user – Name of the ClickHouse user.
- password – Password of the ClickHouse user.
- db – Name of the database.
- table – Name of the table.
- where – The selection criteria. May be omitted.
- invalidate_query – Query for checking the dictionary status. Optional parameter. Read more in the section Updating dictionaries.
MongoDB
配置举例:
1 | <source> |
参数项
- host – The MongoDB host.
- port – The port on the MongoDB server.
- user – Name of the MongoDB user.
- password – Password of the MongoDB user.
- db – Name of the database.
- collection – Name of the collection.