赞
踩
Apache Doris 代码仓库地址:apache/incubator-doris 欢迎大家关注加星
Binlog Load提供了一种使Doris增量同步用户在Mysql数据库的对数据更新操作的CDC(Change Data Capture)功能,使用户更方面的完成Mysql数据的导入
注意:
该功能需要在0.15及以后的版本里使用
下载canal-1.1.5: https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
- #################################################
- ## mysql serverId , v1.0.26+ will autoGen
- canal.instance.mysql.slaveId=12115
-
- # enable gtid use true/false
- canal.instance.gtidon=false
-
- # position info
- canal.instance.master.address=10.220.146.11:3306
- canal.instance.master.journal.name=
- canal.instance.master.position=
- canal.instance.master.timestamp=
- canal.instance.master.gtid=
-
- # rds oss binlog
- canal.instance.rds.accesskey=
- canal.instance.rds.secretkey=
- canal.instance.rds.instanceId=
-
- # table meta tsdb info
- canal.instance.tsdb.enable=true
- #canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
- #canal.instance.tsdb.dbUsername=canal
- #canal.instance.tsdb.dbPassword=canal
-
- #canal.instance.standby.address =
- #canal.instance.standby.journal.name =
- #canal.instance.standby.position =
- #canal.instance.standby.timestamp =
- #canal.instance.standby.gtid=
-
- # username/password
- canal.instance.dbUsername=zhangfeng
- canal.instance.dbPassword=zhangfeng800729)(*Q
- canal.instance.connectionCharset = UTF-8
- # enable druid Decrypt database password
- canal.instance.enableDruid=false
- #canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==
-
- # table regex
- canal.instance.filter.regex=demo\\..*
- # table black regex
- canal.instance.filter.black.regex=
- # table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
- #canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
- # table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
- #canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch
-
- # mq config
- #canal.mq.topic=
- # dynamic topic route by schema or table regex
- #canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
- #canal.mq.partition=0
- # hash partition config
- #canal.mq.partitionsNum=3
- #canal.mq.partitionHash=test.table:id^name,.*\\..*
- #################################################
注意:canal instance user/passwd
1.1.5 版本,在canal.properties里加上这两个配置
canal.user = canal canal.passwd = E3619321C1A937C46A0D8BD1DAC39F93B27D4458
默认密码为canal/canal,canal.passwd的密码值可以通过select password("xxx") 来获取
tail -200f logs/demo/demo.log
用户需要先在Doris端创建好与Mysql端对应的目标表
Binlog Load只能支持Unique类型的目标表,且必须激活目标表的Batch Delete功能。
开启Batch Delete的方法可以参考help alter table
中的批量删除功能。
- CREATE TABLE `doris_mysql_binlog_demo` (
- `id` int NOT NULL,
- `sex` TINYINT(1),
- `name` varchar(20),
- `address` varchar(255)
- ) ENGINE=OLAP
- UNIQUE KEY(`id`,sex)
- COMMENT "OLAP"
- DISTRIBUTED BY HASH(`sex`) BUCKETS 1
- PROPERTIES (
- "replication_allocation" = "tag.location.default: 3",
- "in_memory" = "false",
- "storage_format" = "V2"
- );
-
- -- enable batch delete
- ALTER TABLE test_2.doris_mysql_binlog_demo ENABLE FEATURE "BATCH_DELETE";
Name: 'CREATE SYNC JOB' Description:
数据同步(Sync Job)功能,支持用户提交一个常驻的数据同步作业,通过从指定的远端地址读取Binlog日志,增量同步用户在Mysql数据库的对数据更新操作的CDC(Change Data Capture)功能。 目前数据同步作业只支持对接Canal,从Canal Server上获取解析好的Binlog数据,导入到Doris内。 用户可通过 SHOW SYNC JOB
查看数据同步作业状态。 语法:
- CREATE SYNC [db.]job_name
- (
- channel_desc,
- channel_desc
- ...
- )
- binlog_desc
job_name
job_name
的作业只能有一个在运行。channel_desc
mysql_db.src_tbl
des_tbl
partitions
column_mapping
binlog_desc
canal.
为前缀
test_db
的 test_tbl
创建一个名为 job1
的数据同步作业,连接本地的Canal服务器,对应Mysql源表 mysql_db1.tbl1
。test_db
的多张表创建一个名为 job1
的数据同步作业,一一对应多张Mysql源表,并显式的指定列映射。注意:
创建同步任务之前,首先要在fe.conf里配置enable_create_sync_job=true,这个默认是false不启用,否则就不能创建同步任务
- CREATE SYNC test_2.doris_mysql_binlog_demo_job
- (
- FROM demo.test_cdc INTO doris_mysql_binlog_demo
- )
- FROM BINLOG
- (
- "type" = "canal",
- "canal.server.ip" = "10.220.146.10",
- "canal.server.port" = "11111",
- "canal.destination" = "demo",
- "canal.username" = "canal",
- "canal.password" = "canal"
- );
SHOW SYNC JOB from test_2;
select * from doris_mysql_binlog_demo;
我们在Mysql 数据表里删除数据,然后看Doris表里的变化
delete from test_cdc where id in (12,13)
我们在去看Doris表里,id是12,13这两条数据已经被删除
多表同步只需要像下面这样写法就可以了
- CREATE SYNC test_2.doris_mysql_binlog_demo_job
- (
- FROM demo.test_cdc INTO doris_mysql_binlog_demo,
- FROM demo.test_cdc_1 INTO doris_mysql_binlog_demo,
- FROM demo.test_cdc_2 INTO doris_mysql_binlog_demo,
- FROM demo.test_cdc_3 INTO doris_mysql_binlog_demo
- )
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。