DataX
Introduction
DataX is an offline data synchronization tool/platform widely used within Alibaba Group. It provides efficient data synchronization between a wide range of heterogeneous data sources, including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS.
As a data synchronization framework, DataX abstracts the synchronization between different data sources into Reader plugins, which read data from the source, and Writer plugins, which write data to the target. In theory, the DataX framework can therefore synchronize data between arbitrary data source types. The plugin system also works as an ecosystem: every newly added data source can immediately exchange data with all existing ones.
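Every synchronization task is described by a JSON job file that plugs one Reader and one Writer into this framework. The skeleton below is a minimal sketch of that structure; the plugin names are placeholders, and concrete jobs follow in the test cases later in this article:
{
  "job": {
    "setting": {"speed": {"channel": 1}},
    "content": [{
      "reader": {"name": "<readerPluginName>", "parameter": {}},
      "writer": {"name": "<writerPluginName>", "parameter": {}}
    }]
  }
}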
Usage
The data sources supported by DataX are listed at: https://github.com/alibaba/DataX/blob/master/introduction.md
DataX Installation
1. Upload the package to the server.
2. Extract it: tar -zxvf datax.tar.gz
3. Configure the environment variables: vim /etc/profile, add the DataX entries (see the sketch below), then run source /etc/profile.
Note: DataX runs on Python 2; executing it with Python 3 will report errors.
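The profile entries themselves are not shown in the original. Assuming DataX was extracted to /usr/local/soft/datax (a hypothetical path), they could look like this:
export DATAX_HOME=/usr/local/soft/datax   # hypothetical install path, adjust to your environment
export PATH=$PATH:$DATAX_HOME/bin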
DataX Usage
Test Cases
Stream2Stream
stream2stream.json
{"job": {
  "content": [{
    "reader": {"name": "streamreader", "parameter": {"sliceRecordCount": 10, "column": [{"type": "long", "value": 10}, {"type": "string", "value": "hello你好世界-DataX"}]}},
    "writer": {"name": "streamwriter", "parameter": {"encoding": "UTF-8", "print": true}}
  }],
  "setting": {"speed": {"channel": 5}}
}}
channel: the number of concurrent channels.
"print": true: print the generated records to the console.
"sliceRecordCount": 10: the number of test records each channel generates.
Run datax.py stream2stream.json: the result shows fifty records, because channel is 5 and sliceRecordCount is 10 (5 × 10 = 50).
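Assuming DataX is installed under /usr/local/soft/datax (a hypothetical path), the job can be launched like this:
python /usr/local/soft/datax/bin/datax.py stream2stream.json   # datax.py sits in the bin directory of the DataX install; use the Python 2 interpreter, per the note above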
Mysql2Mysql.json
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 0, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "mysqlreader", "parameter": {"username": "root", "password": "123456", "column": ["id", "name", "age", "gender", "clazz", "last_mod"], "where": "clazz like '文科%'", "splitPk": "id", "connection": [{"table": ["student"], "jdbcUrl": ["jdbc:mysql://master:3306/data?useSSL=false"]}]}},
    "writer": {"name": "mysqlwriter", "parameter": {"writeMode": "insert", "username": "root", "password": "123456", "column": ["id", "name", "age", "gender", "clazz", "last_mod"], "connection": [{"jdbcUrl": "jdbc:mysql://master:3306/data?useSSL=false&useUnicode=true&characterEncoding=utf-8", "table": ["datax_stu"]}]}}
  }]
}}
writeMode options: insert performs plain inserts; update updates rows according to the target table's primary key; replace overwrites rows that conflict on the primary key (see the fragment below).
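For example, to make the mysqlwriter above overwrite existing rows instead of inserting, only the writeMode field changes; the remaining writer parameters (username, password, column, connection) stay as in the job above and are omitted here:
"writer": {"name": "mysqlwriter", "parameter": {"writeMode": "replace"}}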
Mysql2Mysql-querySql
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 0, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "mysqlreader", "parameter": {"username": "root", "password": "123456", "splitPk": "id", "connection": [{"querySql": ["select t2.*, t1.total_score FROM (SELECT student_id, sum(score) as total_score from score GROUP BY student_id) t1 join student t2 on t1.student_id = t2.id"], "jdbcUrl": ["jdbc:mysql://master:3306/student?useSSL=false"]}]}},
    "writer": {"name": "mysqlwriter", "parameter": {"writeMode": "insert", "username": "root", "password": "123456", "column": ["id", "name", "age", "gender", "clazz", "last_mod", "total_score"], "connection": [{"jdbcUrl": "jdbc:mysql://master:3306/data?useSSL=false&useUnicode=true&characterEncoding=utf-8", "table": ["score_student"]}]}}
  }]
}}
Mysql2HDFS
mysql2hdfs
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 0, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "mysqlreader", "parameter": {"username": "root", "password": "123456", "splitPk": "id", "connection": [{"querySql": ["select t2.*, t1.total_score FROM (SELECT student_id, sum(score) as total_score from score GROUP BY student_id) t1 join student t2 on t1.student_id = t2.id"], "jdbcUrl": ["jdbc:mysql://master:3306/student?useSSL=false"]}]}},
    "writer": {"name": "hdfswriter", "parameter": {"defaultFS": "hdfs://master:9000", "fileType": "text", "path": "/user/hive/warehouse/bigdata.db/students_scores", "fileName": "students_scores", "column": [{"name": "id", "type": "STRING"}, {"name": "name", "type": "STRING"}, {"name": "age", "type": "INT"}, {"name": "gender", "type": "STRING"}, {"name": "clazz", "type": "STRING"}, {"name": "last_mod", "type": "STRING"}, {"name": "total_score", "type": "INT"}], "writeMode": "append", "fieldDelimiter": ","}}
  }]
}}
hdfs2mysql
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 0, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "hdfsreader", "parameter": {"path": "/datax/stu_score/*", "defaultFS": "hdfs://master:9000", "column": [{"index": 0, "type": "STRING"}, {"index": 1, "type": "STRING"}, {"index": 2, "type": "STRING"}, {"index": 3, "type": "STRING"}, {"index": 4, "type": "STRING"}, {"index": 5, "type": "STRING"}], "fileType": "text", "encoding": "UTF-8", "fieldDelimiter": ","}},
    "writer": {"name": "mysqlwriter", "parameter": {"writeMode": "insert", "username": "root", "password": "123456", "column": ["id", "name", "age", "gender", "clazz", "total_score"], "connection": [{"jdbcUrl": "jdbc:mysql://master:3306/datax?useSSL=false&useUnicode=true&characterEncoding=utf-8", "table": ["stu_score_2"]}]}}
  }]
}}
If a column type the reader does not accept (such as INT) is declared, DataX reports a dirty-record error like the following:
{"message":"No enum constant com.alibaba.datax.plugin.unstructuredstorage.reader.UnstructuredStorageReaderUtil.Type.INT","record":[{"byteSize":10,"index":0,"rawData":"1500100001","type":"STRING"},{"byteSize":3,"index":1,"rawData":"施笑槐","type":"STRING"}],"type":"reader"}
When reading text data from HDFS, every column can simply be declared as type String; the values are converted to the proper types automatically when they are inserted into MySQL.
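As an illustration (index 2 is an arbitrary column chosen only for this example), the contrast in the hdfsreader column list is:
{"index": 2, "type": "INT"}      fails: INT is not a type the hdfsreader text parser accepts
{"index": 2, "type": "STRING"}   works: declare the column as STRING and let MySQL convert it on insert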
Hbase2Mysql
Mysql2Hbase
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 0, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "mysqlreader", "parameter": {"username": "root", "password": "123456", "splitPk": "id", "connection": [{"querySql": ["select t2.*, t1.total_score FROM (SELECT student_id, sum(score) as total_score from score GROUP BY student_id) t1 join student t2 on t1.student_id = t2.id"], "jdbcUrl": ["jdbc:mysql://master:3306/student?useSSL=false"]}]}},
    "writer": {"name": "hbase11xwriter", "parameter": {"hbaseConfig": {"hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"}, "table": "score_stu", "mode": "normal", "rowkeyColumn": [{"index": 0, "type": "string"}, {"index": -1, "type": "string", "value": "_"}, {"index": 6, "type": "string"}], "column": [{"index": 1, "name": "info:name", "type": "string"}, {"index": 2, "name": "info:age", "type": "string"}, {"index": 3, "name": "info:gender", "type": "string"}, {"index": 4, "name": "info:clazz", "type": "string"}, {"index": 6, "name": "info:total_score", "type": "string"}], "encoding": "utf-8"}}
  }]
}}
hbase2mysql
{"job": {
  "setting": {"speed": {"channel": 3}, "errorLimit": {"record": 10, "percentage": 0.02}},
  "content": [{
    "reader": {"name": "hbase11xreader", "parameter": {"hbaseConfig": {"hbase.zookeeper.quorum": "master:2181"}, "table": "api:stu", "encoding": "utf-8", "mode": "normal", "column": [{"name": "rowkey", "type": "string"}, {"name": "info:name", "type": "string"}, {"name": "info:age", "type": "string"}, {"name": "info:gender", "type": "string"}, {"name": "info:clazz", "type": "string"}]}},
    "writer": {"name": "mysqlwriter", "parameter": {"writeMode": "insert", "username": "root", "password": "123456", "column": ["id", "name", "age", "gender", "clazz"], "connection": [{"jdbcUrl": "jdbc:mysql://master:3306/datax?useSSL=false&useUnicode=true&characterEncoding=utf-8", "table": ["stu"]}]}}
  }]
}}