A Python script for incrementally syncing a specified MySQL table to ClickHouse
The following Python script incrementally syncs a specified MySQL table to ClickHouse by reading the MySQL binlog:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import configparser
import os
import sys

import clickhouse_driver
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import (
    DeleteRowsEvent,
    UpdateRowsEvent,
    WriteRowsEvent,
)

configfile = 'repl.ini'


########## repl.ini: create ##########
def create_configfile(configfile, log_file, log_pos):
    config = configparser.ConfigParser()
    if not os.path.exists(configfile):
        config['replinfo'] = {'log_file': log_file, 'log_pos': str(log_pos)}
        with open(configfile, 'w+') as f:
            config.write(f)


########## repl.ini: write ##########
def write_config(configfile, log_file, log_pos):
    # Create the file on first use; the original called create_configfile()
    # without the log_file and log_pos arguments.
    if not os.path.exists(configfile):
        create_configfile(configfile, log_file, log_pos)
        return
    config = configparser.ConfigParser()
    config.read(configfile)
    config.set('replinfo', 'log_file', log_file)
    config.set('replinfo', 'log_pos', str(log_pos))
    with open(configfile, 'w+') as f:
        config.write(f)


########## repl.ini: read ##########
def read_config(configfile):
    config = configparser.ConfigParser()
    config.read(configfile)
    return (config['replinfo']['log_file'], int(config['replinfo']['log_pos']))


########## ClickHouse ##########
def ops_clickhouse(db, table, sql):
    try:
        client = clickhouse_driver.Client(host='127.0.0.1', port=9000, user='default', password='clickhouse')
        client.execute(sql)
    except Exception as error:
        message = 'Failed to execute SQL on ClickHouse.\n%s' % (error)
        # logger.error(message)
        print(message)
        sys.exit(1)


def to_literal(value):
    # Render a Python value as a SQL literal. Only simple types are handled:
    # numbers are passed through, everything else is quoted as a string.
    if value is None:
        return 'NULL'
    if isinstance(value, (int, float)):
        return str(value)
    return "'%s'" % str(value).replace('\\', '\\\\').replace("'", "\\'")


MYSQL_SETTINGS = {'host': '127.0.0.1', 'port': 13306, 'user': 'root', 'passwd': 'Root@0101'}
only_events = (DeleteRowsEvent, WriteRowsEvent, UpdateRowsEvent)


def main():
    # On every restart, resume from the log_file and log_pos saved by the last run.
    # repl.ini must already exist with a [replinfo] section.
    (log_file, log_pos) = read_config(configfile)
    print('-----------------------------------------------------------------------------')
    stream = BinLogStreamReader(connection_settings=MYSQL_SETTINGS, resume_stream=True, blocking=True,
                                server_id=10, only_tables=['t_repl'], only_schemas=['test'],
                                log_file=log_file, log_pos=log_pos, only_events=only_events,
                                fail_on_table_metadata_unavailable=True, slave_heartbeat=10)
    try:
        for binlogevent in stream:
            schema, table = binlogevent.schema, binlogevent.table
            pk = binlogevent.primary_key  # assumes a single-column primary key
            for row in binlogevent.rows:
                # DELETE
                if isinstance(binlogevent, DeleteRowsEvent):
                    info = dict(row['values'].items())
                    sql = 'ALTER TABLE `%s`.`%s` DELETE WHERE %s = %s' % (schema, table, pk, to_literal(info[pk]))
                # UPDATE
                elif isinstance(binlogevent, UpdateRowsEvent):
                    info_before = dict(row['before_values'].items())
                    info_after = dict(row['after_values'].items())
                    # ALTER TABLE ... UPDATE cannot change key columns, so the primary key
                    # is left out of SET (assuming it is also part of the ClickHouse table key).
                    info_set = ', '.join('%s = %s' % (k, to_literal(v)) for k, v in info_after.items() if k != pk)
                    sql = 'ALTER TABLE `%s`.`%s` UPDATE %s WHERE %s = %s' % (schema, table, info_set, pk, to_literal(info_before[pk]))
                # INSERT
                elif isinstance(binlogevent, WriteRowsEvent):
                    info = dict(row['values'].items())
                    sql = 'INSERT INTO `%s`.`%s` (%s) VALUES (%s)' % (schema, table, ', '.join(info.keys()), ', '.join(to_literal(v) for v in info.values()))
                ops_clickhouse('test', 't_repl', sql)
            # Save the current log_file and log_pos once the event has been applied.
            write_config(configfile, stream.log_file, stream.log_pos)
    except Exception as e:
        print(e)
    finally:
        stream.close()


if __name__ == '__main__':
    main()

'''
BinLogStreamReader() parameters
ctl_connection_settings: connection settings for the cluster holding schema information
resume_stream: start from the given position, the latest event of the binlog, or an older available event
log_file: replication start log file
log_pos: replication start log position (resume_stream should be true)
auto_position: use master_auto_position GTID to set the position
blocking: reads on the stream block
only_events: array of allowed events
ignored_events: array of ignored events
only_tables: array of tables to watch (only works with binlog_format ROW)
ignored_tables: array of tables to skip
only_schemas: array of schemas to watch
ignored_schemas: array of schemas to skip
freeze_schema: if true, ALTER TABLE is not supported; faster
skip_to_timestamp: ignore all events until the specified timestamp is reached
report_slave: report the slave in SHOW SLAVE HOSTS
slave_uuid: report slave_uuid in SHOW SLAVE HOSTS
fail_on_table_metadata_unavailable: raise an exception if table information for row_events cannot be fetched
slave_heartbeat: (seconds) whether the master should actively send heartbeats on the connection. This also reduces
    GTID replication traffic when replication resumes (when many events in the binlog are skipped).
    See MASTER_HEARTBEAT_PERIOD in the MySQL documentation for the semantics.
'''
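The script stores its resume point in `repl.ini` and rewrites that file as it goes. If the process dies in the middle of a write, the file can be left truncated and the next start will fail to read it. The sketch below shows one possible way to make the checkpoint write atomic. The function names `save_position` and `load_position` are hypothetical and are not part of the original script.

```python
import configparser
import os
import tempfile


def save_position(path, log_file, log_pos):
    """Write the binlog position to `path` atomically (hypothetical helper)."""
    config = configparser.ConfigParser()
    config['replinfo'] = {'log_file': log_file, 'log_pos': str(log_pos)}
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a reader never sees a half-written checkpoint.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            config.write(f)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def load_position(path, default=(None, None)):
    """Return (log_file, log_pos), or `default` when no checkpoint exists."""
    config = configparser.ConfigParser()
    if not config.read(path) or not config.has_section('replinfo'):
        return default
    section = config['replinfo']
    return section['log_file'], int(section['log_pos'])
```

Returning a default instead of raising lets the caller decide what a first run should do, for example starting the stream without an explicit `log_file` and `log_pos`.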
Further notes:
MySQL backup: incremental sync
MySQL incremental sync relies mainly on binlog files, which record the operations that modify the database.
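To make the idea concrete, here is a minimal sketch of incremental sync as "replay the recorded changes in order". It uses plain dictionaries rather than a real binlog. The record layout (`type`, `values`, `before_values`, `after_values`) and the `id` key column are assumptions chosen for illustration, and the sample rows are made up.

```python
def apply_change(replica, change):
    """Apply one change record to `replica`, a dict keyed by primary key."""
    kind = change['type']
    if kind == 'insert':
        row = change['values']
        replica[row['id']] = dict(row)
    elif kind == 'update':
        before, after = change['before_values'], change['after_values']
        # Drop the old row first in case the key itself changed.
        replica.pop(before['id'], None)
        replica[after['id']] = dict(after)
    elif kind == 'delete':
        replica.pop(change['values']['id'], None)
    else:
        raise ValueError('unknown change type: %r' % kind)


def replay(replica, changes):
    """Apply the changes in order and return the updated replica."""
    for change in changes:
        apply_change(replica, change)
    return replica


# Made-up change log: two inserts, one update, one delete.
changes = [
    {'type': 'insert', 'values': {'id': 1, 'name': 'a'}},
    {'type': 'insert', 'values': {'id': 2, 'name': 'b'}},
    {'type': 'update',
     'before_values': {'id': 1, 'name': 'a'},
     'after_values': {'id': 1, 'name': 'c'}},
    {'type': 'delete', 'values': {'id': 2, 'name': 'b'}},
]
replica = replay({}, changes)
print(replica)  # {1: {'id': 1, 'name': 'c'}}
```

Because only the changes since the last saved position need to be replayed, a restart does not require copying the whole table again.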
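1. Why back up data
Businesses differ in how much they depend on 24/7 availability and on their data. Database data is core data and critical to a company's operations, so backing it up is especially important.
2. Backing up a database
MySQL ships with its own backup command, `mysqldump`. Basic syntax: `mysqldump -u username -p dbname > filename.sql` (the command prompts for the password; to pass it inline, write it directly after `-p` with no space, as in the example below).
Run the backup command: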
`mysqldump -uroot -pmysqladmin db_test > /opt/mysql_bak.sql`
View the backup contents, hiding comment lines and blank lines:
`grep -Ev '#|\*|--|^$' /opt/mysql_bak.sql`
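The same filtering can be done in Python, for example when a script needs to inspect a dump. The sketch below drops every line that contains `#`, `*`, or `--`, and every empty line. The function name `strip_dump_noise` is hypothetical, and the sample dump text is made up for illustration.

```python
import re

# Lines containing '#', '*' or '--', and empty lines, count as noise.
NOISE = re.compile(r'#|\*|--|^$')


def strip_dump_noise(text):
    """Return the lines of a dump that do not match the noise pattern."""
    return [line for line in text.splitlines() if not NOISE.search(line)]


# Made-up sample in the general shape of a mysqldump file.
sample = """-- MySQL dump (made-up sample)
/*!40101 SET NAMES utf8 */;

DROP TABLE IF EXISTS `t_repl`;
CREATE TABLE `t_repl` (
  `id` int NOT NULL
);
INSERT INTO `t_repl` VALUES (1),(2);
"""

for line in strip_dump_noise(sample):
    print(line)
```

The pattern matches anywhere in a line, so a data row that happens to contain `#`, `*`, or `--` is hidden as well. That is acceptable for a quick look at a dump, but not for processing it.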
