Deploying a MariaDB Galera Cluster on CentOS 7.4
MariaDB Galera Cluster installation:
Operating system: CentOS 7.4
Cluster size: 3 nodes
Host information:
    192.168.153.142 node1 selinux=disabled firewalld stopped
    192.168.153.143 node2 selinux=disabled firewalld stopped
    192.168.153.144 node3 selinux=disabled firewalld stopped
Setup steps
1. Name resolution between the hosts: run on all three nodes
    vim /etc/hosts
    192.168.153.142 node1
    192.168.153.143 node2
    192.168.153.144 node3
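If you would rather script this step than edit the file by hand, the following is a minimal sketch. The helper name add_host_entry is hypothetical (it is not part of any package); it only appends an entry when the hostname is not already present, so it can be run repeatedly.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME
# Hypothetical helper: append "IP NAME" to FILE unless NAME already appears
# as a hostname field on a non-comment line.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # awk exits 0 when the name is found, non-zero otherwise.
    if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on each of the three nodes):
#   add_host_entry /etc/hosts 192.168.153.142 node1
#   add_host_entry /etc/hosts 192.168.153.143 node2
#   add_host_entry /etc/hosts 192.168.153.144 node3
```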
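2. Install the packages
Method 1: yum (yum install -y MariaDB-server MariaDB-client galera)
Configure the base yum source and the MariaDB Galera source. The base yum source here is a mounted ISO. Set up the MariaDB yum repository and install the packages on every node. Edit the repository file: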
vi /etc/yum.repos.d/mariadb.repo
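    [mariadb]
    name = MariaDB
    baseurl = http://yum.mariadb.org/10.3.5/centos74-amd64
    gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
    gpgcheck=1
    enabled=0
Because the repository is defined with enabled=0, either change it to enabled=1 or pass --enablerepo=mariadb to yum install.
Installing galera requires its dependency boost-program-options.x86_64, which can be installed directly from the yum source.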
Method 2: RPM packages (install on all three nodes)
Download the following RPM packages:
    galera-25.3.23-1.rhel7.el7.centos.x86_64.rpm
    MariaDB-10.3.5-centos74-x86_64-client.rpm
    MariaDB-10.3.5-centos74-x86_64-compat.rpm
    MariaDB-10.3.5-centos74-x86_64-common.rpm
    MariaDB-10.3.5-centos74-x86_64-server.rpm
Install them in this order:
    rpm -ivh MariaDB-10.3.5-centos74-x86_64-compat.rpm --nodeps
    rpm -ivh MariaDB-10.3.5-centos74-x86_64-common.rpm
    rpm -ivh MariaDB-10.3.5-centos74-x86_64-client.rpm
    yum install -y boost-program-options.x86_64    (dependency required by galera)
    rpm -ivh galera-25.3.23-1.rhel7.el7.centos.x86_64.rpm
    rpm -ivh MariaDB-10.3.5-centos74-x86_64-server.rpm
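3. Initialize MariaDB (run on all three nodes)
After installation you are prompted to initialize MariaDB (set the password):
    systemctl start mariadb
    mysql_secure_installation    (follow the prompts to set the MySQL root password)
    systemctl stop mariadb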
4. Configure Galera
On the first node, edit the configuration file server.cnf:
    vim /etc/my.cnf.d/server.cnf
    [galera]
    wsrep_on=ON
    wsrep_provider=/usr/lib64/galera/libgalera_smm.so
    wsrep_cluster_address="gcomm://192.168.153.142,192.168.153.143,192.168.153.144"
    wsrep_node_name=node1
    wsrep_node_address=192.168.153.142
    binlog_format=row
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    wsrep_slave_threads=1
    innodb_flush_log_at_trx_commit=0
    innodb_buffer_pool_size=120M
    wsrep_sst_method=rsync
    wsrep_causal_reads=ON
Copy this file to node2 and node3, and change wsrep_node_name and wsrep_node_address to the hostname and IP address of the respective node.
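Since only two values differ between the nodes, the [galera] section can also be generated rather than copied and edited by hand. The sketch below assumes a hypothetical helper render_galera_cnf and a hypothetical CLUSTER_ADDRS variable; it prints the same settings as above to standard output and leaves it to you where the result is written.

```shell
#!/usr/bin/env bash
# Comma-separated member list; defaults to the three nodes of this article.
CLUSTER_ADDRS="${CLUSTER_ADDRS:-192.168.153.142,192.168.153.143,192.168.153.144}"

# render_galera_cnf NODE_NAME NODE_IP
# Hypothetical helper: print the [galera] section for one node.
render_galera_cnf() {
    local node_name="$1" node_ip="$2"
    cat <<EOF
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://${CLUSTER_ADDRS}"
wsrep_node_name=${node_name}
wsrep_node_address=${node_ip}
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_slave_threads=1
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=120M
wsrep_sst_method=rsync
wsrep_causal_reads=ON
EOF
}

# Usage: print the section for node2 and review it before installing it.
#   render_galera_cnf node2 192.168.153.143
```

Note that the output contains only the [galera] section; any other sections already present in server.cnf have to be kept separately.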
5. Start the cluster service
Bootstrap the MariaDB Galera Cluster on the first node:
    [root@node1 ~]# /bin/galera_new_cluster
Start the remaining two nodes with:
    [root@node2 ~]# systemctl start mariadb
Check that the service is listening (the cluster service uses ports 4567 and 3306):
    [root@node1 ~]# netstat -tulpn | grep -e 4567 -e 3306
    tcp        0      0 0.0.0.0:4567      0.0.0.0:*      LISTEN      3557/mysqld
    tcp6       0      0 :::3306           :::*           LISTEN      3557/mysqld
6. Verify the cluster status
On node1, log in to the database:
    [root@node1 ~]# mysql -uroot -p
Check whether the galera plugin is enabled:
    MariaDB [(none)]> show status like "wsrep_ready";
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wsrep_ready   | ON    |
    +---------------+-------+
    1 row in set (0.004 sec)
Current number of machines in the cluster:
    MariaDB [(none)]> show status like "wsrep_cluster_size";
    +--------------------+-------+
    | Variable_name      | Value |
    +--------------------+-------+
    | wsrep_cluster_size | 3     |
    +--------------------+-------+
    1 row in set (0.001 sec)
Full cluster status:
    MariaDB [(none)]> show status like "wsrep%";
    +------------------------------+----------------------------------------------------------------+
    | Variable_name                | Value                                                          |
    +------------------------------+----------------------------------------------------------------+
    | wsrep_apply_oooe             | 0.000000                                                       |
    | wsrep_apply_oool             | 0.000000                                                       |
    | wsrep_apply_window           | 1.000000                                                       |
    | wsrep_causal_reads           | 14                                                             |
    | wsrep_cert_deps_distance     | 1.200000                                                       |
    | wsrep_cert_index_size        | 3                                                              |
    | wsrep_cert_interval          | 0.000000                                                       |
    | wsrep_cluster_conf_id        | 22                                                             |
    | wsrep_cluster_size           | 3                                                              |
    | wsrep_cluster_state_uuid     | b8ecf355-233a-11e8-825e-bb38179b0eb4                           |
    | wsrep_cluster_status         | Primary                                                        |
    | wsrep_commit_oooe            | 0.000000                                                       |
    | wsrep_commit_oool            | 0.000000                                                       |
    | wsrep_commit_window          | 1.000000                                                       |
    | wsrep_connected              | ON                                                             |
    | wsrep_desync_count           | 0                                                              |
    | wsrep_evs_delayed            |                                                                |
    | wsrep_evs_evict_list         |                                                                |
    | wsrep_evs_repl_latency       | 0/0/0/0/0                                                      |
    | wsrep_evs_state              | OPERATIONAL                                                    |
    | wsrep_flow_control_paused    | 0.000000                                                       |
    | wsrep_flow_control_paused_ns | 0                                                              |
    | wsrep_flow_control_recv      | 0                                                              |
    | wsrep_flow_control_sent      | 0                                                              |
    | wsrep_gcomm_uuid             | 0eba3aff-2341-11e8-b45a-f277db2349d5                           |
    | wsrep_incoming_addresses     | 192.168.153.142:3306,192.168.153.143:3306,192.168.153.144:3306 |
    | wsrep_last_committed         | 9                                                              |
    | wsrep_local_bf_aborts        | 0                                                              |
    | wsrep_local_cached_downto    | 5                                                              |
    | wsrep_local_cert_failures    | 0                                                              |
    | wsrep_local_commits          | 4                                                              |
    | wsrep_local_index            | 0                                                              |
    | wsrep_local_recv_queue       | 0                                                              |
    | wsrep_local_recv_queue_avg   | 0.057143                                                       |
    | wsrep_local_recv_queue_max   | 2                                                              |
    | wsrep_local_recv_queue_min   | 0                                                              |
    | wsrep_local_replays          | 0                                                              |
    | wsrep_local_send_queue       | 0                                                              |
    | wsrep_local_send_queue_avg   | 0.000000                                                       |
    | wsrep_local_send_queue_max   | 1                                                              |
    | wsrep_local_send_queue_min   | 0                                                              |
    | wsrep_local_state            | 4                                                              |
    | wsrep_local_state_comment    | Synced                                                         |
    | wsrep_local_state_uuid       | b8ecf355-233a-11e8-825e-bb38179b0eb4                           |
    | wsrep_protocol_version       | 8                                                              |
    | wsrep_provider_name          | Galera                                                         |
    | wsrep_provider_vendor        | Codership Oy <info@codership.com>                              |
    | wsrep_provider_version       | 25.3.23(r3789)                                                 |
    | wsrep_ready                  | ON                                                             |
    | wsrep_received               | 35                                                             |
    | wsrep_received_bytes         | 5050                                                           |
    | wsrep_repl_data_bytes        | 1022                                                           |
    | wsrep_repl_keys              | 14                                                             |
    | wsrep_repl_keys_bytes        | 232                                                            |
    | wsrep_repl_other_bytes       | 0                                                              |
    | wsrep_replicated             | 5                                                              |
    | wsrep_replicated_bytes       | 1600                                                           |
    | wsrep_thread_count           | 2                                                              |
    +------------------------------+----------------------------------------------------------------+
    58 rows in set (0.003 sec)
Notes on selected variables:
    wsrep_cluster_size          ## number of cluster members
    wsrep_cluster_state_uuid    ## UUID that uniquely identifies the cluster
    wsrep_cluster_status        ## Primary means this node belongs to the primary component
    wsrep_connected             ## whether the node is currently connected
    wsrep_incoming_addresses    ## database nodes currently connected
    wsrep_last_committed        ## sequence number of the last committed transaction
    wsrep_local_bf_aborts       ## local transactions aborted by replicated transactions
    wsrep_local_cert_failures   ## local transactions that failed certification
    wsrep_local_commits         ## transactions committed locally
    wsrep_local_send_queue      ## current length of the local send queue
    wsrep_local_send_queue_avg  ## average length of the send queue
    wsrep_local_state_uuid      ## cluster ID as stored on this node
    wsrep_ready                 ## whether the plugin is active and the node accepts queries
    wsrep_received              ## number of write-sets received through replication
    wsrep_replicated            ## number of write-sets sent through replication
    wsrep_replicated_bytes      ## bytes sent through replication
Connected hosts:
    MariaDB [(none)]> show status like "wsrep_incoming_addresses";
    +--------------------------+----------------------------------------------------------------+
    | Variable_name            | Value                                                          |
    +--------------------------+----------------------------------------------------------------+
    | wsrep_incoming_addresses | 192.168.153.142:3306,192.168.153.143:3306,192.168.153.144:3306 |
    +--------------------------+----------------------------------------------------------------+
    1 row in set (0.002 sec)
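The four values that matter most in the output above (wsrep_ready, wsrep_cluster_status, wsrep_local_state_comment and wsrep_cluster_size) can be checked in one pass. The sketch below assumes a hypothetical helper check_wsrep_status that reads tab-separated "name value" lines, which is the form the mysql client prints with the -N and -B options, and succeeds only when the node looks healthy.

```shell
#!/usr/bin/env bash
# check_wsrep_status EXPECTED_SIZE
# Hypothetical helper: reads "name<TAB>value" lines on stdin and exits 0 only
# if the node is ready, in the Primary component, Synced, and sees the
# expected number of members.
check_wsrep_status() {
    local expected_size="$1"
    awk -F'\t' -v want="$expected_size" '
        $1 == "wsrep_ready"               { ready  = $2 }
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_local_state_comment" { state  = $2 }
        $1 == "wsrep_cluster_size"        { size   = $2 }
        END {
            printf "ready=%s status=%s state=%s size=%s\n", ready, status, state, size
            ok = (ready == "ON" && status == "Primary" && state == "Synced" && size == want)
            exit !ok
        }'
}

# Usage on any node (prompts for the root password):
#   mysql -uroot -p -N -B -e "SHOW STATUS LIKE 'wsrep%'" | check_wsrep_status 3
```

The exit status makes the check usable from a monitoring script; the printed summary line is only there for a human reading the output.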
7. Test whether MariaDB data is synchronized across the cluster
    MariaDB [(none)]> create database lizk;
    Query OK, 1 row affected (0.010 sec)

    MariaDB [(none)]> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | china              |
    | hello              |
    | hi                 |
    | information_schema |
    | lizk               |
    | mysql              |
    | performance_schema |
    | test               |
    +--------------------+
    8 rows in set (0.001 sec)
On the other two nodes you can see that the lizk database has been synchronized.
8. Handling a simulated split-brain
This step simulates packet loss caused by network jitter, where two nodes lose contact with the cluster and a split-brain results. Run the following on each of the two nodes 192.168.153.143 and 192.168.153.144:
    iptables -A INPUT -p tcp --sport 4567 -j DROP
    iptables -A INPUT -p tcp --dport 4567 -j DROP
These commands block traffic on port 4567, which wsrep uses for synchronous replication.
Check on node 192.168.153.142:
    MariaDB [(none)]> show status like "ws%";
    +------------------------------+--------------------------------------------+
    | Variable_name                | Value                                      |
    +------------------------------+--------------------------------------------+
    | wsrep_apply_oooe             | 0.000000                                   |
    | wsrep_apply_oool             | 0.000000                                   |
    | wsrep_apply_window           | 1.000000                                   |
    | wsrep_causal_reads           | 16                                         |
    | wsrep_cert_deps_distance     | 1.125000                                   |
    | wsrep_cert_index_size        | 3                                          |
    | wsrep_cert_interval          | 0.000000                                   |
    | wsrep_cluster_conf_id        | 18446744073709551615                       |
    | wsrep_cluster_size           | 1                                          |
    | wsrep_cluster_state_uuid     | b8ecf355-233a-11e8-825e-bb38179b0eb4       |
    | wsrep_cluster_status         | non-Primary                                |
The split has now occurred: node1 sees a cluster of size 1, its status is non-Primary, and it cannot execute any statements.
To resolve this, run set global wsrep_provider_options="pc.bootstrap=true"; on the node, which forces the isolated node to form a new primary component.
    MariaDB [(none)]> set global wsrep_provider_options="pc.bootstrap=true";
    Query OK, 0 rows affected (0.015 sec)

    MariaDB [(none)]> select @@wsrep_node_name;
    +-------------------+
    | @@wsrep_node_name |
    +-------------------+
    | node1             |
    +-------------------+
    1 row in set (0.478 sec)
Finally, restore nodes 192.168.153.143 and 192.168.153.144. Flushing the iptables rules is enough here because this is a test environment; in production, delete only the two rules added above instead of flushing everything.
    [root@node3 mysql]# iptables -F
Verify after the restore:
    MariaDB [(none)]> show status like "wsrep_cluster_size";
    +--------------------+-------+
    | Variable_name      | Value |
    +--------------------+-------+
    | wsrep_cluster_size | 3     |
    +--------------------+-------+
    1 row in set (0.001 sec)
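For a production host, where flushing every rule is not acceptable, the two DROP rules added in this step can be removed individually with iptables -D and the same rule specification. The sketch below only prints the commands so they can be reviewed first; the helper name galera_unblock_cmds is hypothetical.

```shell
#!/usr/bin/env bash
# galera_unblock_cmds [PORT]
# Hypothetical helper: print the iptables commands that delete exactly the two
# DROP rules added for the test (default port 4567). Nothing is executed here.
galera_unblock_cmds() {
    local port="${1:-4567}"
    printf 'iptables -D INPUT -p tcp --sport %s -j DROP\n' "$port"
    printf 'iptables -D INPUT -p tcp --dport %s -j DROP\n' "$port"
}

# Usage on node2 and node3:
#   galera_unblock_cmds          # review the two commands
#   galera_unblock_cmds | sh     # then apply them as root
```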
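9. Two nodes of the cluster have to be shut down for inspection because of a fault: does data synchronize after the services are restarted?
Stop mariadb on 192.168.153.143 and 192.168.153.144:
    [root@node2 mysql]# systemctl stop mariadb
Insert data on node 192.168.153.142:
    MariaDB [test]> select * from test1;
    +------+
    | id   |
    +------+
    |    2 |
    |    2 |
    |    1 |
    |    3 |
    +------+
    4 rows in set (0.007 sec)
Now restart the other two nodes of the cluster and check data consistency: their data matches the data on the first node.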
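10. Exception handling: if the machine room suddenly loses power, all Galera hosts shut down uncleanly, and after power returns the Galera cluster service fails to start normally. How should this be handled?
    Step 1: start the mariadb service on the cluster's bootstrap host (the node that founded the cluster).
    Step 2: start the mariadb service on the cluster's member hosts.
Exception handling: the mysql service starts on neither the bootstrap host nor the member hosts. How should this be handled?
    Solution 1, step 1: delete the state file /var/lib/mysql/grastate.dat on the bootstrap host, then start the service with /bin/galera_new_cluster. The service starts normally. Log in and check the wsrep status.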
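Before deleting grastate.dat it may be worth looking at what it records, since the file normally holds the cluster UUID, the last known sequence number (seqno) and, in recent Galera versions, a safe_to_bootstrap flag. The sketch below assumes a hypothetical helper grastate_field that extracts one field from that file. Comparing seqno across the three nodes can suggest which node holds the most recent data; after an unclean shutdown the value is often -1, in which case the file alone does not settle the choice.

```shell
#!/usr/bin/env bash
# grastate_field FILE KEY
# Hypothetical helper: print the value of "KEY: value" from a grastate.dat file.
grastate_field() {
    local file="$1" key="$2"
    # Fields are separated by a colon followed by optional whitespace.
    awk -F':[[:space:]]*' -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Usage on each node (run before removing the file):
#   grastate_field /var/lib/mysql/grastate.dat seqno
#   grastate_field /var/lib/mysql/grastate.dat safe_to_bootstrap
```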