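Requirement: for disaster recovery, two servers are deployed so that they can fail over. Once the primary server goes down, the two servers swap roles permanently: even after the former primary server recovers, the original roles are not restored until the next failover.
Analysis: mongodb offers two ways to replicate data:
1. Master-slave. When the master goes down, the slave cannot serve writes, so this scheme does not fit.
2. Replica set. With only two members, when the primary node goes down the remaining secondary cannot win a majority of votes and therefore cannot serve writes either, which is no better than master-slave. At least three mongodb instances are needed, but only two servers are available, so one server has to run two instances. One machine does not need two copies of the data, so the third member is an arbiter: it only votes, is never elected primary, and stores no data.
Given the failover logic in the requirement and the way replica sets behave, the arbiter has to run on the standby server. Because the two servers swap roles, each server needs two mongodb instances installed, one of which is an arbiter.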
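The majority rule behind this analysis can be checked with a little arithmetic. The sketch below is only an illustration; the function names `majority` and `can_elect_after_loss` are made up for this example and are not part of mongodb.

```shell
#!/bin/bash
# Votes needed to elect a primary: a strict majority of the voting members.
majority() {
    local voters=$1
    echo $(( voters / 2 + 1 ))
}

# Succeeds if a set of $1 voting members can still elect a primary
# after $2 of them have gone down.
can_elect_after_loss() {
    local voters=$1 lost=$2
    [ $(( voters - lost )) -ge "$(majority "$voters")" ]
}

for voters in 2 3; do
    if can_elect_after_loss "$voters" 1; then
        echo "$voters voters, 1 down: a primary can still be elected"
    else
        echo "$voters voters, 1 down: no majority, no writes"
    fi
done
```

With two voters the loss of one leaves 1 of the 2 required votes; with three voters (two data nodes plus the arbiter) the loss of one still leaves 2 of 2.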
The two servers:
192.168.103.5 (original primary server)
192.168.103.28 (original standby server)
I. Install mongodb on both servers
For the installation steps, see: https://www.cnblogs.com/qq931399960/p/10425803.html
II. Install an arbiter instance on both servers
1. Create the log and data directories
mkdir -p /home/mongo-arbiter/db
mkdir -p /home/mongo-arbiter/log
2. Copy the configuration file
cp /etc/mongod.conf /etc/mongod-arbiter.conf
3. Edit the configuration file
vim /etc/mongod-arbiter.conf
systemLog.path
change to
/home/mongo-arbiter/log/mongod.log
storage.dbPath
change to
/home/mongo-arbiter/db
processManagement.pidFilePath
change to
/var/run/mongodb/mongod-arbiter.pid
net.port
change to
27018
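The four edits above can also be applied non-interactively. This is a minimal sketch, assuming the config file uses the usual YAML layout with one `path:`, `dbPath:`, `pidFilePath:` and `port:` line each; `make_arbiter_conf` is a name invented for this example.

```shell
#!/bin/bash
# Derive the arbiter config from an existing mongod config file.
# Only the four values changed in step 3 are rewritten; indentation is kept.
make_arbiter_conf() {
    local src=$1 dst=$2
    sed -e 's|^\( *path:\).*|\1 /home/mongo-arbiter/log/mongod.log|' \
        -e 's|^\( *dbPath:\).*|\1 /home/mongo-arbiter/db|' \
        -e 's|^\( *pidFilePath:\).*|\1 /var/run/mongodb/mongod-arbiter.pid|' \
        -e 's|^\( *port:\).*|\1 27018|' \
        "$src" > "$dst"
}

# Usage (as root):
#   make_arbiter_conf /etc/mongod.conf /etc/mongod-arbiter.conf
```

Diff the result against /etc/mongod.conf before starting the instance, since a config with a different layout would pass through unchanged.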
4. Fix ownership
chown -R mongod:mongod /home/mongo-arbiter
5. Start the arbiter at boot
cp /usr/lib/systemd/system/mongod.service /usr/lib/systemd/system/mongod-arbiter.service
Edit the new unit file
vim /usr/lib/systemd/system/mongod-arbiter.service
Environment
change to:
"OPTIONS=-f /etc/mongod-arbiter.conf"
PIDFile
change to:
/var/run/mongodb/mongod-arbiter.pid
Apply the changes and start the instance
systemctl daemon-reload
systemctl enable mongod-arbiter.service
systemctl start mongod-arbiter.service
Initialize the users (all four instances must have the same users)
cat ./mongdb/initUser.js | mongo --port 27018 --shell
6. Verify the login
mongo -uroot -pabc123 --authenticationDatabase "admin" --port 27018
If the login succeeds, the instance is set up correctly.
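III. Configure the replica set
1. Stop all mongodb instances
2. Generate the keyfile
openssl rand -base64 756 > /home/replSetKey
chmod 400 /home/replSetKey
chown -R mongod:mongod /home/replSetKey
Then copy this file to /home on the other server and set the same permissions and owner there (the path is up to you, as long as all four instances use the same keyfile)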
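A quick way to confirm that both servers really hold the same keyfile with safe permissions is to compare checksums. The helper below is a sketch, not part of the original setup; `check_keyfile` is a made-up name, and it assumes GNU `stat` and `sha256sum` as found on CentOS.

```shell
#!/bin/bash
# Print the SHA-256 of a keyfile, refusing files that are missing or
# readable by group/others (mongod rejects keyfiles with open permissions).
check_keyfile() {
    local file=$1 mode
    [ -f "$file" ] || { echo "missing: $file"; return 1; }
    mode=$(stat -c '%a' "$file")
    case "$mode" in
        400|600) ;;
        *) echo "mode $mode is too open for $file"; return 1 ;;
    esac
    sha256sum "$file" | cut -d ' ' -f 1
}

# Usage: run on each server and compare the printed checksums by eye.
#   check_keyfile /home/replSetKey
```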
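3. Add or change the following settings in the configuration file of every instance (the settings are identical for all four mongodb instances; only new and changed settings are listed, unchanged ones are omitted)
security:
  keyFile: /home/replSetKey  ## added
replication:
  replSetName: mongoSet  ## added
net:
  bindIp: 0.0.0.0  ## changed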
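4. Start the default instance (port 27017) on the primary server, and both instances on the standby server
systemctl start mongod
systemctl start mongod-arbiter
5. Check that the firewall is off, or that ports 27017 and 27018 are open
If the firewall is running and these ports cannot be reached, initializing the replica set fails.
To turn the firewall off:
systemctl stop firewalld
systemctl disable firewalld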
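Before initializing the replica set it can help to test reachability from the other server. The following is a rough sketch that relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command; `port_open` is a name chosen for this example.

```shell
#!/bin/bash
# Succeeds if a TCP connection to host:port can be opened within 2 seconds.
port_open() {
    local host=$1 port=$2
    timeout 2 bash -c ": </dev/tcp/$host/$port" 2>/dev/null
}

# Usage, run on 192.168.103.5 with the addresses from this article:
#   for port in 27017 27018; do
#       port_open 192.168.103.28 "$port" || echo "cannot reach 192.168.103.28:$port"
#   done
```

If disabling firewalld entirely is not acceptable, opening only the two ports with `firewall-cmd --permanent --add-port=27017/tcp`, the same for 27018, and then `firewall-cmd --reload` should be enough; this variant was not part of the original setup.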
6. On the primary server (103.5), open the mongo shell
mongo -u "root" -p "abc123" --authenticationDatabase "admin"
7. Initialize a two-member replica set
config = {_id:"mongoSet", members:[{_id:0,host:"192.168.103.5:27017",priority:2},
{_id:1,host:"192.168.103.28:27017",priority:1}]}
rs.initiate(config)
The state can now be checked with rs.status()
8. Add the arbiter
rs.addArb("192.168.103.28:27018")
Check every member with rs.status(). If a member's stateStr shows "(not reachable/healthy)" and its health is 0, restart that member and check rs.status() again.
This completes the basic primary/standby configuration.
The Java client connects with:
mongodb://root:abc123@192.168.103.5:27017,192.168.103.28:27017/?authSource=mydb&replicaSet=mongoSet
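The connection string lists both data-bearing members as seeds and names the replica set, which is what lets the driver follow the primary after a failover. As a small sketch, the string can be assembled from its parts; `build_uri` is a helper invented for this example, and it assumes the user name and password contain no characters that need percent-encoding.

```shell
#!/bin/bash
# Build a replica set connection string.
# Arguments: user, password, replica set name, authSource, then host:port seeds.
build_uri() {
    local user=$1 pass=$2 set_name=$3 auth_db=$4
    shift 4
    local hosts
    hosts=$(IFS=,; echo "$*")   # join the remaining arguments with commas
    echo "mongodb://${user}:${pass}@${hosts}/?authSource=${auth_db}&replicaSet=${set_name}"
}

build_uri root abc123 mongoSet mydb 192.168.103.5:27017 192.168.103.28:27017
```

The arbiters are left out of the seed list because they hold no data.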
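IV. Reconfigure the replica set
After a failover the Java client keeps working without problems: because replicaSet is set in its connection string, it finds the new primary node automatically. When the original primary server comes back, however, its mongodb instance becomes the primary node again because of its higher priority, while the other services are already running on the new primary server. This causes two problems:
A. mongodb and the web service are no longer on the same machine, so every request crosses the network, which hurts performance
B. The primary node now sits on the standby server, while the new primary server hosts only a secondary node and the arbiter. If the new primary server goes down, a single mongodb node is left, which can no longer serve writes
For these two reasons, the second in particular, the replica set has to be reconfigured. Log in to the mongodb primary node
mongo -uroot -pabc123 --authenticationDatabase "admin"
or, from the new primary server, log in remotely to the recovered mongodb primary node
mongo -u "root" -p "abc123" --authenticationDatabase "admin" --host 192.168.103.5
reconfig = {_id:"mongoSet","protocolVersion" : 1, members:[{_id:0,host:"192.168.103.5:27017",priority:1},
{_id:1,host:"192.168.103.28:27017",priority:2}]}
rs.reconfig(reconfig)
Add the arbiter. rs.addArb() has to run on the current primary node; once the new priorities take effect the primary role moves to 192.168.103.28, so reconnect there if necessary
rs.addArb("192.168.103.5:27018")
Check the state with rs.status(). If a member's stateStr shows "(not reachable/healthy)" and its health is 0, restart that member and check rs.status() again.
1. A member that shows "(not reachable/healthy)" with health 0 is probably the arbiter. The official documentation advises against running an arbiter on a server that also hosts a primary or secondary member, but with only two servers there is no alternative, and the setup does run normally.
2. When reconfiguring, every host that already exists in the set must keep the same _id as before
3. The reconfiguration has to include protocolVersion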
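The three rules above (same _id per host, protocolVersion present, priorities swapped) can be captured in a small generator. This is a sketch only; `build_reconfig` is a hypothetical helper, and the _id values passed in must be the ones rs.status() or rs.conf() currently reports for each host.

```shell
#!/bin/bash
# Print a reconfig document that gives the new primary server priority 2
# and the other data node priority 1, keeping each host's existing _id.
# Arguments: set name, new primary host:port, its _id, other host:port, its _id.
build_reconfig() {
    local set_name=$1 new_primary=$2 new_primary_id=$3 other=$4 other_id=$5
    printf '{_id:"%s", protocolVersion:1, members:[{_id:%s,host:"%s",priority:2},{_id:%s,host:"%s",priority:1}]}\n' \
        "$set_name" "$new_primary_id" "$new_primary" "$other_id" "$other"
}

# The swap described in this section: 192.168.103.28 (existing _id 1) takes over.
build_reconfig mongoSet 192.168.103.28:27017 1 192.168.103.5:27017 0
```

The printed document can be pasted into the mongo shell as the argument of rs.reconfig().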
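V. Simplify the work with scripts
1. Prepare the configuration files and the arbiter's systemd service file in advance, with every required field present but the replica-set-related fields commented out: once they are enabled, no node accepts writes until the replica set has been initialized. Uncomment them after the users have been created, then restart mongodb
2. Prepare the keyfile in advance; both servers use the same file
3. Install mongodb (the data and log paths can be customized)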
#!/bin/bash
basedir=$(cd "$(dirname "$0")";pwd)
cd "$basedir"
mkdir -p ./log
## switch SELinux to permissive now and disable it from the next boot on
function initenv(){
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
}
########inst mongdb 4.0#############
function inst(){
    cd "$basedir"
    cp -f ./mongdb/mongod.conf /etc/
    rpm -ivh ./mongdb/*rpm
    cd /home
    mkdir -p mongo-db
    mkdir -p mongo-log
    chown -R mongod:mongod ./mongo-db
    chown -R mongod:mongod ./mongo-log
    systemctl enable mongod
    systemctl start mongod
}
## create the users first, then the indexes with the new root account
function initdb(){
    cd "$basedir"
    cat ./mongdb/initUser.js | mongo --shell
    cat ./mongdb/initIndex.js | mongo -u "root" -p "abc123" --authenticationDatabase "admin"
}
function cpReplicaSetKey(){
    cd "$basedir"
    cp ./mongdb/replSetKey /home/
    chmod 400 /home/replSetKey
    chown -R mongod:mongod /home/replSetKey
}
## uncomment the replica set settings prepared in mongod.conf
function openReplicaSet(){
    sed -i 's/#keyFile/keyFile/g' /etc/mongod.conf
    sed -i 's/#replication/replication/g' /etc/mongod.conf
    sed -i 's/#replSetName/replSetName/g' /etc/mongod.conf
}
function stopMongo(){
    systemctl stop mongod
}
function startMongo(){
    systemctl start mongod
}
function main(){
    initenv
    inst
    initdb
    stopMongo
    cpReplicaSetKey
    openReplicaSet
    startMongo
}
main | tee -a ./log/inst_mongodb.log
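4. Install the mongodb arbiter instance (strictly speaking it is not an arbiter until rs.addArb() has been run; until then it is just an instance that happens to be named arbiter)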
#!/bin/bash
basedir=$(cd "$(dirname "$0")";pwd)
cd "$basedir"
mkdir -p ./log
function initenv(){
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
}
########inst mongdb 4.0#############
function inst(){
    cd "$basedir"
    cp -f ./mongdb/mongod-arbiter.conf /etc/
    cp -f ./mongdb/mongod-arbiter.service /usr/lib/systemd/system/
    mkdir -p /home/mongodb-arbiter/log
    mkdir -p /home/mongodb-arbiter/db
    chown -R mongod:mongod /home/mongodb-arbiter
    systemctl enable mongod-arbiter
    systemctl start mongod-arbiter
}
function initdb(){
    cd "$basedir"
    cat ./mongdb/initUser.js | mongo --port 27018 --shell
    cat ./mongdb/initIndex.js | mongo --port 27018 -u "root" -p "abc123" --authenticationDatabase "admin"
}
## uncomment the replica set settings prepared in mongod-arbiter.conf
function openReplicaSet(){
    sed -i 's/#keyFile/keyFile/g' /etc/mongod-arbiter.conf
    sed -i 's/#replication/replication/g' /etc/mongod-arbiter.conf
    sed -i 's/#replSetName/replSetName/g' /etc/mongod-arbiter.conf
}
## restart the arbiter instance, not the default mongod service
function stopMongo(){
    systemctl stop mongod-arbiter
}
function startMongo(){
    systemctl start mongod-arbiter
}
function main(){
    initenv
    inst
    initdb
    stopMongo
    openReplicaSet
    startMongo
}
main | tee -a ./log/inst_mongodb_arbiter.log
5. Initialize the replica set
#!/bin/bash
basedir=$(cd "$(dirname "$0")";pwd)
cd "$basedir"
mkdir -p ./log
source ./app.properties
nowtime=`date --date='0 days ago' "+%Y%m%d%H%M%S"`
daytime=`date --date='0 days ago' "+%Y%m%d"`
logPath=./log/init_repliaset_$daytime.log
## print to the terminal and append to the log file
function log(){
    echo "$1" | tee -a $logPath
}
## append to the log file only
function logfile(){
    echo "$1" >> $logPath
}
function testConfig(){
    log "check relate config."
    if [ "$masterServerIp" == "masterIp" ];then
        log "ERROR: masterServerIp is not correct in the file app.properties"
        exit 1
    fi
    if [ "$slaveServerIp" == "slaveIp" ];then
        log "ERROR: slaveServerIp is not correct in the file app.properties"
        exit 1
    fi
    local grepIp=`ifconfig | grep $masterServerIp`
    ## the first initialization must run on the master server named in app.properties
    if [ "$grepIp" == "" ];then
        log "the current server is not the master server, please login on to $masterServerIp and execute this script."
        exit 1
    fi
    ## test the connection
    echo "use admin" > test.js
    echo "rs.slaveOk()" >> test.js
    echo "show dbs" >> test.js
    cat test.js | mongo --host $slaveServerIp -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27017 in $slaveServerIp server can not connect."
        exit 1;
    fi
    cat test.js | mongo --host $slaveServerIp --port 27018 -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27018 in $slaveServerIp server can not connect."
        exit 1;
    fi
    cat test.js | mongo --host $masterServerIp -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27017 in $masterServerIp server can not connect."
        exit 1;
    fi
    rm -rf test.js
    log "check relate config end."
}
function initTwoMemberReplicaSet(){
    log "init two member replica set."
    echo "use admin" > init.js
    echo "config = {_id:\"mongoSet\", members:[{_id:0,host:\"$masterServerIp:27017\",priority:2},{_id:1,host:\"$slaveServerIp:27017\",priority:1}]}" >> init.js
    echo "rs.initiate(config)" >> init.js
    cat init.js | mongo -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    ## sleep 3 seconds for the primary election.
    sleep 3s
    log "init two member replica set end."
}
## poll rs.status() once per second, for at most 30 seconds, until the master server is PRIMARY
function isElectPrimary(){
    log "check is the master node selected."
    local masterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "PRIMARY" -B 4|grep "name" | grep $masterServerIp`
    local count=0
    while [[ "$masterStr" == "" ]];do
        sleep 1s
        count=$((count+1))
        masterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "PRIMARY" -B 4|grep "name" | grep $masterServerIp`
        if [ "$masterStr" != "" ];then
            log "the master node is selected."
            break
        fi
        if [ "$count" == "30" ];then
            log "ERROR: primary election timeout, please add the arbiter manually with 'rs.addArb(\"$slaveServerIp:27018\")' in $masterServerIp:27017"
            exit 1
        fi
    done
    log "check is the master node selected end."
}
function addArbiter(){
    log "add arbiter ..."
    mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.addArb(\"$slaveServerIp:27018\")"
    log "add arbiter end."
}
function chekArbiter(){
    log "check arbiter."
    local arbiterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "$slaveServerIp:27018" -A 3| grep "stateStr"`
    ## arbiterStr holds the whole stateStr line, so match ARBITER as a substring
    if [[ "$arbiterStr" != *"ARBITER"* ]];then
        log "ERROR: arbiter $slaveServerIp:27018: $arbiterStr, you may need to restart mongodb with 'systemctl restart mongod-arbiter' command in $slaveServerIp."
    fi
    log "check arbiter end."
}
logfile "**************************$nowtime**********************************"
testConfig
initTwoMemberReplicaSet
isElectPrimary
addArbiter
chekArbiter
log "init end..."
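The script polls rs.status() in a hand-written loop with a fixed retry count. The same pattern can be factored into a small helper; the sketch below is one possible shape, with `wait_until` being a name invented here rather than something the original scripts define.

```shell
#!/bin/bash
# Run the given command once per second until it succeeds.
# Gives up and returns 1 after max_tries failed attempts.
wait_until() {
    local max_tries=$1
    shift
    local count=0
    until "$@"; do
        count=$((count + 1))
        if [ "$count" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 1
    done
}

# Usage idea (primary_is would be a user-written check, hypothetical here):
#   wait_until 30 primary_is "$masterServerIp" || echo "primary election timeout"
```

Keeping the retry logic in one place makes the timeout easy to change and leaves each check as a plain command that either succeeds or fails.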
6. Reconfigure the replica set
#!/bin/bash
basedir=$(cd "$(dirname "$0")";pwd)
cd "$basedir"
mkdir -p ./log
source ./app.properties
nowtime=`date --date='0 days ago' "+%Y%m%d%H%M%S"`
daytime=`date --date='0 days ago' "+%Y%m%d"`
logPath=./log/init_repliaset_$daytime.log
## print to the terminal and append to the log file
function log(){
    echo "$1" | tee -a $logPath
}
## append to the log file only
function logfile(){
    echo "$1" >> $logPath
}
function testConfig(){
    log "check relate config."
    if [ "$masterServerIp" == "masterIp" ];then
        log "ERROR: masterServerIp is not correct in the file app.properties"
        exit 1
    fi
    if [ "$slaveServerIp" == "slaveIp" ];then
        log "ERROR: slaveServerIp is not correct in the file app.properties"
        exit 1
    fi
    ## get current replica set master node
    local currMasterNode=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "PRIMARY" -B 4|grep "name" | cut -d ":" -f 2 | sed 's/\"//g'`
    if [ "`ifconfig | grep $currMasterNode`" == "" ];then
        log "the current server has no mongodb master node, please login on to $currMasterNode execute this script."
        exit 1
    fi
    ## test the connection
    echo "use admin" > test.js
    echo "rs.slaveOk()" >> test.js
    echo "show dbs" >> test.js
    cat test.js | mongo --host $slaveServerIp -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27017 in $slaveServerIp server can not connect."
        exit 1;
    fi
    cat test.js | mongo --host $masterServerIp -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27017 in $masterServerIp server can not connect."
        exit 1;
    fi
    cat test.js | mongo --host $slaveServerIp --port 27018 -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    if [ $? != 0 ];then
        log "mongodb instance which port is 27018 in $slaveServerIp server can not connect."
        ## Sometimes, after both servers have restarted, one arbiter instance fails to start because it is
        ## not a member of the replica set while its config file still carries the replica set settings.
        ## It still has to be added to the set after the reconfig, so do not exit here; only log the error.
        ## rs.addArb() succeeds even if the instance is not running, but make sure the arbiter will come up:
        ## otherwise the primary's log keeps printing errors about it, even after rs.remove().
    fi
    rm -rf test.js
    log "check relate config end."
}
function initTwoMemberReplicaSet(){
    log "reconfig two member replica set."
    echo "use admin" > reconfig.js
    ## the _id must be the same as before
    ## find the master server id in previous replica set
    local masterId=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "$masterServerIp:27017" -B 1|grep "_id" | cut -d ":" -f 2|sed 's/,//g'`
    ## find the slave server id in previous replica set
    local slaveId=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "$slaveServerIp:27017" -B 1|grep "_id" | cut -d ":" -f 2|sed 's/,//g'`
    echo "config = {_id:\"mongoSet\", protocolVersion:1, members:[{_id:$masterId,host:\"$masterServerIp:27017\",priority:2},{_id:$slaveId,host:\"$slaveServerIp:27017\",priority:1}]}" >> reconfig.js
    echo "rs.reconfig(config, {force:true})" >> reconfig.js
    cat reconfig.js | mongo -uroot -pabc123 --authenticationDatabase "admin" >> $logPath
    ## sleep 3 seconds for the primary election.
    sleep 3s
    log "reconfig two member replica set end."
}
## poll rs.status() once per second, for at most 30 seconds, until the master server is PRIMARY
function isElectPrimary(){
    log "check is the master node selected."
    local masterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "PRIMARY" -B 4|grep "name" | grep $masterServerIp`
    local count=0
    while [[ "$masterStr" == "" ]];do
        sleep 1s
        count=$((count+1))
        masterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "PRIMARY" -B 4|grep "name" | grep $masterServerIp`
        if [ "$masterStr" != "" ];then
            log "the master node is selected."
            break
        fi
        if [ "$count" == "30" ];then
            log "ERROR: primary election timeout, please add the arbiter manually with 'rs.addArb(\"$slaveServerIp:27018\")' in $masterServerIp:27017"
            exit 1
        fi
    done
    log "check is the master node selected end."
}
function addArbiter(){
    log "remove arbiter in the master server."
    ## the master server must not host the arbiter; after the primary has been re-elected,
    ## pass --host explicitly so that remove and add run against the new primary
    local beforeArbiterIp=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "ARBITER" -B 4|grep "name" | cut -d ":" -f 2 | sed 's/\"//g'`
    if [ "`echo $masterServerIp | grep $beforeArbiterIp`" != "" ];then
        log "remove $masterServerIp:27018 in replica set."
        mongo -uroot -pabc123 --authenticationDatabase "admin" --host $masterServerIp --eval "rs.remove(\"$masterServerIp:27018\")"
    fi
    log "remove arbiter in the master server end."
    log "add arbiter ..."
    mongo -uroot -pabc123 --authenticationDatabase "admin" --host $masterServerIp --eval "rs.addArb(\"$slaveServerIp:27018\")"
    log "add arbiter end."
}
function chekArbiter(){
    log "check arbiter."
    local arbiterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "$slaveServerIp:27018" -A 3 | grep "ARBITER"`
    ## the arbiter needs time to connect, wait up to 15 seconds
    local count=0
    while [[ "$arbiterStr" == "" ]];do
        sleep 1s
        count=$((count+1))
        arbiterStr=`mongo -uroot -pabc123 --authenticationDatabase "admin" --eval "rs.status()" | grep "stateStr" -B 4|grep "$slaveServerIp:27018" -A 3 | grep "ARBITER"`
        if [ "$arbiterStr" != "" ];then
            log "the arbiter node is selected."
            break
        fi
        ## wait for 15 seconds
        if [ "$count" == "15" ];then
            log "ERROR: arbiter $slaveServerIp:27018: $arbiterStr, you may need to restart mongodb with 'systemctl restart mongod-arbiter' command in $slaveServerIp."
            break
        fi
    done
    log "check arbiter end."
}
logfile "**************************$nowtime**********************************"
testConfig
initTwoMemberReplicaSet
isElectPrimary
addArbiter
chekArbiter
log "init end..."
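In the scripts, rs.addArb() succeeds even when the arbiter instance is not running or does not exist, but the primary node's log then keeps printing errors about that arbiter. The errors continue even after the arbiter has been removed with rs.remove(); restarting the primary node makes them stop.
7. The configuration file app.properties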
masterServerIp=masterIp
slaveServerIp=slaveIp
Before initializing or reconfiguring the replica set, make sure the values in app.properties are correct
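The scripts only check that the two placeholders have been replaced. A slightly stricter check could look like the sketch below; `validate_properties` is a hypothetical helper, and the pattern only tests the dotted-quad shape, not that each number is 255 or less.

```shell
#!/bin/bash
# Load an app.properties file and check that both entries look like
# IPv4 addresses and differ from each other.
validate_properties() {
    local file=$1 masterServerIp="" slaveServerIp="" name
    local ipv4='^([0-9]{1,3}\.){3}[0-9]{1,3}$'
    # the sourced assignments land in the two local variables above
    source "$file" || return 1
    for name in masterServerIp slaveServerIp; do
        if [[ ! ${!name} =~ $ipv4 ]]; then
            echo "ERROR: $name is not a valid IPv4 address in $file"
            return 1
        fi
    done
    if [ "$masterServerIp" == "$slaveServerIp" ]; then
        echo "ERROR: masterServerIp and slaveServerIp must differ"
        return 1
    fi
}

# Usage: validate_properties ./app.properties || exit 1
```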
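8. Whether the replica set is being initialized or reconfigured, "(not reachable/healthy)" tends to appear the first few times. When it does, follow the hint printed by the script and run systemctl restart mongod-arbiter on the server it names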