openGauss Study Notes 135: openGauss Database O&M - Routine Maintenance - Checking openGauss Health Status
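135.1 Check Method
The gs_check tool provided by openGauss can be used to check the health status of openGauss.
Precautions
Checks on newly added nodes during scale-out can be run only as the root user. All other scenarios must be run as the omm user.
Either the -i or the -e parameter must be specified. -i checks the specified individual items; -e checks all the items configured for the specified scenario.
If the -i parameter contains no root-class check items, or the scenario configuration used by -e lists no root-class check items, you are not prompted for a user with root privileges and its password.
Use --skip-root-items to skip the root-class checks among the selected items, so that no root-privileged user and password need to be entered.
To check the consistency between newly added nodes and existing nodes, run gs_check on an existing node and specify the --hosts parameter. The hosts file must contain the IP addresses of the new nodes.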
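The rule about root credentials can be sketched in a few lines of Python. This is only an illustration of the decision described in the precautions, not gs_check's own logic. The helper name `needs_root_prompt` is hypothetical, and `ROOT_ITEMS` is a subset of the root-class items shown in the sample `gs_check -e inspect` output in these notes, so it may be incomplete for any given version.

```python
# A subset of the root-class items named in the sample "gs_check -e inspect"
# output in these notes. Illustrative only, not an authoritative list.
ROOT_ITEMS = frozenset({
    "CheckBlockdev", "CheckIOrequestqueue", "CheckIOConfigure",
    "CheckFirewall", "CheckSshdService", "CheckSshdConfig",
    "CheckCrondService", "CheckBootItems", "CheckFilehandle",
    "CheckNICModel", "CheckDropCache",
})


def needs_root_prompt(items, skip_root_items=False, root_items=ROOT_ITEMS):
    """Return True if the selection would require root credentials.

    Hypothetical helper: root user/password is only needed when at least
    one selected item is root-class and root items are not being skipped.
    """
    if skip_root_items:
        return False
    return any(item in root_items for item in items)


print(needs_root_prompt(["CheckCPU"]))                              # False
print(needs_root_prompt(["CheckCPU", "CheckFirewall"]))             # True
print(needs_root_prompt(["CheckFirewall"], skip_root_items=True))   # False
```

A wrapper script could use such a predicate to decide up front whether an unattended run is possible or whether `--skip-root-items` should be added.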
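135.2 Procedure
Method 1:
1. Log in to the primary database node as the OS user omm.
2. Run the following command to check the openGauss database status:
gs_check -i CheckClusterState
Here -i specifies the check items, which are case sensitive. Format: -i CheckClusterState, -i CheckCPU, or -i CheckClusterState,CheckCPU.
Valid values are the names of all supported check items. For the full list, see "Server Tools > gs_checkos > openGauss Status Check Table" in the Tool and Command Reference. Users can also write new check items as needed.
Method 2:
1. Log in to the primary database node as the OS user omm.
2. Run the following command to perform a health check on the openGauss database:
gs_check -e inspect
Here -e specifies the scenario name, which is case sensitive. Format: -e inspect or -e upgrade.
Valid values are the names of all supported inspection scenarios. The default list includes inspect (routine inspection), upgrade (pre-upgrade inspection), install (installation), binary_upgrade (inspection before an in-place upgrade), health (health check inspection), slow_node (node), and longtime (long-running inspection). Users can also write their own scenarios as needed.
The main purpose of an openGauss inspection is to check whether the overall openGauss status is normal while it is running, or to confirm before a major operation (upgrade or scale-out) that openGauss meets the environment and status conditions the operation requires. For the detailed inspection items and scenarios, see "Server Tools > gs_checkos > openGauss Status Check Table" in the Tool and Command Reference.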
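The two invocation styles can be captured in a small argument builder. The following is a minimal sketch, assuming that several items are passed to -i as one comma-separated value and that scenario names are matched case-sensitively. The function name `build_gs_check_argv` and the `known_scenarios` parameter are hypothetical; only the `-i`, `-e`, and `--skip-root-items` options come from these notes.

```python
# Default scenario names as listed in these notes. Custom scenarios can be
# allowed by passing a larger set as known_scenarios.
DEFAULT_SCENARIOS = frozenset({
    "inspect", "upgrade", "install", "binary_upgrade",
    "health", "slow_node", "longtime",
})


def build_gs_check_argv(items=None, scenario=None, skip_root_items=False,
                        known_scenarios=DEFAULT_SCENARIOS):
    """Build an argument list for gs_check (hypothetical helper).

    Exactly one of items (-i) or scenario (-e) must be given. Names are
    passed through unchanged because they are case sensitive.
    """
    if bool(items) == bool(scenario):
        raise ValueError("specify exactly one of items (-i) or scenario (-e)")
    argv = ["gs_check"]
    if items:
        # Assumption: multiple items are joined with commas.
        argv += ["-i", ",".join(items)]
    else:
        if scenario not in known_scenarios:
            raise ValueError(f"unknown scenario: {scenario!r}")
        argv += ["-e", scenario]
    if skip_root_items:
        argv.append("--skip-root-items")
    return argv


print(build_gs_check_argv(items=["CheckClusterState", "CheckCPU"]))
print(build_gs_check_argv(scenario="inspect", skip_root_items=True))
```

The resulting list can be handed to `subprocess.run` on a host where gs_check is installed and the caller is logged in as omm. Building a list instead of a shell string avoids quoting problems.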
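gs_check ends each run with a summary line such as `Total:57 Success:50 Warning:5 NG:2`, as in the sample output in this section. When an inspection gates a major operation, that line can be parsed to decide whether to continue. The sketch below assumes the summary format shown in these notes. The names `parse_summary` and `safe_to_proceed` are hypothetical, and treating warnings as acceptable by default is a policy choice, not something gs_check prescribes.

```python
import re

# Matches counters such as "Total:57" or "NG:2" in the summary line.
SUMMARY_RE = re.compile(r"(Total|Success|Warning|NG|Failed):(\d+)")


def parse_summary(line):
    """Extract the counters from a gs_check summary line into a dict."""
    return {key: int(value) for key, value in SUMMARY_RE.findall(line)}


def safe_to_proceed(line, allow_warnings=True):
    """Return True if the summary reports no NG/Failed items.

    Hypothetical gate: warnings are tolerated unless allow_warnings is False.
    """
    counts = parse_summary(line)
    if not counts:
        raise ValueError("no summary counters found")
    if counts.get("NG", 0) or counts.get("Failed", 0):
        return False
    if counts.get("Warning", 0) and not allow_warnings:
        return False
    return True


ok = "Success. All check items run completed. Total:1 Success:1 Failed:0"
bad = "Failed. All check items run completed. Total:57 Success:50 Warning:5 NG:2"
print(parse_summary(bad))    # {'Total': 57, 'Success': 50, 'Warning': 5, 'NG': 2}
print(safe_to_proceed(ok))   # True
print(safe_to_proceed(bad))  # False
```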
Examples
Result of a single-item check:
perfadm@lfgp000700749:/opt/huawei/perfadm/tool/script> gs_check -i CheckCPU
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:1 Nodes:3
Checking... [] 1/1
Start to analysis the check result
CheckCPU....................................OK
    The item run on 3 nodes. success: 3
Analysis the check result successfully
Success. All check items run completed. Total:1 Success:1 Failed:0
For more information please refer to /opt/huawei/wisequery/script/gspylib/inspection/output/CheckReport_201902193704661604.tar.gz

Result of a local check:
perfadm@lfgp000700749:/opt/huawei/perfadm/tool/script> gs_check -i CheckCPU -L
2017-12-29 17:09:29 [NAM] CheckCPU
2017-12-29 17:09:29 [STD] Check the host CPU usage. The item passes if idle is greater than 30% and iowait is less than 30%; otherwise it fails.
2017-12-29 17:09:29 [RST] OK
2017-12-29 17:09:29 [RAW]
Linux 4.4.21-69-default (lfgp000700749) 12/29/17 _x86_64_
17:09:24 CPU %user %nice %system %iowait %steal %idle
17:09:25 all 0.25 0.00 0.25 0.00 0.00 99.50
17:09:26 all 0.25 0.00 0.13 0.00 0.00 99.62
17:09:27 all 0.25 0.00 0.25 0.13 0.00 99.37
17:09:28 all 0.38 0.00 0.25 0.00 0.13 99.25
17:09:29 all 1.00 0.00 0.88 0.00 0.00 98.12
Average: all 0.43 0.00 0.35 0.03 0.03 99.17

Result of a scenario check:
[perfadm@SIA1000131072 Check]$ gs_check -e inspect
Parsing the check items config file successfully
The below items require root privileges to execute:[CheckBlockdev CheckIOrequestqueue CheckIOConfigure CheckCheckMultiQueue CheckFirewall CheckSshdService CheckSshdConfig CheckCrondService CheckBootItems CheckFilehandle CheckNICModel CheckDropCache]
Please enter root privileges user[root]:root
Please enter password for user[root]:
Please enter password for user[root] on the node[10.244.57.240]:
Check root password connection successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:57 Nodes:2
Checking... [ ] 21/57
Checking... [] 57/57
Start to analysis the check result
CheckClusterState...........................OK
    The item run on 2 nodes. success: 2
CheckDBParams...............................OK
    The item run on 1 nodes. success: 1
CheckDebugSwitch............................OK
    The item run on 2 nodes. success: 2
CheckDirPermissions.........................OK
    The item run on 2 nodes. success: 2
CheckReadonlyMode...........................OK
    The item run on 1 nodes. success: 1
CheckEnvProfile.............................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    GAUSSHOME /usr1/gaussdb/app
    LD_LIBRARY_PATH /usr1/gaussdb/app/lib
    PATH /usr1/gaussdb/app/bin
CheckBlockdev...............................OK
    The item run on 2 nodes. success: 2
CheckCurConnCount...........................OK
    The item run on 1 nodes. success: 1
CheckCursorNum..............................OK
    The item run on 1 nodes. success: 1
CheckPgxcgroup..............................OK
    The item run on 1 nodes. success: 1
CheckDiskFormat.............................OK
    The item run on 2 nodes. success: 2
CheckSpaceUsage.............................OK
    The item run on 2 nodes. success: 2
CheckInodeUsage.............................OK
    The item run on 2 nodes. success: 2
CheckSwapMemory.............................OK
    The item run on 2 nodes. success: 2
CheckLogicalBlock...........................OK
    The item run on 2 nodes. success: 2
CheckIOrequestqueue.....................WARNING
    The item run on 2 nodes. warning: 2
    The warning[host240,host157] value:
    On device (vdb) IO Request RealValue 256 ExpectedValue 32768
    On device (vda) IO Request RealValue 256 ExpectedValue 32768
CheckMaxAsyIOrequests.......................OK
    The item run on 2 nodes. success: 2
CheckIOConfigure............................OK
    The item run on 2 nodes. success: 2
CheckMTU....................................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    1500
CheckPing...................................OK
    The item run on 2 nodes. success: 2
CheckRXTX...................................NG
    The item run on 2 nodes. ng: 2
    The ng[host240,host157] value:
    NetWork[eth0]
    RX: 256
    TX: 256
CheckNetWorkDrop............................OK
    The item run on 2 nodes. success: 2
CheckMultiQueue.............................OK
    The item run on 2 nodes. success: 2
CheckEncoding...............................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    LANG=en_US.UTF-8
CheckFirewall...............................OK
    The item run on 2 nodes. success: 2
CheckKernelVer..............................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    3.10.0-957.el7.x86_64
CheckMaxHandle..............................OK
    The item run on 2 nodes. success: 2
CheckNTPD...................................OK
    host240: NTPD service is running, 2020-06-02 17:00:28
    host157: NTPD service is running, 2020-06-02 17:00:06
CheckOSVer..................................OK
    host240: The current OS is centos 7.6 64bit.
    host157: The current OS is centos 7.6 64bit.
CheckSysParams..........................WARNING
    The item run on 2 nodes. warning: 2
    The warning[host240,host157] value:
    Warning reason: variable net.ipv4.tcp_retries1 RealValue 3 ExpectedValue 5.
    Warning reason: variable net.ipv4.tcp_syn_retries RealValue 6 ExpectedValue 5.
CheckTHP....................................OK
    The item run on 2 nodes. success: 2
CheckTimeZone...............................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    +0800
CheckCPU....................................OK
    The item run on 2 nodes. success: 2
CheckSshdService............................OK
    The item run on 2 nodes. success: 2
    Warning reason: UseDNS parameter is not set; expected: no
CheckCrondService...........................OK
    The item run on 2 nodes. success: 2
CheckStack..................................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    8192
CheckSysPortRange...........................OK
    The item run on 2 nodes. success: 2
CheckMemInfo................................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    totalMem: 31.260929107666016G
CheckHyperThread............................OK
    The item run on 2 nodes. success: 2
CheckTableSpace.............................OK
    The item run on 1 nodes. success: 1
CheckSysadminUser...........................OK
    The item run on 1 nodes. success: 1
CheckGUCConsistent..........................OK
    All DN instance guc value is consistent.
CheckMaxProcMemory..........................OK
    The item run on 1 nodes. success: 1
CheckBootItems..............................OK
    The item run on 2 nodes. success: 2
CheckHashIndex..............................OK
    The item run on 1 nodes. success: 1
CheckPgxcRedistb............................OK
    The item run on 1 nodes. success: 1
CheckNodeGroupName..........................OK
    The item run on 1 nodes. success: 1
CheckTDDate.................................OK
    The item run on 1 nodes. success: 1
CheckDilateSysTab...........................OK
    The item run on 1 nodes. success: 1
CheckKeyProAdj..............................OK
    The item run on 2 nodes. success: 2
CheckProStartTime.......................WARNING
    host157:
    STARTED COMMAND
    Tue Jun 2 16:57:18 2020 /usr1/dmuser/dmserver/metricdb1/server/bin/gaussdb --single_node -D /usr1/dmuser/dmb1/data -p 22204
    Mon Jun 1 16:15:15 2020 /usr1/gaussdb/app/bin/gaussdb -D /usr1/gaussdb/data/dn1 -M standby
CheckFilehandle.............................OK
    The item run on 2 nodes. success: 2
CheckRouting................................OK
    The item run on 2 nodes. success: 2
CheckNICModel...............................OK
    The item run on 2 nodes. success: 2 (consistent)
    The success on all nodes value:
    version: 1.0.1
    model: Red Hat, Inc. Virtio network device
CheckDropCache..........................WARNING
    The item run on 2 nodes. warning: 2
    The warning[host240,host157] value:
    No DropCache process is running
CheckMpprcFile..............................NG
    The item run on 2 nodes. ng: 2
    The ng[host240,host157] value:
    There is no mpprc file
Analysis the check result successfully
Failed. All check items run completed. Total:57 Success:50 Warning:5 NG:2
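For more information please refer to /usr1/gaussdb/tool/script/gspylib/inspection/output/CheckReport_inspect611.tar.gz

135.3 Troubleshooting
If the check results are abnormal, fix the problems according to the following table.
Table 1 Checking the openGauss running status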
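| Check item | Abnormal status | Handling method |
| --- | --- | --- |
| CheckClusterState (checks the openGauss status) | openGauss or an openGauss instance is not started. | Start openGauss and its instances: `gs_om -t start` |
| | The openGauss status or an openGauss instance status is abnormal. | Check the status of each host and instance, and troubleshoot based on the status information: `gs_check -i CheckClusterState` |
| CheckDBParams (checks database parameters) | Database parameters are incorrect. | Use gs_guc to set the database parameters to the specified values. |
| CheckDebugSwitch (checks debug logs) | The log level is incorrect. | Use gs_guc to set log_min_messages to the specified value. |
| CheckDirPermissions (checks directory permissions) | Path permissions are incorrect. | Change the permissions of the directory to the specified value (750/700): `chmod 700 DIR` |
| CheckReadonlyMode (checks read-only mode) | Read-only mode is enabled. | Confirm that the usage of the disks holding the database nodes does not exceed the threshold (85% by default) and that no other O&M operation is in progress: `gs_check -i CheckDataDiskUsage`, `ps ux`. Then use gs_guc to disable the openGauss read-only mode: `gs_guc reload -N all -I all -c 'default_transaction_read_only = off'` |
| CheckEnvProfile (checks environment variables) | Environment variables are inconsistent. | Re-run the preinstallation to update the environment variables. |
| CheckBlockdev (checks disk read-ahead blocks) | The disk read-ahead block size is not 16384. | Use gs_checkos to set the read-ahead block size to 16384 KB and write the setting to the auto-start file: `gs_checkos -i B3` |
| CheckCursorNum (checks the number of cursors) | The cursor count check fails. | Check whether the database can be connected and whether the openGauss status is normal. |
| CheckPgxcgroup (checks the redistribution status) | pgxc_group contains tables whose redistribution is not complete. | Continue and complete the data redistribution for the scale-out or scale-in: gs_expand, gs_shrink |
| CheckDiskFormat (checks disk configuration) | Disk configuration is inconsistent across nodes. | Change the disk specifications of all nodes to be the same. |
| CheckSpaceUsage (checks disk space usage) | Available disk space is insufficient. | Clean up or expand the disk that holds the directory. |
| CheckInodeUsage (checks disk inode usage) | Available disk inodes are insufficient. | Clean up or expand the disk that holds the directory. |
| CheckSwapMemory (checks swap memory) | Swap memory is larger than physical memory. | Reduce or disable swap memory. |
| CheckLogicalBlock (checks disk logical blocks) | The disk logical block size is not 512. | Use gs_checkos to set the disk logical block size to 512 KB and write the setting to the boot auto-start file: `gs_checkos -i B4` |
| CheckIOrequestqueue (checks IO requests) | The IO request value is not 32768. | Use gs_checkos to set the IO request value to 32768 and write the setting to the boot auto-start file: `gs_checkos -i B4` |
| CheckCurConnCount (checks the current number of connections) | The current number of connections exceeds 90% of the maximum. | Disconnect unused connections to the primary database node. |
| CheckMaxAsyIOrequests (checks the maximum number of asynchronous requests) | The maximum asynchronous request value is less than 104857600 or less than the number of database instances on the current node multiplied by 1048576. | Use gs_checkos to set the maximum asynchronous request value to the larger of 104857600 and the number of database instances on the current node multiplied by 1048576: `gs_checkos -i B4` |
| CheckMTU (checks the MTU value) | MTU values are inconsistent. | Set the MTU of all nodes to the same value, 1500 or 8192: `ifconfig eth* MTU 1500` |
| CheckIOConfigure (checks the IO configuration) | The IO configuration is not deadline. | Use gs_checkos to set the IO configuration to deadline and write the setting to the boot auto-start file: `gs_checkos -i B4` |
| CheckRXTX (checks RX/TX values) | The NIC RX/TX value is not 4096. | Use gs_checkos to set the RX/TX value of the physical NIC used by openGauss to 4096: `gs_checkos -i B5` |
| CheckPing (checks network connectivity) | Some openGauss IP addresses cannot be pinged. | Check the network settings and status and the firewall status between the affected IP addresses. |
| CheckNetWorkDrop (checks the network packet loss rate) | The packet loss rate is higher than 1%. | Check the network load and status between the corresponding IP addresses. |
| CheckMultiQueue (checks NIC multi-queue) | NIC multi-queue is not enabled and NIC interrupts are not bound to different CPU cores. | Enable NIC multi-queue and bind the NIC queue interrupts to different CPU cores. |
| CheckEncoding (checks the encoding format) | Encoding formats are inconsistent across nodes. | Write the same encoding setting to /etc/profile on every node: `echo "export LANG=XXX" >> /etc/profile` |
| CheckFirewall (checks the firewall) | The firewall is not disabled. | Disable the firewall service: `systemctl disable firewalld.service`, `systemctl stop firewalld.service` |
| CheckMaxHandle (checks the maximum number of file handles) | The maximum number of file handles is less than 1000000. | Set the soft and hard limits for the maximum number of file handles in 91-nofile.conf/90-nofile.conf to 1000000: `gs_checkos -i B2` |
| CheckNTPD (checks the time synchronization service) | The NTPD service is not running or the time difference exceeds one minute. | Start the NTPD service and synchronize the clocks. |
| CheckSysParams (checks OS parameters) | OS parameter settings do not meet the requirements. | Set the parameters with gs_checkos or manually: `gs_checkos -i B1`, `vim /etc/sysctl.conf` |
| CheckTHP (checks the THP service) | The THP service is not enabled. | Use gs_checkos to set the THP service: `gs_checkos -i B6` |
| CheckTimeZone (checks the time zone) | Time zones are inconsistent. | Set all nodes to the same time zone: `cp /usr/share/zoneinfo/$primary_zone/$secondary_zone /etc/localtime` |
| CheckCPU (checks the CPU) | CPU usage or IO wait is too high. | Upgrade the CPU configuration or the disk performance. |
| CheckSshdService (checks the SSHD service) | The SSHD service is not running. | Start the SSHD service and write it to the boot auto-start file: `service sshd start`, `echo "service sshd start" >> initFile` |
| CheckSshdConfig (checks the SSHD configuration) | The SSHD service is configured incorrectly. | Set PasswordAuthentication=no; MaxStartups=1000; UseDNS=yes; ClientAliveInterval=10800/ClientAliveInterval=0 for the SSHD service and restart it: `service sshd start` |
| CheckCrondService (checks the Crond service) | The Crond service is not running. | Install and enable the Crond service. |
| CheckStack (checks the stack size) | The stack size is less than 3072. | Use gs_checkos to set it to 3072 and restart the processes whose stack value is too small: `gs_checkos -i B2` |
| CheckSysPortRange (checks the system port settings) | The system IP port range is not as expected, or the openGauss port falls within the system IP port range. | Set the system IP port range parameter to within 26000-65535 and set the openGauss port outside the system IP port range: `vim /etc/sysctl.conf` |
| CheckMemInfo (checks memory information) | Memory sizes are inconsistent across nodes. | Use physical memory of the same specification. |
| CheckHyperThread (checks hyper-threading) | CPU hyper-threading is not enabled. | Enable CPU hyper-threading. |
| CheckTableSpace (checks tablespaces) | A tablespace path is nested with the openGauss path, or tablespace paths are nested within each other. | Migrate the tablespace data to a tablespace with a valid path. |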
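When a scenario run reports several abnormal items, looking each one up in Table 1 by hand gets tedious. The following is a minimal sketch of automating that lookup, assuming each item result appears on its own line in the form `CheckName......STATUS` as in the gs_check output shown earlier. The function name `suggest_fixes` is hypothetical, and `REMEDIATION` holds only a handful of the gs_checkos commands from Table 1, not the whole table.

```python
import re

# One item result per line, e.g. "CheckRXTX......NG".
ITEM_RE = re.compile(r"^(Check\w+)\.+(OK|WARNING|NG)$")

# A few entries copied from Table 1; extend as needed.
REMEDIATION = {
    "CheckSysParams": "gs_checkos -i B1",
    "CheckMaxHandle": "gs_checkos -i B2",
    "CheckStack": "gs_checkos -i B2",
    "CheckBlockdev": "gs_checkos -i B3",
    "CheckIOrequestqueue": "gs_checkos -i B4",
    "CheckRXTX": "gs_checkos -i B5",
    "CheckTHP": "gs_checkos -i B6",
}


def suggest_fixes(report_text):
    """Return (item, status, command) for every item that is not OK.

    command is None when the item has no entry in REMEDIATION.
    """
    suggestions = []
    for line in report_text.splitlines():
        match = ITEM_RE.match(line.strip())
        if match and match.group(2) != "OK":
            name, status = match.groups()
            suggestions.append((name, status, REMEDIATION.get(name)))
    return suggestions


sample = """\
CheckClusterState...........................OK
CheckIOrequestqueue.....................WARNING
    The item run on 2 nodes. warning: 2
CheckRXTX...................................NG
CheckMpprcFile..............................NG
"""
for entry in suggest_fixes(sample):
    print(entry)
```

Items without a known command come back with `None`, which signals that the fix is a manual step or is not covered by the mapping.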