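Introduction
Starting in HyperFlex 4.0(2a), a new watchdog service monitors whether the ESXi and SCVM hostnames can be resolved. If HX cannot resolve a hostname or cannot reach a DNS server, the watchdog raises an alarm/event based on the show dns command. This document describes the workaround for CSCvt13947 (DNS alert from health monitoring: one or more DNS servers not responding, shown in HX Connect).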
Prerequisites
The prerequisite for encountering this issue is HyperFlex Data Platform 4.0(2a).
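Background Information
In this new framework, the ESXi hostnames and the SCVM hostnames must both be present in DNS, otherwise this event is triggered: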
{
"message": "HX Controller VM {HOSTNAME} one or more configured DNS servers not responding",
"type": "NODE",
"name": "DnsServerOfflineEvent",
"severity": "warning"
},
A corresponding alarm is also defined for DNS:
{
"name": "HXA-NOD-0009",
"description": "Triggered when one or more configured DNS servers on controller VM cannot be reached.",
"category": "warning",
"message": "One or more DNS servers configured on HX controller VM {HOSTNAME} not responding",
"triggeringEvents" : ["DnsServerOfflineEvent"],
"resetEvents" : ["DnsServerOnlineEvent"]
}
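Read together, the two definitions describe a raise/clear cycle: the alarm HXA-NOD-0009 is triggered by DnsServerOfflineEvent and reset by DnsServerOnlineEvent. The sketch below only models that relationship. The function name next_alarm_state and the state labels raised and cleared are hypothetical and are not part of HX Data Platform.

```shell
# Illustrative model of the alarm definition above, not HX code.
# Usage: next_alarm_state CURRENT_STATE EVENT_NAME
next_alarm_state() {
    local current=$1 event=$2
    case "$event" in
        DnsServerOfflineEvent) echo "raised" ;;    # listed under triggeringEvents
        DnsServerOnlineEvent)  echo "cleared" ;;   # listed under resetEvents
        *)                     echo "$current" ;;  # unrelated events leave the state alone
    esac
}
```

For example, next_alarm_state cleared DnsServerOfflineEvent prints raised, and a later DnsServerOnlineEvent returns the state to cleared.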
The following is an example of the fault as it appears in HX Connect:
The following is the corresponding show dns output:
root@SpringpathController3G4ZKOQ6SE:~# show dns
+-------------------------------------------+------------------+--------------+---------------------------+
| DNS Name | Resolved Address | status | error |
+-------------------------------------------+------------------+--------------+---------------------------+
| HX01.rchs.local | None | Not Resolved | No DNS servers configured |
| HX04.rchs.local | None | Not Resolved | No DNS servers configured |
| HX03.rchs.local | None | Not Resolved | No DNS servers configured |
| HX02.rchs.local | None | Not Resolved | No DNS servers configured |
| SpringpathController3G4ZKOQ6SE.rchs.local | None | Not Resolved | No DNS servers configured |
| SpringpathController5DCAL5X6C2.rchs.local | None | Not Resolved | No DNS servers configured |
| SpringpathControllerWZ2X6H20SF.rchs.local | None | Not Resolved | No DNS servers configured |
| SpringpathControllerGR57QZVDED.rchs.local | None | Not Resolved | No DNS servers configured |
+-------------------------------------------+------------------+--------------+---------------------------+
Name Servers: ['172.16.199.101'], Search Domains: - rchs.local
As shown, every hostname has a status of Not Resolved with the error No DNS servers configured. The DNS server in this output is 172.16.199.101.
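On a larger cluster it can help to pull out only the failing names. The following is a minimal sketch that assumes the show dns table layout shown above (pipe-separated columns with the status in the third column). The function name unresolved_names is hypothetical.

```shell
# Hypothetical helper: print every DNS name whose status column is not "Resolved".
# Reads show dns output on stdin; border and summary lines are ignored.
unresolved_names() {
    awk -F'|' '
        NF >= 5 {
            name = $2; status = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim padding around the cell
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (name != "DNS Name" && status != "Resolved") print name
        }
    '
}
```

Assuming show dns writes its table to standard output, it could be used as show dns | unresolved_names on a storage controller VM.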
Running nslookup confirms that the hostname SpringpathController3G4ZKOQ6SE cannot be resolved:
root@SpringpathController5DCAL5X6C2:~# nslookup SpringpathController3G4ZKOQ6SE
Server: 172.16.199.101
Address: 172.16.199.101#53
** server can't find SpringpathController3G4ZKOQ6SE: SERVFAIL
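The same check can be repeated for every ESXi and SCVM hostname in one pass. The sketch below is an assumption-laden helper, not a Cisco tool: check_names is a hypothetical name, and it relies on the resolver command returning a nonzero exit status when a lookup fails, which the nslookup manual page documents.

```shell
# Hypothetical helper: run a resolver command against each name and report the result.
# Usage: check_names RESOLVER_COMMAND NAME [NAME...]
# Returns 0 only if every name resolved.
check_names() {
    local resolver=$1 failed=0 name
    shift
    for name in "$@"; do
        if "$resolver" "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            failed=1
        fi
    done
    return "$failed"
}
```

With the names from the output above, a call could look like check_names nslookup HX01.rchs.local SpringpathController3G4ZKOQ6SE.rchs.local.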
After the hostnames listed by show dns are added to DNS, show dns displays the resolved addresses and the status changes to Resolved:
root@SpringpathController3G4ZKOQ6SE:~# show dns
+-------------------------------------------+------------------+--------------+---------------------------+
| DNS Name | Resolved Address | status | error |
+-------------------------------------------+------------------+--------------+---------------------------+
| HX01.rchs.local | 172.16.10.45 | Resolved | - |
| HX04.rchs.local | 172.16.10.48 | Resolved | - |
| HX03.rchs.local | 172.16.10.47 | Resolved | - |
| HX02.rchs.local | 172.16.10.46 | Resolved | - |
| SpringpathController3G4ZKOQ6SE.rchs.local | 172.16.10.41 | Resolved | - |
| SpringpathController5DCAL5X6C2.rchs.local | 172.16.10.44 | Resolved | - |
| SpringpathControllerWZ2X6H20SF.rchs.local | 172.16.10.43 | Resolved | - |
| SpringpathControllerGR57QZVDED.rchs.local | 172.16.10.42 | Resolved | - |
+-------------------------------------------+------------------+--------------+---------------------------+
Name Servers: ['172.16.199.101'], Search Domains: - rchs.local
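Workaround
The workaround is to disable the DNS monitoring feature. Run the following command on each storage controller VM: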
root@hx-02-scvm-01:~# grep -i "monitor_dns_servers" /opt/springpath/hx-diag-tools/watchdog_config.json && sed -i -e 's/"monitor_dns_servers": true/"monitor_dns_servers": false/' /opt/springpath/hx-diag-tools/watchdog_config.json && grep -i "monitor_dns_servers" /opt/springpath/hx-diag-tools/watchdog_config.json && restart watchdog
"monitor_dns_servers": true,
"monitor_dns_servers": false,
watchdog start/running, process 6350
root@hx-02-scvm-01:~#
This command sets "monitor_dns_servers" to false in /opt/springpath/hx-diag-tools/watchdog_config.json and restarts the watchdog service.
To revert the change, run the following command on each storage controller VM:
root@hx-02-scvm-01:~# grep -i "monitor_dns_servers" /opt/springpath/hx-diag-tools/watchdog_config.json && sed -i -e 's/"monitor_dns_servers": false/"monitor_dns_servers": true/' /opt/springpath/hx-diag-tools/watchdog_config.json && grep -i "monitor_dns_servers" /opt/springpath/hx-diag-tools/watchdog_config.json && restart watchdog
"monitor_dns_servers": false,
"monitor_dns_servers": true,
watchdog start/running, process 9473
root@hx-02-scvm-01:~#
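The disable and revert commands above differ only in the value they write, so the edit can be wrapped in one function. This is a minimal sketch rather than a Cisco-provided script: set_dns_monitoring is a hypothetical name, the .bak backup is an added precaution, and the function deliberately does not restart the watchdog, so restart watchdog still has to be run afterwards as in the commands above.

```shell
# Hypothetical helper: set "monitor_dns_servers" to true or false in a watchdog config file.
# Usage: set_dns_monitoring CONFIG_FILE true|false
set_dns_monitoring() {
    local file=$1 value=$2
    case "$value" in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 2 ;;
    esac
    # Refuse to touch a file that does not contain the key.
    grep -q '"monitor_dns_servers"' "$file" || { echo "key not found in $file" >&2; return 1; }
    # Keep a copy of the previous contents, then rewrite the file from that copy.
    cp -p "$file" "$file.bak" || return 1
    sed -e 's/\("monitor_dns_servers":[[:space:]]*\)true/\1'"$value"'/' \
        -e 's/\("monitor_dns_servers":[[:space:]]*\)false/\1'"$value"'/' \
        "$file.bak" > "$file" || return 1
    # Print the resulting line so the change can be confirmed.
    grep '"monitor_dns_servers"' "$file"
}
```

On a storage controller VM, the disable step would then be set_dns_monitoring /opt/springpath/hx-diag-tools/watchdog_config.json false followed by restart watchdog.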
In HyperFlex 4.0(2b), this feature is disabled by default. The recommendation is to leave it disabled until further notice.