Troubleshoot Sessmgr/Aaamgr in "Warn" or "Over" State


Updated: July 6, 2023

Document ID: 220556


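Introduction

This document describes how to troubleshoot a sessmgr or aaamgr that is in the "warn" or "over" state.

Overview

Session Manager (Sessmgr) - A subscriber processing system that supports multiple session types and is responsible for handling subscriber transactions. Sessmgr is typically paired with AAA Managers.

Authorization, Authentication, and Accounting Manager (Aaamgr) - Responsible for performing all AAA protocol operations and functions for subscribers and administrative users within the system.

Figure 1. StarOS Resource Distribution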

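Logs/Basic Checks

Basic Checks

To gather more details about the issue, verify this information with the user:

How long has the sessmgr/aaamgr been in the "warn" or "over" state?
How many sessmgr/aaamgr instances are affected by this issue?
Confirm whether the sessmgr/aaamgr is in the "warn" or "over" state because of memory or because of CPU.
Check whether there was a sudden increase in traffic, which can be assessed from the number of sessions per sessmgr.

With this information, you can better understand and address the issue at hand.

Logs

Obtain the Show Support Details (SSD) and the syslogs that cover the problematic timestamp. It is recommended to collect these logs starting at least 2 hours before the issue began, in order to identify the trigger point.
Capture core files for both a problematic and a non-problematic sessmgr/aaamgr. More details on this item are in the Analysis section.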

Analysis

Step 1. Check the status of the affected sessmgr/aaamgr with these commands.

show task resources ---------- to check details of a sessmgr/aaamgr in warn/over state; the same output also shows the current memory/CPU utilization

Output:

******** show task resources *******
Monday May 29 08:30:54 IST 2023
     task          cputime      memory         files      sessions
cpu facility inst used  alloc used alloc    used allc    used allc   S   status
----------------------- ----------- ------------- --------- ------------- ------
2/0 sessmgr  297  6.48% 100% 604.8M 900.0M  210 500     1651 12000   I     good
2/0 sessmgr  300  5.66% 100% 603.0M 900.0M  224 500     1652 12000   I     good
2/1 aaamgr   155  0.90% 95%  96.39M 260.0M   21 500     -- --        -     good
2/1 aaamgr   170  0.89% 95%  96.46M 260.0M   21 500     -- --        -     good

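Note: This command also shows the number of sessions per sessmgr, as seen in the command output.

Both of these commands help check the maximum memory usage since the node was reloaded: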
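
As an illustration of how this output can be screened offline, here is a minimal Python sketch that parses saved `show task resources` text, lists instances whose status is not "good", and reports the session count per sessmgr. It assumes every data row has the 13 whitespace-separated fields seen in the sample above; the field names and the function names are hypothetical and not part of StarOS.

```python
# Sketch only: assumes each data row has 13 whitespace-separated fields,
# exactly as in the sample "show task resources" output in this document.
SAMPLE = """\
2/0 sessmgr  297  6.48% 100% 604.8M 900.0M  210 500     1651 12000   I     good
2/0 sessmgr  300  5.66% 100% 603.0M 900.0M  224 500     1652 12000   I     good
2/1 aaamgr   155  0.90% 95%  96.39M 260.0M   21 500     -- --        -     good
2/1 aaamgr   170  0.89% 95%  96.46M 260.0M   21 500     -- --        -     good
"""

# Hypothetical field names, chosen to mirror the column headers.
FIELDS = ("cpu", "facility", "inst", "cpu_used", "cpu_alloc", "mem_used",
          "mem_alloc", "files_used", "files_alloc", "sess_used",
          "sess_alloc", "s", "status")


def parse_task_resources(text):
    """Return one dict per data row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a card/cpu pair such as "2/0".
        if len(parts) != len(FIELDS) or "/" not in parts[0]:
            continue
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def not_good(rows):
    """Rows whose status column is anything other than 'good'."""
    return [r for r in rows if r["status"] != "good"]


def sessions_per_sessmgr(rows):
    """Map sessmgr instance ID to its current session count."""
    return {int(r["inst"]): int(r["sess_used"])
            for r in rows
            if r["facility"] == "sessmgr" and r["sess_used"].isdigit()}


if __name__ == "__main__":
    rows = parse_task_resources(SAMPLE)
    print("not good:", [(r["facility"], r["inst"]) for r in not_good(rows)])
    print("sessions:", sessions_per_sessmgr(rows))
```

A large difference in session counts between sessmgr instances, or a sudden rise across all of them, is the kind of pattern the traffic check in the Basic Checks section looks for.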

show task resources max
show task memory max

******** show task memory max *******
Monday May 29 08:30:53 IST 2023
                   task   heap      physical            virtual
cpu facility       inst    max         max  alloc         max  alloc status
----------------------- ------ ------------------ ------------------ ------
2/0 sessmgr         902 548.6M  66% 602.6M 900.0M  29%  1.19G  4.00G   good
2/0 aaamgr          913 68.06M  38% 99.11M 260.0M  17% 713.0M  4.00G   good

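Note: The memory max command provides the maximum memory used since the node was reloaded. This helps identify any pattern related to the issue, for example whether the issue started after a recent reload, or, if the node was reloaded recently, what maximum memory values have been reached since then. The "show task resources" and "show task resources max" commands provide similar output; the difference is that the max command shows the maximum memory, CPU, and session values used by a particular sessmgr/aaamgr since the reload.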
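
To make the peak values easier to compare against their allocation, the following Python sketch converts the size strings in one `show task memory max` row and computes how close the physical-memory peak came to the allocated limit. It assumes that the "M" and "G" suffixes are binary units (1024-based) and that a row has the 11 fields seen in the sample; both are assumptions, and the function names are hypothetical.

```python
# Sketch only. Assumption: K/M/G suffixes are 1024-based units.
UNIT = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(size):
    """Convert a string such as '602.6M' or '1.19G' to a byte count."""
    return float(size[:-1]) * UNIT[size[-1]]


def physical_headroom(row):
    """Summarize the physical-memory peak for one data row.

    Assumed field order: cpu, facility, inst, heap max, physical %,
    physical max, physical alloc, virtual %, virtual max, virtual alloc, status.
    """
    parts = row.split()
    phys_max = to_bytes(parts[5])
    phys_alloc = to_bytes(parts[6])
    return {
        "facility": parts[1],
        "inst": int(parts[2]),
        "peak_ratio": phys_max / phys_alloc,
        "headroom_mb": (phys_alloc - phys_max) / UNIT["M"],
    }


if __name__ == "__main__":
    row = "2/0 sessmgr 902 548.6M 66% 602.6M 900.0M 29% 1.19G 4.00G good"
    info = physical_headroom(row)
    print(f"{info['facility']} {info['inst']}: "
          f"peak {info['peak_ratio']:.1%} of alloc, "
          f"{info['headroom_mb']:.1f}M headroom")
```

A peak ratio that approaches 1.0 soon after a reload suggests the instance reaches its limit quickly, which is useful context when the case is raised with TAC.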

show subscriber summary apn <apn name> smgr-instance <instance ID> | grep Total

-------------- to check the number of subscribers for that particular APN in the sessmgr

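Action Plan

Scenario 1. Due to High Memory Utilization

1. Collect the SSD before you restart/kill the sessmgr instance.
2. Collect a core dump for any of the affected sessmgrs.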

task core facility sessmgr instance <instance-value>

3. In hidden mode, collect the heap outputs with these commands for the same affected sessmgr and aaamgr.

show session subsystem facility sessmgr instance <instance-value> debug-info verbose
show task resources facility sessmgr instance <instance-value>

Heap outputs:

show messenger proclet facility sessmgr instance <instance-value> heap depth 9
show messenger proclet facility sessmgr instance <instance-value> system heap depth 9
show messenger proclet facility sessmgr instance <instance-value> heap
show messenger proclet facility sessmgr instance <instance-value> system

show snx sessmgr instance <instance-value> memory ldbuf
show snx sessmgr instance <instance-value> memory mblk

4. Restart the sessmgr task with this command:

task kill facility sessmgr instance <instance-value>

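Note: If multiple sessmgrs are in the "warn" or "over" state, it is recommended to restart them at intervals of 2 to 5 minutes. Restart only 2 to 3 sessmgrs at first, then wait 10 to 15 minutes to observe whether they return to a normal state. This step helps assess the impact of the restart and monitor the recovery progress.

5. Check the status of the sessmgr.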

show task resources facility sessmgr instance <instance-value> -------- to check if sessmgr is back in good state

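6. Collect another SSD.

7. Collect the output of all the CLI commands mentioned in Step 3.

8. Collect a core dump for any healthy sessmgr instance with the command mentioned in Step 2.

Note: To obtain core files for both a problematic and a non-problematic facility, you have two options. First, you can collect the core file of the same sessmgr once it returns to normal after the restart. Alternatively, you can capture the core file from a different, healthy sessmgr. Both approaches provide valuable information for analysis and troubleshooting.

After you collect the heap outputs, contact Cisco TAC to find the exact heap consumption table.

From these heap outputs, check which functions use the most memory. Based on this, TAC investigates the intended purpose of those functions and determines whether their usage is consistent with increased traffic/transaction volume or with some other problematic cause.

The heap outputs can be sorted with the tool accessed through the link named Memory-CPU-data-sorting-tool.

Note: This tool offers multiple options. Select "Heap Consumption Table", upload the heap output there, and run the tool to get the output in a sorted format.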
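
The staggered restart described in the note under step 4 can be planned ahead of time. The Python sketch below only builds a schedule of time offsets and the matching `task kill` commands; it does not connect to or execute anything on the node. The default values (3 instances in the first batch, 5 minutes between restarts, 15 minutes of observation) are picked from the ranges in that note, and the function name is hypothetical.

```python
# Sketch only: builds a restart schedule; nothing is executed on the node.
def restart_plan(instances, first_batch=3, interval_min=5, observe_min=15):
    """Return (offset_in_minutes, command) pairs for a staggered restart.

    The first `first_batch` instances are spaced `interval_min` apart.
    After them, the plan pauses `observe_min` minutes so recovery can be
    checked, then continues with the remaining instances.
    """
    plan = []
    offset = 0
    for index, inst in enumerate(instances):
        if index > 0:
            # Longer pause right after the first batch, normal gap otherwise.
            offset += observe_min if index == first_batch else interval_min
        plan.append((offset, f"task kill facility sessmgr instance {inst}"))
    return plan


if __name__ == "__main__":
    # Hypothetical instance IDs of sessmgrs in warn/over state.
    for minutes, command in restart_plan([10, 11, 12, 13, 14]):
        print(f"T+{minutes:>3} min  {command}")
```

If the first batch does not return to a good state during the observation window, stop and review before any further restarts.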

Scenario 2. Due to High CPU Utilization

1. Collect the SSD before you restart or kill the sessmgr instance.
2. Collect a core dump for any of the affected sessmgrs.

task core facility sessmgr instance <instance-value>

3. In hidden mode, collect the output of these commands for the same affected sessmgr/aaamgr.

show session subsystem facility sessmgr instance <instance-value> debug-info verbose
show task resources facility sessmgr instance <instance-value>
show cpu table
show cpu utilization

show cpu info ------ Display detailed info of CPU.
show cpu info verbose ------ More detailed version of the above

Profiler output for CPU

This is the background CPU profiler. It shows which functions consume the most CPU time. This command requires the CLI test command password.

show profile facility <facility instance> instance <instance ID> depth 4
show profile facility <facility instance> active facility <facility instance> depth 8

4. Restart the sessmgr task with this command:

task kill facility sessmgr instance <instance-value>

5. Check the status of the sessmgr.

show task resources facility sessmgr instance <instance-value> -------- to check if sessmgr is back in good state

6. Collect another SSD.
7. Collect the output of all the CLI commands mentioned in Step 3.
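
When several sessmgr instances are affected, the step 3 command list for this scenario can be expanded per instance so that the same set is collected before and after the restart. This is a minimal Python sketch that only produces text to paste into a CLI session; the function name is hypothetical, and the profiler commands are left out because their placeholders depend on the facility being profiled.

```python
# Sketch only: expands the step 3 commands of Scenario 2 per instance.
PER_INSTANCE = [
    "show session subsystem facility sessmgr instance {inst} debug-info verbose",
    "show task resources facility sessmgr instance {inst}",
]

# Commands that do not take an instance ID are emitted once, at the end.
NODE_WIDE = [
    "show cpu table",
    "show cpu utilization",
    "show cpu info",
    "show cpu info verbose",
]


def collection_commands(instances):
    """Return the ordered list of CLI commands for the given instance IDs."""
    commands = [template.format(inst=inst)
                for inst in instances
                for template in PER_INSTANCE]
    return commands + NODE_WIDE


if __name__ == "__main__":
    # Hypothetical affected instance IDs.
    print("\n".join(collection_commands([297, 300])))
```

Saving the generated list with the case notes makes it straightforward to repeat the identical collection in step 7.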