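This article describes how to troubleshoot basic optimization.
The basic WAAS optimizations are TCP flow optimization (TFO), data redundancy elimination (DRE), and persistent Lempel-Ziv (LZ) compression.
The number of TCP connections, together with their status and disposition, can indicate the health of the WAAS system at a specific location. A healthy system shows a large number of connections, a significant percentage of which are closed normally. The show statistics tfo detail command indicates the volume, status, and disposition of connections between a particular WAAS device and other devices in the network.
You can view global TFO statistics by using the show statistics tfo detail command, as follows: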
WAE# show statistics tfo detail
  Total number of connections : 2852
  No. of active connections : 3 <-----Active connections
  No. of pending (to be accepted) connections : 0
  No. of bypass connections : 711
  No. of normal closed conns : 2702
  No. of reset connections : 147
    Socket write failure : 0
    Socket read failure : 0
    WAN socket close while waiting to write : 0
    AO socket close while waiting to write : 2
    WAN socket error close while waiting to read : 0
    AO socket error close while waiting to read : 64
    DRE decode failure : 0
    DRE encode failure : 0
    Connection init failure : 0
    WAN socket unexpected close while waiting to read : 32
    Exceeded maximum number of supported connections : 0
    Buffer allocation or manipulation failed : 0
    Peer received reset from end host : 49
    DRE connection state out of sync : 0
    Memory allocation failed for buffer heads : 0
    Unoptimized packet received on optimized side : 0
  Data buffer usages:
    Used size: 0 B, B-size: 0 B, B-num: 0
    Cloned size: 0 B, B-size: 0 B, B-num: 0
  Buffer Control:
    Encode size: 0 B, slow: 0, stop: 0
    Decode size: 0 B, slow: 0, stop: 0
  Scheduler:
    Queue Size: IO: 0, Semi-IO: 0, Non-IO: 0
    Total Jobs: IO: 1151608, Semi-IO: 5511278, Non-IO: 3690931
  Policy Engine Statistics
  -------------------------
  Session timeouts: 0, Total timeouts: 0
  Last keepalive received 00.5 Secs ago
  Last registration occurred 15:00:17:46.0 Days:Hours:Mins:Secs ago
  Hits: 7766, Update Released: 1088
  Active Connections: 3, Completed Connections: 7183
  Drops: 0
  Rejected Connection Counts Due To: (Total: 0)
    Not Registered : 0, Keepalive Timeout : 0
    No License : 0, Load Level : 0
    Connection Limit : 0, Rate Limit : 0 <-----Connection limit overload
    Minimum TFO : 0, Resource Manager : 0
    Global Config : 0, TFO Overload : 0
    Server-Side : 0, DM Deny : 0
    No DM Accept : 0
. . .
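The No. of active connections field reports the number of connections that are currently being optimized.
In the Policy Engine Statistics section of the output, the Rejected Connection Counts section shows the reasons why connections were rejected. The Connection Limit counter reports the number of times a connection was rejected because the maximum number of optimized connections was exceeded. If this number is high, look for overload conditions. For more information, see the article "Troubleshooting Overload Conditions".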
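The health check described above can be scripted once the command output has been captured as text. The following is a minimal sketch, not a Cisco tool: the helper names `counter` and `tfo_summary` are hypothetical, and the parsing assumes the "label : value" layout shown in the sample output. It reports the share of connections that closed normally, the share that were reset, and the Connection Limit rejection count.

```python
import re

# Excerpt of the sample "show statistics tfo detail" output from this article.
SAMPLE = """
Total number of connections : 2852
No. of active connections : 3
No. of bypass connections : 711
No. of normal closed conns : 2702
No. of reset connections : 147
Rejected Connection Counts Due To: (Total: 0)
Connection Limit : 0, Rate Limit : 0
"""


def counter(text: str, label: str) -> int:
    """Return the integer that follows 'label :' (first occurrence) in the text."""
    match = re.search(re.escape(label) + r"\s*:\s*(\d+)", text)
    if match is None:
        raise KeyError(f"counter not found: {label}")
    return int(match.group(1))


def tfo_summary(text: str) -> dict:
    """Summarize connection disposition from captured TFO statistics."""
    total = counter(text, "Total number of connections")
    closed = counter(text, "No. of normal closed conns")
    reset = counter(text, "No. of reset connections")
    return {
        # Fraction of all connections that ended with a normal close.
        "normal_close_share": closed / total if total else 0.0,
        # Fraction of all connections that ended with a reset.
        "reset_share": reset / total if total else 0.0,
        # A high value here points to an overload condition.
        "connection_limit_rejects": counter(text, "Connection Limit"),
    }


if __name__ == "__main__":
    for name, value in tfo_summary(SAMPLE).items():
        print(f"{name}: {value}")
```

What counts as a "high" Connection Limit value or a "low" normal-close share depends on the deployment, so the sketch reports the numbers and leaves the threshold to the reader.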
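In addition, for connections that other AOs push down because they cannot optimize the traffic, TFO optimization is handled by the generic AO, which is covered in the article "Troubleshooting the Generic AO".
You can view TFO connection statistics by using the show statistics connection command. For details on using this command, see the section "Checking the Optimized TCP Connections" in the Troubleshooting Overload Conditions article.
When application acceleration is expected but not observed, verify that the appropriate optimizations are being applied to the traffic and that the DRE cache is reducing the size of the optimized traffic as expected.
Policy engine maps for DRE and LZ optimization include the following:
Various conditions can cause DRE and/or LZ not to be applied to a connection, even though it is configured:
Note: In all of the above conditions, the show statistics connection command reports "TDL" acceleration for connections where this was the negotiated policy. The amount of DRE or LZ bypass traffic tells you whether DRE or LZ optimization was actually applied. Use the show statistics connection conn-id command, as described later, and look at the DRE encode numbers to see whether the DRE or LZ ratio is near 0% and most of the traffic is bypassed. The first three conditions are reported by the "Encode bypass due to" field; the last three conditions result from the traffic data pattern and are accounted for in the reported DRE and LZ ratios.
You can view the statistics for a specific connection to determine which basic optimizations are configured, negotiated with the peer, and applied by using the show statistics connection conn-id command. First, determine the connection ID for the connection by using the show statistics connection command, as follows: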
WAE#show stat conn
Current Active Optimized Flows: 1
  Current Active Optimized TCP Plus Flows: 0
  Current Active Optimized TCP Only Flows: 1
  Current Active Optimized TCP Preposition Flows: 0
Current Active Auto-Discovery Flows: 0
Current Reserved Flows: 10
Current Active Pass-Through Flows: 0
Historical Flows: 375
D:DRE,L:LZ,T:TCP Optimization RR:Total Reduction Ratio
A:AOIM,C:CIFS,E:EPM,G:GENERIC,H:HTTP,M:MAPI,N:NFS,S:SSL,V:VIDEO
ConnID  Source IP:Port     Dest IP:Port       PeerID             Accel  RR
343     10.10.10.10:3300   10.10.100.100:80   00:14:5e:84:24:5f  T      00.0%  <------
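The connection ID for each connection is listed at the end of the output. To view the statistics for a specific connection, use the show statistics connection conn-id command, as follows: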
WAE# sh stat connection conn-id 343
Connection Id: 343
  Peer Id: 00:14:5e:84:24:5f
  Connection Type: EXTERNAL CLIENT
  Start Time: Tue Jul 14 16:00:30 2009
  Source IP Address: 10.10.10.10
  Source Port Number: 3300
  Destination IP Address: 10.10.100.100
  Destination Port Number: 80
  Application Name: Web <-----Application name
  Classifier Name: HTTP <-----Classifier name
  Map Name: basic
  Directed Mode: FALSE
  Preposition Flow: FALSE
  Policy Details:
    Configured: TCP_OPTIMIZE + DRE + LZ <-----Configured policy
    Derived: TCP_OPTIMIZE + DRE + LZ
    Peer: TCP_OPTIMIZE + DRE + LZ
    Negotiated: TCP_OPTIMIZE + DRE + LZ <-----Policy negotiated with peer
    Applied: TCP_OPTIMIZE + DRE + LZ <-----Applied policy
. . .
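The Application Name and Classifier Name fields tell you the application and classifier applied to this connection.
The optimization policies are listed in the Policy Details section. If the Configured and Applied policies do not match, you configured one policy for this type of connection but a different policy was applied. This can happen when the peer is down, misconfigured, or overloaded. Check the peer WAE and its configuration.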
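The Configured versus Applied comparison can be automated on captured output. The sketch below is illustrative only: `policy_details` and `missing_optimizations` are hypothetical helper names, and the parsing assumes the Policy Details layout shown in the sample output above.

```python
import re

# Policy Details excerpt from the sample output in this article.
SAMPLE = """
Peer Id: 00:14:5e:84:24:5f
Policy Details:
  Configured: TCP_OPTIMIZE + DRE + LZ <-----Configured policy
  Derived: TCP_OPTIMIZE + DRE + LZ
  Peer: TCP_OPTIMIZE + DRE + LZ
  Negotiated: TCP_OPTIMIZE + DRE + LZ <-----Policy negotiated with peer
  Applied: TCP_OPTIMIZE + DRE + LZ <-----Applied policy
"""

POLICY_FIELDS = ("Configured", "Derived", "Peer", "Negotiated", "Applied")


def policy_details(text: str) -> dict:
    """Map each policy field to the set of optimizations listed for it."""
    policies = {}
    for name in POLICY_FIELDS:
        # Capture "A + B + C" and stop before any trailing annotation.
        match = re.search(
            rf"^\s*{name}:\s*([A-Z_]+(?:\s*\+\s*[A-Z_]+)*)", text, re.MULTILINE
        )
        if match:
            parts = match.group(1).split("+")
            policies[name] = frozenset(part.strip() for part in parts)
    return policies


def missing_optimizations(policies: dict) -> frozenset:
    """Return optimizations that were configured but not applied."""
    return policies["Configured"] - policies["Applied"]


if __name__ == "__main__":
    missing = missing_optimizations(policy_details(SAMPLE))
    if missing:
        print("Configured but not applied:", ", ".join(sorted(missing)))
    else:
        print("Applied policy matches the configured policy")
```

A non-empty result is a prompt to check the peer WAE and its configuration, as described above; the script does not identify the cause.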
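The following section of the output shows statistics related to DRE encoding and decoding, including the number of messages to which DRE was applied, to which LZ was applied, or that bypassed DRE and LZ: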
. . .
DRE: 353
Conn-ID: 353 10.10.10.10:3304 -- 10.10.100.100:139 Peer No: 0 Status: Active
------------------------------------------------------------------------------
Open at 07/14/2009 16:04:30, Still active
Encode:
  Overall: msg: 178, in: 36520 B, out: 8142 B, ratio: 77.71% <-----Overall compression
  DRE: msg: 1, in: 356 B, out: 379 B, ratio: 0.00% <-----DRE compression ratio
  DRE Bypass: msg: 178, in: 36164 B <-----DRE bypass
  LZ: msg: 178, in: 37869 B, out: 8142 B, ratio: 78.50% <-----LZ compression ratio
  LZ Bypass: msg: 0, in: 0 B <-----LZ bypass
  Avg latency: 0.335 ms Delayed msg: 0 <-----Avg latency
  Encode th-put: 598 KB/s <-----In 4.3.3 and earlier only
  Message size distribution:
    0-1K=0% 1K-5K=0% 5K-15K=0% 15K-25K=0% 25K-40K=0% >40K=0% <-----In 4.3.3 and earlier only
Decode:
  Overall: msg: 14448, in: 5511 KB, out: 420 MB, ratio: 98.72% <-----Overall compression
  DRE: msg: 14372, in: 5344 KB, out: 419 MB, ratio: 98.76% <-----DRE compression ratio
  DRE Bypass: msg: 14548, in: 882 KB <-----DRE bypass
  LZ: msg: 14369, in: 4891 KB, out: 5691 KB, ratio: 14.07% <-----LZ compression ratio
  LZ Bypass: msg: 79, in: 620 KB <-----LZ bypass
  Avg latency: 4.291 ms <-----Avg latency
  Decode th-put: 6946 KB/s <-----In 4.3.3 and earlier only
  Message size distribution:
    0-1K=4% 1K-5K=12% 5K-15K=18% 15K-25K=9% 25K-40K=13% >40K=40% <-----Output from here in 4.3.3 and earlier only
. . .
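The following encode and decode statistics are highlighted in the above example:
If you see a large amount of DRE bypass traffic, the DRE compression ratio will be lower than expected. This can be caused by encrypted traffic, small messages, or otherwise incompressible data. Consider contacting TAC for further troubleshooting help.
If you see a large amount of LZ bypass traffic, it may be caused by a large amount of encrypted traffic, which is generally not compressible.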
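To judge how much traffic is bypassed, it helps to recompute the ratios from the byte counts. The sketch below uses the Encode numbers from the sample output. Two points are assumptions rather than documented behavior: that the reported ratio is 1 - out/in, floored at 0% (this reproduces the sample values), and that the bypass share is best measured against the sum of the DRE and DRE Bypass input bytes (on the encode side of the sample, 356 B + 36164 B equals the Overall input of 36520 B). The function names are hypothetical.

```python
def reduction_ratio(bytes_in: int, bytes_out: int) -> float:
    """Percent reduction, assumed to be 1 - out/in and floored at 0."""
    if bytes_in <= 0:
        return 0.0
    return max(0.0, 1.0 - bytes_out / bytes_in) * 100.0


def bypass_share(processed_in: int, bypass_in: int) -> float:
    """Percent of input bytes that skipped the optimization stage."""
    total = processed_in + bypass_in
    return bypass_in / total * 100.0 if total else 0.0


if __name__ == "__main__":
    # Encode-side byte counts from the sample output in this article.
    print(f"Overall ratio: {reduction_ratio(36520, 8142):.2f}%")
    print(f"DRE ratio:     {reduction_ratio(356, 379):.2f}%")
    print(f"LZ ratio:      {reduction_ratio(37869, 8142):.2f}%")
    print(f"DRE bypass:    {bypass_share(356, 36164):.2f}% of input bytes")
    print(f"LZ bypass:     {bypass_share(37869, 0):.2f}% of input bytes")
```

In this sample nearly all encode-side bytes bypass DRE while LZ still compresses well, which matches the pattern described above for small messages or data that DRE cannot match against its cache.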
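The average latency numbers are useful for debugging throughput issues. Depending on the platform, both the encode and decode average latency are typically in the single digits of milliseconds. If users experience low throughput and one or both of these numbers are higher, there is a problem with encoding or decoding, generally on the side with the higher latency.
It can be useful to view DRE statistics such as the oldest available data, cache size, percentage of cache used, and hash table RAM used by using the show statistics dre detail command, as follows: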
WAE# sh stat dre detail
Cache:
  Status: Usable, Oldest Data (age): 10h <-----Cache age
  Total usable disk size: 311295 MB, Used: 0.32% <-----Percent cache used
  Hash table RAM size: 1204 MB, Used: 0.00% <-----Output from here is in 4.3.3 and earlier only
. . .
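If you are not seeing significant DRE compression, the DRE cache may not yet be populated with enough data. A short cache age combined with less than 100% of the cache in use indicates this situation. The compression ratio should improve as the cache fills with more data. If 100% of the cache is in use and the cache age is short, the WAE may be undersized for the traffic volume.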
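The two cache conditions above can be expressed as a small decision function. This is a sketch under stated assumptions: the article does not define how short a "short" cache age is, so `short_age_hours` is an arbitrary illustrative parameter; the age parser handles only the forms that appear in the sample outputs ("10h", "14d23h"); and the function names are hypothetical.

```python
import re


def age_to_hours(age: str) -> int:
    """Convert an age such as '10h' or '14d23h' to hours."""
    match = re.fullmatch(r"(?:(\d+)d)?(?:(\d+)h)?", age.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized age format: {age!r}")
    days, hours = (int(group) if group else 0 for group in match.groups())
    return days * 24 + hours


def assess_cache(age_hours: int, used_percent: float,
                 short_age_hours: int = 72) -> str:
    """Classify the DRE cache state; short_age_hours is an assumed threshold."""
    if age_hours >= short_age_hours:
        return "cache age is not short"
    if used_percent >= 100.0:
        # Full cache that only holds recent data: possible sizing problem.
        return "possibly undersized"
    # Young cache with free space: compression should improve over time.
    return "still filling"


if __name__ == "__main__":
    # Values from the sample "show statistics dre detail" output.
    print(assess_cache(age_to_hours("10h"), 0.32))
```

Choose the threshold from the traffic profile of the site; the default used here carries no authority.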
If you are not seeing significant DRE compression, look at the Nack/R-tx counters in the following section of the command output:
Connection details:
  Chunks: encoded 398832, decoded 269475, anchor(forced) 43917(9407) <-----In 4.3.3 and earlier only
  Total number of processed messges: 28229 <-----In 4.3.3 and earlier only
  num_used_block per msg: 0.053597 <-----In 4.3.3 and earlier only
  Ack: msg 18088, size 92509 B <-----In 4.3.3 and earlier only
  Encode bypass due to: <-----Encode bypass reasons
    remote cache initialization: messages: 1, size: 120 B
    last partial chunk: chunks: 482, size: 97011 B
    skipped frame header: messages: 5692, size: 703 KB
  Nacks: total 0 <-----Nacks
  R-tx: total 0 <-----Retransmits
  Encode LZ latency: 0.133 ms per msg
  Decode LZ latency: 0.096 ms per msg
. . .
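The Nacks and R-tx counters should generally be low relative to the traffic volume, for example, about 1 per 100 MB of original (unoptimized) traffic. If you see significantly higher counts, there may be a DRE cache synchronization problem. Use the clear cache dre command to clear the DRE cache on all devices, or contact TAC.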
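The "about 1 per 100 MB" guideline can be turned into a quick calculation. The sketch below normalizes a counter to events per 100 MB of original traffic. It assumes 1 MB = 1,048,576 bytes, and the `factor` that stands in for "significantly higher" is an arbitrary illustrative choice; neither is specified by the article, and the function names are hypothetical.

```python
MB = 1024 * 1024  # assumed definition of a megabyte


def events_per_100mb(events: int, original_bytes: int) -> float:
    """Normalize a Nack or R-tx count to events per 100 MB of original traffic."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return events / (original_bytes / (100 * MB))


def may_be_out_of_sync(nacks: int, rtx: int, original_bytes: int,
                       factor: float = 10.0) -> bool:
    """True if either counter exceeds `factor` times the 1 per 100 MB guideline."""
    nack_rate = events_per_100mb(nacks, original_bytes)
    rtx_rate = events_per_100mb(rtx, original_bytes)
    return max(nack_rate, rtx_rate) > factor * 1.0


if __name__ == "__main__":
    # Hypothetical figures: 5 Nacks and 2 R-tx over 500 MB of original traffic.
    print(events_per_100mb(5, 500 * MB))
    print(may_be_out_of_sync(5, 2, 500 * MB))
```

A True result is only a hint that the counters deserve a closer look before clearing caches or contacting TAC.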
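The Encode bypass due to counters report the number of bytes bypassed for each reason. This helps you determine what is causing bypass traffic, other than a data pattern that cannot be optimized.
It is sometimes helpful to identify the connected and active peer WAEs and to look at peer statistics, which you can do by using the show statistics peer dre command, as follows: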
WAE# sh stat peer dre
Current number of connected peers: 1
Current number of active peers: 1
Current number of degrade peers: 0
Maximum number of connected peers: 1
Maximum number of active peers: 1
Maximum number of degraded peers: 0
Active peer details:
Peer-No : 0 Context: 65027
Peer-ID : 00:14:5e:95:4a:b5
Hostname: wae7.example.com <-----Peer hostname
------------------------------------------------------------------------------
Cache: Used disk: 544 MB, Age: 14d23h <-----Peer cache details in 4.3.3 and earlier only
Cache: Used disk: 544 MB <-----Peer cache details in 4.4.1 and later only
Peer version: 0.4 <-----
Ack-queue size: 38867 KB |
Buffer surge control: |<---In 4.3.3 and earlier only
  Delay: avg-size 0 B, conn: 0, flush: 0 |
  Agg-ft: avg-size 20902 B, conn: 388, flush: 0 |
  remote low-buff: 0, received flush: 0 <-----
Connections: Total (cumulative): 3226861, Active: 597
Concurrent Connections (Last 2 min): max 593, avg 575
. . .
Other output from this command shows encode and decode statistics similar to those for an individual connection.