Contents

Troubleshooting

This chapter describes troubleshooting information and contains the following section:

Installation and Connections


    Step 1   Connect the mesh access point that you want to be the RAP to the controller.
    Step 2   Deploy the radios (MAP) at the desired locations.
    Step 3   On the controller CLI, enter the show mesh ap summary command to see all MAPs and RAPs on the controller.
    Figure 1. Show Mesh AP Summary Page

    Step 4   On the controller GUI, click Wireless to see the mesh access point (RAP and MAP) summary.
    Figure 2. All APs Summary Page

    Step 5   Click AP Name to see the details page and then select the Interfaces tab to see the active radio interfaces.

    The radio slot in use, radio type, subband in use, and operational status (UP or DOWN) are summarized.

    • All APs supports 2 radio slots: slot 0—2.4 GHz and slot 1—5 GHz.

      If you have more than one controller connected to the same mesh network, then you must specify the name of the primary controller using global configuration for every mesh access point or specify the primary controller on every node, otherwise the least loaded controller is the preferred controller. If the mesh access points were previously connected to a controller, they already have learned a controller’s name.

      After configuring the controller name, the mesh access point reboots.

    Step 6   Click Wireless > AP Name to check the mesh access point’s primary controller on the AP details page.

    Debug Commands

    The following two commands are very helpful to see the messages being exchanged between mesh access points and the controller.

    
    (Cisco Controller) > debug capwap events enable
    (Cisco Controller) > debug disable-all
      
    

    You can use the debug command to see the flow of packet exchanges that occur between the mesh access point and the controller. The mesh access point initiates the discovery process. An exchange of credentials takes place during the join phase to authenticate that the mesh access point is allowed to join the mesh network.

    Upon a successful join completion, the mesh access point sends a CAPWAP configuration request. The controller responds with a configuration response. When a Configure Response is received from the controller, the mesh access point evaluates each configuration element and then implements them.

    Remote Debug Commands

    You can log on to the mesh access point console for debugging either through a direct connection to the AP console port or through the remote debug feature on the controller.

    To invoke remote debug on the controller, enter the following commands:

    
    (Cisco Controller) > debug ap enable ap-name
    (Cisco Controller) > debug ap command command ap-name
    

    AP Console Access

    AP1500s have a console port. A console cable is not shipped with the mesh access point. For the 1550 series access points, console ports are easily accessible and you need not open the access point box.

    The AP1500s have console access security embedded in the code to prevent unauthorized access on the console port and provide enhanced security.

    The login ID and password for console access are configured from the controller. You can use the following commands to push the username/password combination to the specified mesh access point or all access points:

    <Cisco Controller> config ap username cisco password cisco ?
    
    all											Configures the Username/Password for all connected APs.
    <Cisco AP>				Enter the name of the Cisco AP.
    
    
    <Cisco Controller> config ap username cisco password cisco all

    You must verify whether the username/password pushed from the controller is used as user-id and password on the mesh access point. It is a nonvolatile setting. Once set, a login ID and password are saved in the private configuration of the mesh access point.

    Once you have a successful login, the trap is sent to the Cisco Prime Infrastructure. If a user fails to log on three times consecutively, login failure traps are sent to the controller and Cisco Prime Infrastructure.


    Caution


    A mesh access point must be reset to the factory default settings before moving from one location to another.




    Cable Modem Serial Port Access From an AP

    Commands can be sent to the cable modem from the privileged mode of the CLI. Use the command to take a text string and send it to the cable modem UART interface. The cable modem interprets the text string as one of its own commands. The cable modem response is captured and displayed on the Cisco IOS console. Up to 9600 characters are displayed from the cable modem. Any text that is greater than 4800 characters is truncated.

    The modem commands are only operational on mesh APs that have devices connected to the UART port originally intended for the cable modem. If the commands are used on a mesh AP that does not have a cable modem (or any other device connected to the UART), the commands are accepted, however, but they do not produce any returned output. No errors are explicitly flagged.

    Configuration

    Enter the following command from the privileged mode of the MAP:

    
    AP#send cmodem timeout-value modem-command
      
    

    The modem command is any command or text to send to the cable modem. The range of timeout value is 1 to 300 seconds. However, if the captured data equals 9600 characters, any text beyond that is truncated and the response, irrespective of the timeout value and is immediately displayed on the AP console.

    Figure 3. Cable Modem Console Access Command

    Figure 4. Cable Modem Console Access Command


    Caution


    The question mark (?) and the exclamation point (!) should not be used in the send cmodem command. These characters have immediate interpreted use in the Cisco IOS CLI. Therefore, they cannot be sent to the modem.


    Enabling the Cable Modem Console Port
    By default, the Cable Modem console port is disabled. This is to prevent users from accessing the console through their residential cable modem. In the AP1572IC, AP1572EC, and AP1552C model, the cable modem console is connected directly to the access point. The console port is required for signaling between the AP and the cable modem. There are two methods to enable the cable modem console port, either through SNMP or by adding the command to the configuration .cm file on the CMTS.

    Note


    For the AP1572EC, AP1572IC, AP1552C, and AP1552CU, the cable modem must be enabled.


    • Enable the cable modem console port through SNMP by entering this command to the IP address of the cable modem:
      snmpset –c private IP_ADDRESS cmConsoleMode.0 i N
      
      
      Using the OID, enter this command:
      snmpset –c private IP_ADDRESS
      1.3.6.1.4.1.1429.77.1.4.7.0 i N
      
      
      Where IP_ADDRESS is any IPv4 address and N is an integer, 2 to enable read-write, 1 for read-only, or 0 to disable.
      Example:
      snmpset -c private 209.165.200.224 cmConsoleMode.0 i 2
      
      
    • Enable the cable modem console port through the configuration file. The configuration file (with a .cm extension) is loaded into the cable modem head end. It is pushed to the cable modem as part of the join process. Enter the following line to the cable modem configuration file:
      SA-CM-MIB::cmConsoleMode.0 = INTEGER: readWrite(2)
      
      
      Using the OID, enter this line:
      SA-CM-MIB::cmConsoleMode.0 = INTEGER: readWrite(2)
      
      

    Resetting the AP1572xC/AP1552C Through the Cable Modem

    An AP can be reset by entering an SNMP command to the Cable Modem, which resides inside the access point. For this feature to work, you must enable the cable modem console port.

    Reset the AP by entering this snmpset command:
    Snmpset -v2c -c public IP ADDRESS 1.3.6.1.4.1.1429.77.1.3.17.0 i 1
    
    
    Where the IP ADDRESS is the IPv4 address of the cable modem.

    Mesh Access Point CLI Commands

    You can enter these commands directly on the mesh access point using the AP console port or you can use the remote debug feature from the controller:





















    Mesh Access Point Debug Commands

    You can enter these commands directly on the mesh access point using the AP console port or you can use the remote debug feature from the controller.

    • debug mesh ethernet bridging—Debugs Ethernet bridging.

    • debug mesh ethernet config—Debugs access and trunk port configuration associated with VLAN tagging.

    • debug mesh ethernet registration—Debugs the VLAN registration protocol. This command is associated with VLAN tagging.

    • debug mesh forwarding table—Debugs the forwarding table containing bridge groups.

    • debugs mesh forwarding packet bridge-group—Debugs the bridge group configuration.

    Defining Mesh Access Point Role

    By default, AP1500s are shipped with a radio role set to MAP. You must reconfigure a mesh access point to act as a RAP.

    Backhaul Algorithm

    A backhaul is used to create only the wireless connection between mesh access points.

    The backhaul interface by default is 802.11a. You cannot change the backhaul interface to 802.11b/g.

    The "auto" data rate is selected by default for AP1500s.

    The backhaul algorithm has been designed to fight against stranded mesh access point conditions. This algorithm also adds a high-level of resiliency for each mesh node.

    The algorithm can be summarized as follows:

    • A MAP always sets the Ethernet port as the primary backhaul if it is UP; otherwise, it is the 802.11a radio (this feature gives the network administrator the ability to configure it as a RAP the first time and recover it in-house). For fast convergence of the network, we recommend that you do not connect any Ethernet device to the MAP for its initial joining to the mesh network.

    • A MAP failing to connect to a WLAN controller on an Ethernet port that is UP, sets the 802.11a radio as the primary backhaul. Failing to find a neighbor or failing to connect to a WLAN controller via any neighbor on the 802.11a radio causes the primary backhaul to be UP on the Ethernet port again. A MAP gives preference to the parent which has the same BGN.

    • A MAP connected to a controller over an Ethernet port does not build a mesh topology (unlike a RAP).

    • A RAP always sets the Ethernet port as the primary backhaul.

    • If the Ethernet port on a RAP is DOWN, or a RAP fails to connect to a controller on an Ethernet port that is UP, the 802.11a radio is set as the primary backhaul. Failing to find a neighbor or failing to connect to a controller via any neighbor on the 802.11a radio makes the RAP go to the SCAN state after 15 minutes and starts with the Ethernet port first.

    Keeping the roles of mesh nodes distinct using the above algorithm greatly helps to avoid a mesh access point from being in an unknown state and becoming stranded in a live network.

    Passive Beaconing (Anti-Stranding)

    When enabled, passive beaconing allows a stranded mesh access point to broadcast its debug messages over-the-air using a 802.11b/g radio. A neighboring mesh access point that is listening to the stranded mesh access point and has a connection to a controller, can pass those messages to the controller over CAPWAP. Passive beaconing prevents a mesh access point that has no wired connection from being stranded.

    Debug logs can also be sent as distress beacons on a nonbackhaul radio so that a neighboring mesh access point can be dedicated to listen for the beacons.

    The following steps are automatically initiated at the controller when a mesh access point loses its connection to the controller:

    • Identifies the MAC address of a stranded mesh access point

    • Finds a nearby neighbor that is CAPWAP connected

    • Sends commands through remote debug

    • Cycles channels to follow the mesh access point

    You only have to know the MAC address of the stranded AP to make use of this feature.

    A mesh access point is considered stranded if it goes through a lonely timer reboot. When the lonely timer reboot is triggered, the mesh access point, which is now stranded, enables passive beaconing, the anti-stranding feature.

    This feature can be divided into three parts:

    • Strand detection by stranded mesh access point

    • Beacons sent out by stranded mesh access point

      • Latch the 802.11b radio to a channel (1,6,11)

      • Enable debugs

      • Broadcast the standard debug messages as distress beacons

      • Send Latest Crash info file

    • Receive beacons (neighboring mesh access point with remote debugging enabled)

    Deployed mesh access points constantly look for stranded mesh access points. Periodically, mesh access points send a list of stranded mesh access points and SNR information to the controller. The controller maintains a list of the stranded mesh access points within its network.

    When the debug mesh astools troubleshoot mac-addr start command is entered, the controller runs through the list to find the MAC address of the stranded mesh access point.

    A message is sent to the best neighbor to start listening to the stranded access point. The listening mesh access point gets the distress beacons from the stranded mesh access point and sends it to the controller.

    Once a mesh access point takes the role of a listener, it does not purge the stranded mesh access point from its internal list until it stops listening to the stranded mesh access point. While a stranded mesh access point is being debugged, if a neighbor of that mesh access point reports a better SNR to the controller than the current listener by some percentage, then the listener of the stranded mesh access point is changed to the new listener (with better SNR) immediately.

    End-user commands are as follows:

    • config mesh astools [enable | disable]—Enables or disables the astools on the mesh access points. If disabled, APs no longer sends a stranded AP list to the controller.

    • show mesh astools stats—Shows the list of stranded APs and their listeners if they have any.

    • debug mesh astools troubleshoot mac-addr start—Sends a message to the best neighbor of the mac-addr to start listening.

    • debug mesh astools troubleshoot mac-addr stop—Sends a message to the best neighbor of the mac-addr to stop listening.

    • clear mesh stranded [all | mac of b/g radio]—Clears stranded AP entries.

    The controller console is swamped with debug messages from stranded APs for 30 minutes.

    Dynamic Frequency Selection

    Previously, devices employing radar operated in frequency subbands without other competing services. However, controlling regulatory bodies are attempting to open and share these bands with new services like wireless mesh LANs (IEEE 802.11).

    To protect existing radar services, the regulatory bodies require that devices wishing to share the newly opened frequency subband behave in accordance with the Dynamic Frequency Selection (DFS) protocol. DFS dictates that to be compliant, a radio device must be capable of detecting the presence of radar signals. When a radio detects a radar signal, it is required to stop transmitting to for at least 30 minutes to protect that service. The radio then selects a different channel to transmit on but only after monitoring it. If no radar is detected on the projected channel for at least one minute, then the new radio service device may begin transmissions on that channel.

    The AP performs a DFS scan on the new DFS channel for 60 seconds. However, if a neighboring AP is already using that new DFS channel, the AP does not perform the DFS scan.

    The process for a radio to detect and identify a radar signal is a complicated task that sometimes leads to incorrect detects. Incorrect radar detections can occur due to a large number of factors, including due to uncertainties of the RF environment and the ability of the access point to reliably detect actual on-channel radar.

    The 802.11h standard addresses DFS and Transmit Power Control (TPC) as it relates to the 5-GHz band. Use DFS to avoid interference with radar and TPC to avoid interference with satellite feeder links.


    Note


    DFS is mandatory in the USA for 5250 to 5350 and 5470 to 5725 frequency bands. DFS and TPC are mandatory for these same bands in Europe.


    Figure 5. DFS and TPC Band Requirements

    DFS in RAP

    The RAP performs the following steps as a response to radar detection:

    1. The RAP sends a message to the controller that the channel is infected with radar. The channel is marked as infected on the RAP and on the controller.

    2. The RAP blocks the channel for 30 minutes. This 30-minute period is called the nonoccupancy period.

    3. The controller sends a TRAP, which indicates that the radar has been detected on the channel. A TRAP remains until the nonoccupancy period expires.

    4. The RAP has 10 seconds to move away from the channel. This period is called the channel move time, which is defined as the time for the system to clear the channel and is measured from the end of the radar burst to the end of the final transmission on the channel.

    5. The RAP enters the quiet mode. In the quiet mode, the RAP stops data transmissions. Beacons are still generated and probe responses are still delivered. The quiet mode exists until the channel move time is over (10 seconds).

    6. The controller picks up a new random channel and sends the channel information to the RAP.

    7. The RAP receives the new channel information and sends channel change frames (unicast, encrypted) to the MAP, and each MAP sends the same information to its lower children down the sector. Each mesh access point sends the channel change frames once every 100 msecs for a total of five times.

    8. The RAP tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. The RAP keeps scanning the new channel for any radar presence for 60 seconds. This process is called channel availability check (CAC).

    9. The MAP tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. The MAP keeps scanning the new channel for any radar presence for 60 seconds.

    10. If radar is not detected, the RAP resumes full functionality on this new channel and the whole sector tunes to this new channel.

    DFS in MAP

    The MAP performs the following steps as a response to radar detection:

    1. The MAP sends a radar seen indication to the parent and ultimately to the RAP indicating that the channel is infected. The RAP sends this message to the controller. The message appears to be coming from the RAP. The MAP, RAP, and controller mark the channel as infected for 30 minutes.

    2. The MAP blocks the channel for 30 minutes. This 30-minute period is called the nonoccupancy period.

    3. The controller sends a TRAP, which indicates that the radar has been detected on the channel. The TRAP remains until the nonoccupancy period expires.

    4. The MAP has 10 seconds to move away from the channel. This is called the channel move time, which is defined as the time for the system to clear the channel and is measured from the end of the radar burst to the end of the final transmission on the channel.

    5. The MAP enters the quiet mode. In the quiet mode, the MAP stops data transmissions. Beacons are still generated and probe responses are still delivered. The quiet mode exists until the channel move time is over (10 seconds).

    6. The controller picks up a new random channel and sends the channel to the RAP.

    7. The RAP receives the new channel information and sends channel change frames (unicast, encrypted) to a MAP, and each MAP sends the same information to its lower children down the sector. Each mesh access point sends the channel change frames once every 100 msecs for a total of five times.

    8. Each mesh access point tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. There is no packet transmission. An AP keeps scanning the new channel for any radar presence for 60 seconds. This process is called the channel availability check (CAC). The MAP should not disconnect from the controller. The network should remain stable during this one-minute period.

    DFS functionality allows a MAP that detects a radar signal to transmit that up to the RAP, which then acts as if it has experienced radar and moves the sector. This process is called the coordinated channel change. This functionally can be turned on or off on the controller. The coordinated channel change is enabled by default.

    To enable DFS, enter the following command:

    
    (Cisco Controller) > config mesh full-sector-dfs enable
      
    

    To verify that DFS is enabled on the network, enter the following command:

    
    (Cisco Controller) > show network summary
    

    Note


    A MAP that detects radar should send a message to the RAP, unless the parent has a different BGN, in which case it does not send messages for a coordinated sector change. Instead, the MAP reenters the SCAN state and searches on nonradar seen channels for a new parent.



    Note


    Ensure that none of your mesh access points are using a default BGN.



    Note


    A repeated radar event on the MAP (radar triggers once, and then almost immediately again), causes the MAP to disconnect.


    Preparation in a DFS Environment

    This section describes how to prepare in a DFS environment:

    • To verify that your controller is set to the correct country domain, enter the following command:

      
      (Cisco Controller) > show country
        
      
    • To check the mesh access point country and the channel setting on the controller, enter the following command:

      
      (Cisco Controller)> show ap config 802.11a ap-name
        
      
    • To identify channels available for mesh, enter the following command:

      
      (Cisco Controller)> show ap config 802.11a ap-name
        
      

      Look for the allowed channel list.

      
      Allowed Channel List....................... 100,104,108,112,116,120,124, ......................................... 128,132,136,140
        
      
    • To identify channels available for mesh on the AP console (or use remote debug from the controller, enter the following command:

      
      ap1520-rap # show mesh channels
        
      HW: Dot11Radio1, Channels:
      100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140
        
      

      An asterisk next to a channel indicates that radar has been seen on the channel.

    • To invoke remote debug, enter the following commands:

      
      (Cisco Controller) > debug ap enable ap-name
      (Cisco Controller) > debug ap command command ap-name
        
      
    • Debug commands to see radar detection and past radar detections on the DFS channel are as follows:

      
      show mesh dfs channel channel-number
      show mesh dfs history
        
      

      Information similar to the following appears.

      ap1520-rap # show mesh dfs channel 132

      
      Channel 132 is available
      Time elapsed since radar last detected: 0 day(s), 7 hour(s), 6 minute(s), 51 second(s).
        
      

      The RAP should be run through the channels to determine whether there is active radar on each of the channels.

      
      ap1520-rap # show mesh dfs channel 132
        
      Radar detected on channel 132, channel becomes unusable (Time Elapsed: 0 day(s), 7 hour(s), 7 minute(s), 11 second(s)).
      Channel is set to 100 (Time Elapsed: 0 day(s), 7 hour(s), 7 minute(s), 11 second(s)).
      Radar detected on channel 116, channel becomes unusable (Time Elapsed: 0 day(s), 7 hour(s), 6 minute(s), 42 second(s)).
      Channel is set to 64 (Time Elapsed: 0 day(s), 7 hour(s), 6 minute(s), 42 second(s)).
      Channel 132 becomes usable (Time Elapsed: 0 day(s), 6 hour(s), 37 minute(s), 10 second(s)).
      Channel 116 becomes usable (Time Elapsed: 0 day(s), 6 hour(s), 36 minute(s), 42 second(s)).
      

    Monitoring DFS

    The DFS history should be run every morning or more frequently to detect the radar. This information does not get erased and is stored on the mesh access point flash. Therefore, you only need to match the times.

    
    ap1520-rap # show controller dot11Radio 1
      
    

    Information similar to the following appears:

    
    interface Dot11Radio1
    Radio Hammer 5, Base Address 001c.0e6c.9c00, BBlock version 0.00, Software version 0.05.30
    Serial number: FOC11174XCW
    Number of supported simultaneous BSSID on Dot11Radio1: 16
    Carrier Set: ETSI (OFDM) (EU) (-E)
    Uniform Spreading Required: Yes
    Current Frequency: 5540 MHz Channel 108 (DFS enabled)
    Allowed Frequencies: *5500(100) *5520(104) *5540(108) *5560(112) *5580(116) *560
    0(120) *5620(124) *5640(128) *5660(132) *5680(136) *5700(140)
    * = May only be selected by Dynamic Frequency Selection (DFS)
    Listen Frequencies: 5180(36) 5200(40) 5220(44) 5240(48) 5260(52) 5280(56) 5300(6
    0) 5320(64) 5500(100) 5520(104) 5540(108) 5560(112) 5580(116) 5660(132) 5680(136
    ) 5700(140) 5745(149) 5765(153) 5785(157) 5805(161) 5825(165) 4950(20) 4955(21)
    4960(22) 4965(23) 4970(24) 4975(25) 4980(26)
      
    

    Note


    An asterisk indicates that this channel has DFS enabled.


    Frequency Planning

    Use alternate adjacent channels in adjacent sectors. If you have two RAPs deployed at the same location, you must leave one channel in between.

    Weather radars operate within the 5600- to 5650-MHz band, which means that channels 124 and 128 might be affected, but also channels 120 and 132 might suffer from weather radar activity.

    If the mesh access point does detect radar, the controller and the mesh access point both will retain the channel as the configured channel. The controller retains it in volatile memory associated with the mesh access point, and the mesh access point has it stored in its flash as configuration. After the 30 minute quiet period, the controller returns the mesh access point to the static value, regardless of whether the mesh access point has been configured with a new channel or not. In order to overcome this, configure the mesh access point with a new channel, and reboot the mesh access point.

    Once radar is reliably detected on a channel, that channel, and the two surrounding channels, should be added to the RRM exclusion list, as follows:

    
    (Cisco Controller) > config advanced 802.11a channel delete channel
      
    

    A mesh access point goes to a new channel that is picked by RRM, and it does not consider excluded channels.

    If a radar is detected on channel 124, for instance, channels 120, 124, and 128 should be added to the exclusion list. In addition, do not configure RAP to operate on those channels.

    Good Signal-to-Noise Ratios

    For European installations, the minimum recommendation is increased to 20 dB of signal-to-noise ratio (SNR). The extra dBs are used to mitigate the effects of radar interference with packet reception, which is not observed in non-DFS environments.

    Access Point Placement

    Collocated mesh access points should have a minimum of 10 feet (3.048 meters) of vertical separation or 100 (30.48 meters) feet of horizontal separation.

    Check Packet Error Rate

    Mesh access points that have an high error rate, greater than 1 percent, should have mitigation applied to them, by changing the channels for noise and interference, adding additional mesh access points in the transmission path, moving the mesh access points to different sectors, or adding additional mesh access points.

    Bridge Group Name Misconfiguration

    A mesh access point can be wrongly provisioned with a bridgegroupname and placed in a group other than it was intended. Depending on the network design, this mesh access point might or might not be able to reach out and find its correct sector or tree. If it cannot reach a compatible sector, the mesh access point can become stranded.

    To recover a stranded mesh access point, the concept of default bridgegroupname has been introduced in the software. When a mesh access point is unable to connect to any other mesh access point with its configured bridgegroupname, it attempts to connect with the bridgegroupname of default.

    The algorithm of detecting this strand condition and recovery is as follows:

    1. Passively scans and finds all neighbor nodes regardless of their bridgegroupname.

    2. The mesh access point attempts to connect to the neighbors heard with my own bridgegroupname using AWPP.

    3. If Step 2 fails, attempts to connect with default bridgegroupname using AWPP.

    4. For each failed attempt in Step 3, it adds the neighbor to an exclusion list and attempts to connect the next best neighbor.

    5. If the AP fails to connect with all neighbors in Step 4, it reboots the mesh access point.

    6. If connected with a default bridgegroupname for 15 minutes, the mesh access point goes into a scan state.

    When an mesh access point is able to connect with the default bridgegroupname, the parent node reports the mesh access point as a default child/node/neighbor entry on the controller, so that a network administrator is Cisco Prime Infrastructure. Such a mesh access point behaves as a normal (nonmesh) access point and accepts any client, other mesh nodes as its children, and it passes any data traffic through.


    Note


    Do not confuse an unassigned BGN (null value) with DEFAULT, which is a mode that the access point uses to connect when it cannot find its own BGN.


    To check the current state of a mesh access point’s BGN, enter the following command:

    
    (Cisco Controller)> show mesh path Map3:5f:ff:60
    00:0B:85:5F:FA:60 state UPDATED NEIGH PARENT DEFAULT (106B), snrUp 48, snrDown 48, linkSnr 49
    00:0B:85:5F:FB:10 state UPDATED NEIGH PARENT BEACON (86B) snrUp 72, snrDown 63, linkSrn 57
    00:0B:85:5F:FA:60 is RAP
      
    

    To check the current state of a mesh access point’s BGN, check the neighbor information for the mesh access point (GUI) as follows:

    Choose Wireless > All APs > AP Name > Neighbor info .

    Figure 6. Neighbor Information for a Child

    Figure 7. Neighbor Information for a Parent

    Misconfiguration of the Mesh Access Point IP Address

    Although most Layer 3 networks are deployed using DHCP IP address management, some network administrators might prefer the manual IP address management and allocating IP addresses statically to each mesh node. Manual mesh access point IP address management can be a nightmare for large networks, but it might make sense in small to medium size networks (such as 10 to 100 mesh nodes) because the number of mesh nodes are relatively small compared to client hosts.

    Statically configuring the IP address on a mesh node has the possibility of putting a MAP on a wrong network, such as a subnet or VLAN. This mistake could prevent a mesh access point from successfully resolving the IP gateway and failing to discover a WLAN controller. In such a scenario, the mesh access point falls back to its DHCP mechanism and automatically attempts to find a DHCP server and obtains an IP address from it. This fallback mechanism prevents a mesh node from being potentially stranded from a wrongly configured static IP address and allows it to obtain a correct address from a DHCP server on the network.

    When you are manually allocating IP addresses, we recommend that you make IP addressing changes from the furthest mesh access point child first and then work your way back to the RAP. This recommendation also applies if you relocate equipment. For example, if you uninstall a mesh access point and redeploy it in another physical location of the mesh network that has a different addressed subnet.

    Another option is to take a controller in Layer 2 mode with a RAP to the location with the misconfigured MAP. Set the bridge group name on the RAP to match the MAP that needs the configuration change. Add the MAP’s MAC address to the controller. When the misconfigured MAP comes up in the mesh access point summary detail, configure it with an IP address.

    Misconfiguration of DHCP

    Despite the DHCP fallback mechanism, there is still a possibility that a mesh access point can become stranded, if any of the following conditions exist:

    • There is no DHCP server on the network.

    • There is a DHCP server on the network, but it does not offer an IP address to the AP, or if it gives a wrong IP address to the AP (for example, on a wrong VLAN or subnet).

    These conditions can strand a mesh access point that is configured with or without a wrong static IP address or with DHCP. Therefore, you must ensure that when a mesh access point is unable to connect after exhausting all DHCP discovery attempts or DHCP retry counts or IP gateway resolution retry counts, it attempts to find a controller in Layer 2 mode. In other words, a mesh access point attempts to discover a controller in Layer 3 mode first and in this mode, attempts with both static IP (if configured) or DHCP (if possible). The AP then attempts to discover a controller in Layer 2 mode. After finishing a number of Layer 3 and Layer 2 mode attempts, the mesh access point changes its parent node and re-attempts DHCP discovery. Additionally, the software exclusion-lists notes the parent node through which it was unable to obtain the correct IP address.

    Identifying the Node Exclusion Algorithm

    Depending on the mesh network design, a node might find another node “best” according to its routing metric (even recursively true), yet it is unable to provide the node with a connection to the correct controller or correct network. It is the typical honeypot access point scenario caused by either misplacement, provisioning, design of the network, or by the dynamic nature of an RF environment exhibiting conditions that optimize the AWPP routing metric for a particular link in a persistent or transient manner. Such conditions are generally difficult to recover from in most networks and could blackhole or sinkhole a node completely, taking it out from the network. Possible symptoms include, but are not limited to the following:

    • A node connects to the honeypot but cannot resolve the IP gateway when configured with the static IP address, or cannot obtain the correct IP address from the DHCP server, or cannot connect to a WLAN controller.

    • A node ping-pongs between a few honeypots or circles between many honeypots (in worst-case scenarios).

    Cisco mesh software resolves this difficult scenario by using a sophisticated node exclusion-listing algorithm. This node exclusion-listing algorithm uses an exponential backoff and advance technique much like the TCP sliding window or 802.11 MAC.

    The basic idea relies on the following five steps:

    1. Honeypot detection—The honeypots are first detected via the following steps:

      A parent node is set by the AWPP module by:

      • A static IP attempt in CAPWAP module.

      • A DHCP attempt in the DHCP module.

      • A CAPWAP attempt to find and connect to a controller fails.

    2. Honeypot conviction—When a honeypot is detected, it is placed in a exclusion-list database with its conviction period to remain on the list. The default is 32 minutes. Other nodes are then attempted as parents in the following order, falling back to the next, upon failing the current mechanism:

      • On the same channel.

      • Across different channels (first with its own bridgegroupname and then with default).

      • Another cycle, by clearing conviction of all current exclusion-list entries.

      • Rebooting the AP.

    3. Nonhoneypot credit—It is often possible that a node is not a really a honeypot, but appears to be due to some transient back-end condition, such as the following:

      • The DHCP server is either not up-and-running yet, has failed temporarily, or requires a reboot.

      • The WLAN controller is either not up-and-running yet, has failed temporarily, or requires a reboot.

      • The Ethernet cable on the RAP was accidentally disconnected.

        Such nonhoneypots must be credited properly from their serving times so that a node can come back to them as soon as possible.

    4. Honeypot expiration—Upon expiration, an exclusion-list node must be removed from the exclusion-list database and return to a normal state for future consideration by AWPP.

    5. Honeypot reporting—Honeypots are reported to the controller via an LWAPP mesh neighbor message to the controller, which shows these on the Bridging Information page. A message is also displayed the first-time an exclusion-listed neighbor is seen. In a subsequent software release, an SNMP trap is generated on the controller for this condition so that Cisco Prime Infrastructure can record the occurrence.

      Figure 8. Excluded Neighbor

    Because many nodes might be attempting to join or rejoin the network after an expected or unexpected event, a hold-off time of 16 minutes is implemented, which means that no nodes are exclusion-listed during this period of time after system initialization.

    This exponential backoff and advance algorithm is unique and has the following properties:

    • It allows a node to correctly identify the parent nodes whether it is a true honeypot or is just experiencing temporary outage conditions.

    • It credits the good parent nodes according to the time it has enabled a node to stay connected with the network. The crediting requires less and less time to bring the exclusion-list conviction period to be very low for real transient conditions and not so low for transient to moderate outages.

    • It has a built-in hysteresis for encountering the initial condition issue where many nodes try to discover each other only to find that those nodes are not really meant to be in the same network.

    • It has a built-in memory for nodes that can appear as neighbors sporadically so they are not accidentally considered as parents if they were, or are supposed to be, on the exclusion-list database.

    The node exclusion-listing algorithm guards the mesh network against serious stranding. It integrates into AWPP in such a way that a node can quickly reconverge and find the correct network.

    Throughput Analysis

    Throughput depends on packet error rate and hop count.

    Capacity and throughput are orthogonal concepts. Throughput is one user's experience at node N and the total area capacity is calculated over the entire sector of N-nodes and is based on the number of ingress and egress RAP, assuming separate noninterfering channels.

    For example, 4 RAPs at 10 Mbps each deliver 40 Mbps total capacity. So, one user at 2 hops out, logically under each RAP, could get 5 Mbps each of TPUT, but consume 40 Mbps of the backhaul capacity.

    With the Cisco Mesh solution, the per-hop latency is less than 10 msecs, and the typical latency numbers per hop range from 1 to 3 msecs. Overall jitter is also less than 3 msecs.

    Throughput depends on the type of traffic being passed through the network: User Datagram Protocol (UDP) or Transmission Control Protocol (TCP). UDP sends a packet over Ethernet with a source and destination address and a UDP protocol header. It does not expect an acknowledgement (ACK). There is no assurance that the packet is delivered at the application layer.

    TCP is similar to UDP but it is a reliable packet delivery mechanism. There are packet acknowledgments and a sliding window technique is used to allow the sender to transmit multiple packets before waiting for an ACK. There is a maximum amount of data the client transmits (called a TCP socket buffer window) before it stops sending data. Sequence numbers track packets sent and ensure that they arrive in the correct order. TCP uses cumulative ACKs and the receiver reports how much of the current stream has been received. An ACK might cover any number of packets, up to the TCP window size.

    TCP uses slow start and multiplicative decrease to respond to network congestion or packet loss. When a packet is lost, the TCP window is cut in half and the back-off retransmission timer is increased exponentially. Wireless is subject to packet loss due to interference issues and TCP reacts to this packet loss. A slow start recovery algorithm is also used to avoid swamping a connection when recovering from packet loss. The effect of these algorithms in a lossy network environment is to lessen the overall throughput of a traffic stream.

    By default, the maximum segment size (MSS) of TCP is 1460 bytes, which results in a 1500-byte IP datagram. TCP fragments any data packet that is larger than 1460 bytes, which can cause at least a 30-percent throughput drop. In addition, the controller encapsulates IP datagrams in the 48-byte CAPWAP tunnel header as shown in Figure 1. Any data packet that is longer than 1394 bytes is also fragmented by the controller, which results in up to a 15-percent throughput decrease.

    Figure 9. CAPWAP Tunneled Packets


    Troubleshooting

    Troubleshooting

    This chapter describes troubleshooting information and contains the following section:

    Installation and Connections


      Step 1   Connect the mesh access point that you want to be the RAP to the controller.
      Step 2   Deploy the radios (MAP) at the desired locations.
      Step 3   On the controller CLI, enter the show mesh ap summary command to see all MAPs and RAPs on the controller.
      Figure 1. Show Mesh AP Summary Page

      Step 4   On the controller GUI, click Wireless to see the mesh access point (RAP and MAP) summary.
      Figure 2. All APs Summary Page

      Step 5   Click AP Name to see the details page and then select the Interfaces tab to see the active radio interfaces.

      The radio slot in use, radio type, subband in use, and operational status (UP or DOWN) are summarized.

      • All APs supports 2 radio slots: slot 0—2.4 GHz and slot 1—5 GHz.

        If you have more than one controller connected to the same mesh network, then you must specify the name of the primary controller using global configuration for every mesh access point or specify the primary controller on every node, otherwise the least loaded controller is the preferred controller. If the mesh access points were previously connected to a controller, they already have learned a controller’s name.

        After configuring the controller name, the mesh access point reboots.

      Step 6   Click Wireless > AP Name to check the mesh access point’s primary controller on the AP details page.

      Debug Commands

      The following two commands are very helpful to see the messages being exchanged between mesh access points and the controller.

      
      (Cisco Controller) > debug capwap events enable
      (Cisco Controller) > debug disable-all
        
      

      You can use the debug command to see the flow of packet exchanges that occur between the mesh access point and the controller. The mesh access point initiates the discovery process. An exchange of credentials takes place during the join phase to authenticate that the mesh access point is allowed to join the mesh network.

      Upon a successful join completion, the mesh access point sends a CAPWAP configuration request. The controller responds with a configuration response. When a Configure Response is received from the controller, the mesh access point evaluates each configuration element and then implements them.

      Remote Debug Commands

      You can log on to the mesh access point console for debugging either through a direct connection to the AP console port or through the remote debug feature on the controller.

      To invoke remote debug on the controller, enter the following commands:

      
      (Cisco Controller) > debug ap enable ap-name
      (Cisco Controller) > debug ap command command ap-name
      

      AP Console Access

      AP1500s have a console port. A console cable is not shipped with the mesh access point. For the 1550 series access points, console ports are easily accessible and you need not open the access point box.

      The AP1500s have console access security embedded in the code to prevent unauthorized access on the console port and provide enhanced security.

      The login ID and password for console access are configured from the controller. You can use the following commands to push the username/password combination to the specified mesh access point or all access points:

      <Cisco Controller> config ap username cisco password cisco ?
      
      all											Configures the Username/Password for all connected APs.
      <Cisco AP>				Enter the name of the Cisco AP.
      
      
      <Cisco Controller> config ap username cisco password cisco all

      You must verify whether the username/password pushed from the controller is used as user-id and password on the mesh access point. It is a nonvolatile setting. Once set, a login ID and password are saved in the private configuration of the mesh access point.

      Once you have a successful login, the trap is sent to the Cisco Prime Infrastructure. If a user fails to log on three times consecutively, login failure traps are sent to the controller and Cisco Prime Infrastructure.


      Caution


      A mesh access point must be reset to the factory default settings before moving from one location to another.




      Cable Modem Serial Port Access From an AP

      Commands can be sent to the cable modem from the privileged mode of the CLI. Use the command to take a text string and send it to the cable modem UART interface. The cable modem interprets the text string as one of its own commands. The cable modem response is captured and displayed on the Cisco IOS console. Up to 9600 characters are displayed from the cable modem. Any text that is greater than 4800 characters is truncated.

      The modem commands are only operational on mesh APs that have devices connected to the UART port originally intended for the cable modem. If the commands are used on a mesh AP that does not have a cable modem (or any other device connected to the UART), the commands are accepted, however, but they do not produce any returned output. No errors are explicitly flagged.

      Configuration

      Enter the following command from the privileged mode of the MAP:

      
      AP#send cmodem timeout-value modem-command
        
      

      The modem command is any command or text to send to the cable modem. The range of timeout value is 1 to 300 seconds. However, if the captured data equals 9600 characters, any text beyond that is truncated and the response, irrespective of the timeout value and is immediately displayed on the AP console.

      Figure 3. Cable Modem Console Access Command

      Figure 4. Cable Modem Console Access Command


      Caution


      The question mark (?) and the exclamation point (!) should not be used in the send cmodem command. These characters have immediate interpreted use in the Cisco IOS CLI. Therefore, they cannot be sent to the modem.


      Enabling the Cable Modem Console Port
      By default, the Cable Modem console port is disabled. This is to prevent users from accessing the console through their residential cable modem. In the AP1572IC, AP1572EC, and AP1552C model, the cable modem console is connected directly to the access point. The console port is required for signaling between the AP and the cable modem. There are two methods to enable the cable modem console port, either through SNMP or by adding the command to the configuration .cm file on the CMTS.

      Note


      For the AP1572EC, AP1572IC, AP1552C, and AP1552CU, the cable modem must be enabled.


      • Enable the cable modem console port through SNMP by entering this command to the IP address of the cable modem:
        snmpset –c private IP_ADDRESS cmConsoleMode.0 i N
        
        
        Using the OID, enter this command:
        snmpset –c private IP_ADDRESS
        1.3.6.1.4.1.1429.77.1.4.7.0 i N
        
        
        Where IP_ADDRESS is any IPv4 address and N is an integer, 2 to enable read-write, 1 for read-only, or 0 to disable.
        Example:
        snmpset -c private 209.165.200.224 cmConsoleMode.0 i 2
        
        
      • Enable the cable modem console port through the configuration file. The configuration file (with a .cm extension) is loaded into the cable modem head end. It is pushed to the cable modem as part of the join process. Enter the following line to the cable modem configuration file:
        SA-CM-MIB::cmConsoleMode.0 = INTEGER: readWrite(2)
        
        
        Using the OID, enter this line:
        SA-CM-MIB::cmConsoleMode.0 = INTEGER: readWrite(2)
        
        

      Resetting the AP1572xC/AP1552C Through the Cable Modem

      An AP can be reset by entering an SNMP command to the Cable Modem, which resides inside the access point. For this feature to work, you must enable the cable modem console port.

      Reset the AP by entering this snmpset command:
      Snmpset -v2c -c public IP ADDRESS 1.3.6.1.4.1.1429.77.1.3.17.0 i 1
      
      
      Where the IP ADDRESS is the IPv4 address of the cable modem.

      Mesh Access Point CLI Commands

      You can enter these commands directly on the mesh access point using the AP console port or you can use the remote debug feature from the controller:





















      Mesh Access Point Debug Commands

      You can enter these commands directly on the mesh access point using the AP console port or you can use the remote debug feature from the controller.

      • debug mesh ethernet bridging—Debugs Ethernet bridging.

      • debug mesh ethernet config—Debugs access and trunk port configuration associated with VLAN tagging.

      • debug mesh ethernet registration—Debugs the VLAN registration protocol. This command is associated with VLAN tagging.

      • debug mesh forwarding table—Debugs the forwarding table containing bridge groups.

      • debugs mesh forwarding packet bridge-group—Debugs the bridge group configuration.

      Defining Mesh Access Point Role

      By default, AP1500s are shipped with a radio role set to MAP. You must reconfigure a mesh access point to act as a RAP.

      Backhaul Algorithm

      A backhaul is used to create only the wireless connection between mesh access points.

      The backhaul interface by default is 802.11a. You cannot change the backhaul interface to 802.11b/g.

      The "auto" data rate is selected by default for AP1500s.

      The backhaul algorithm has been designed to fight against stranded mesh access point conditions. This algorithm also adds a high-level of resiliency for each mesh node.

      The algorithm can be summarized as follows:

      • A MAP always sets the Ethernet port as the primary backhaul if it is UP; otherwise, it is the 802.11a radio (this feature gives the network administrator the ability to configure it as a RAP the first time and recover it in-house). For fast convergence of the network, we recommend that you do not connect any Ethernet device to the MAP for its initial joining to the mesh network.

      • A MAP failing to connect to a WLAN controller on an Ethernet port that is UP, sets the 802.11a radio as the primary backhaul. Failing to find a neighbor or failing to connect to a WLAN controller via any neighbor on the 802.11a radio causes the primary backhaul to be UP on the Ethernet port again. A MAP gives preference to the parent which has the same BGN.

      • A MAP connected to a controller over an Ethernet port does not build a mesh topology (unlike a RAP).

      • A RAP always sets the Ethernet port as the primary backhaul.

      • If the Ethernet port on a RAP is DOWN, or a RAP fails to connect to a controller on an Ethernet port that is UP, the 802.11a radio is set as the primary backhaul. Failing to find a neighbor or failing to connect to a controller via any neighbor on the 802.11a radio makes the RAP go to the SCAN state after 15 minutes and starts with the Ethernet port first.

      Keeping the roles of mesh nodes distinct using the above algorithm greatly helps to avoid a mesh access point from being in an unknown state and becoming stranded in a live network.

      Passive Beaconing (Anti-Stranding)

      When enabled, passive beaconing allows a stranded mesh access point to broadcast its debug messages over-the-air using a 802.11b/g radio. A neighboring mesh access point that is listening to the stranded mesh access point and has a connection to a controller, can pass those messages to the controller over CAPWAP. Passive beaconing prevents a mesh access point that has no wired connection from being stranded.

      Debug logs can also be sent as distress beacons on a nonbackhaul radio so that a neighboring mesh access point can be dedicated to listen for the beacons.

      The following steps are automatically initiated at the controller when a mesh access point loses its connection to the controller:

      • Identifies the MAC address of a stranded mesh access point

      • Finds a nearby neighbor that is CAPWAP connected

      • Sends commands through remote debug

      • Cycles channels to follow the mesh access point

      You only have to know the MAC address of the stranded AP to make use of this feature.

      A mesh access point is considered stranded if it goes through a lonely timer reboot. When the lonely timer reboot is triggered, the mesh access point, which is now stranded, enables passive beaconing, the anti-stranding feature.

      This feature can be divided into three parts:

      • Strand detection by stranded mesh access point

      • Beacons sent out by stranded mesh access point

        • Latch the 802.11b radio to a channel (1,6,11)

        • Enable debugs

        • Broadcast the standard debug messages as distress beacons

        • Send Latest Crash info file

      • Receive beacons (neighboring mesh access point with remote debugging enabled)

      Deployed mesh access points constantly look for stranded mesh access points. Periodically, mesh access points send a list of stranded mesh access points and SNR information to the controller. The controller maintains a list of the stranded mesh access points within its network.

      When the debug mesh astools troubleshoot mac-addr start command is entered, the controller runs through the list to find the MAC address of the stranded mesh access point.

      A message is sent to the best neighbor to start listening to the stranded access point. The listening mesh access point gets the distress beacons from the stranded mesh access point and sends it to the controller.

      Once a mesh access point takes the role of a listener, it does not purge the stranded mesh access point from its internal list until it stops listening to the stranded mesh access point. While a stranded mesh access point is being debugged, if a neighbor of that mesh access point reports a better SNR to the controller than the current listener by some percentage, then the listener of the stranded mesh access point is changed to the new listener (with better SNR) immediately.

      End-user commands are as follows:

      • config mesh astools [enable | disable]—Enables or disables the astools on the mesh access points. If disabled, APs no longer sends a stranded AP list to the controller.

      • show mesh astools stats—Shows the list of stranded APs and their listeners if they have any.

      • debug mesh astools troubleshoot mac-addr start—Sends a message to the best neighbor of the mac-addr to start listening.

      • debug mesh astools troubleshoot mac-addr stop—Sends a message to the best neighbor of the mac-addr to stop listening.

      • clear mesh stranded [all | mac of b/g radio]—Clears stranded AP entries.

      The controller console is swamped with debug messages from stranded APs for 30 minutes.

      Dynamic Frequency Selection

      Previously, devices employing radar operated in frequency subbands without other competing services. However, controlling regulatory bodies are attempting to open and share these bands with new services like wireless mesh LANs (IEEE 802.11).

      To protect existing radar services, the regulatory bodies require that devices wishing to share the newly opened frequency subband behave in accordance with the Dynamic Frequency Selection (DFS) protocol. DFS dictates that to be compliant, a radio device must be capable of detecting the presence of radar signals. When a radio detects a radar signal, it is required to stop transmitting to for at least 30 minutes to protect that service. The radio then selects a different channel to transmit on but only after monitoring it. If no radar is detected on the projected channel for at least one minute, then the new radio service device may begin transmissions on that channel.

      The AP performs a DFS scan on the new DFS channel for 60 seconds. However, if a neighboring AP is already using that new DFS channel, the AP does not perform the DFS scan.

      The process for a radio to detect and identify a radar signal is a complicated task that sometimes leads to incorrect detects. Incorrect radar detections can occur due to a large number of factors, including due to uncertainties of the RF environment and the ability of the access point to reliably detect actual on-channel radar.

      The 802.11h standard addresses DFS and Transmit Power Control (TPC) as it relates to the 5-GHz band. Use DFS to avoid interference with radar and TPC to avoid interference with satellite feeder links.


      Note


      DFS is mandatory in the USA for 5250 to 5350 and 5470 to 5725 frequency bands. DFS and TPC are mandatory for these same bands in Europe.


      Figure 5. DFS and TPC Band Requirements

      DFS in RAP

      The RAP performs the following steps as a response to radar detection:

      1. The RAP sends a message to the controller that the channel is infected with radar. The channel is marked as infected on the RAP and on the controller.

      2. The RAP blocks the channel for 30 minutes. This 30-minute period is called the nonoccupancy period.

      3. The controller sends a TRAP, which indicates that the radar has been detected on the channel. A TRAP remains until the nonoccupancy period expires.

      4. The RAP has 10 seconds to move away from the channel. This period is called the channel move time, which is defined as the time for the system to clear the channel and is measured from the end of the radar burst to the end of the final transmission on the channel.

      5. The RAP enters the quiet mode. In the quiet mode, the RAP stops data transmissions. Beacons are still generated and probe responses are still delivered. The quiet mode exists until the channel move time is over (10 seconds).

      6. The controller picks up a new random channel and sends the channel information to the RAP.

      7. The RAP receives the new channel information and sends channel change frames (unicast, encrypted) to the MAP, and each MAP sends the same information to its lower children down the sector. Each mesh access point sends the channel change frames once every 100 msecs for a total of five times.

      8. The RAP tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. The RAP keeps scanning the new channel for any radar presence for 60 seconds. This process is called channel availability check (CAC).

      9. The MAP tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. The MAP keeps scanning the new channel for any radar presence for 60 seconds.

      10. If radar is not detected, the RAP resumes full functionality on this new channel and the whole sector tunes to this new channel.

      DFS in MAP

      The MAP performs the following steps as a response to radar detection:

      1. The MAP sends a radar seen indication to the parent and ultimately to the RAP indicating that the channel is infected. The RAP sends this message to the controller. The message appears to be coming from the RAP. The MAP, RAP, and controller mark the channel as infected for 30 minutes.

      2. The MAP blocks the channel for 30 minutes. This 30-minute period is called the nonoccupancy period.

      3. The controller sends a TRAP, which indicates that the radar has been detected on the channel. The TRAP remains until the nonoccupancy period expires.

      4. The MAP has 10 seconds to move away from the channel. This is called the channel move time, which is defined as the time for the system to clear the channel and is measured from the end of the radar burst to the end of the final transmission on the channel.

      5. The MAP enters the quiet mode. In the quiet mode, the MAP stops data transmissions. Beacons are still generated and probe responses are still delivered. The quiet mode exists until the channel move time is over (10 seconds).

      6. The controller picks up a new random channel and sends the channel to the RAP.

      7. The RAP receives the new channel information and sends channel change frames (unicast, encrypted) to a MAP, and each MAP sends the same information to its lower children down the sector. Each mesh access point sends the channel change frames once every 100 msecs for a total of five times.

      8. Each mesh access point tunes to the new channel and enters into the silent mode. During the silent mode, only the receiver is ON. There is no packet transmission. An AP keeps scanning the new channel for any radar presence for 60 seconds. This process is called the channel availability check (CAC). The MAP should not disconnect from the controller. The network should remain stable during this one-minute period.

      DFS functionality allows a MAP that detects a radar signal to transmit that up to the RAP, which then acts as if it has experienced radar and moves the sector. This process is called the coordinated channel change. This functionally can be turned on or off on the controller. The coordinated channel change is enabled by default.

      To enable DFS, enter the following command:

      
      (Cisco Controller) > config mesh full-sector-dfs enable
        
      

      To verify that DFS is enabled on the network, enter the following command:

      
      (Cisco Controller) > show network summary
      

      Note


      A MAP that detects radar should send a message to the RAP, unless the parent has a different BGN, in which case it does not send messages for a coordinated sector change. Instead, the MAP reenters the SCAN state and searches on nonradar seen channels for a new parent.



      Note


      Ensure that none of your mesh access points are using a default BGN.



      Note


      A repeated radar event on the MAP (radar triggers once, and then almost immediately again), causes the MAP to disconnect.


      Preparation in a DFS Environment

      This section describes how to prepare in a DFS environment:

      • To verify that your controller is set to the correct country domain, enter the following command:

        
        (Cisco Controller) > show country
          
        
      • To check the mesh access point country and the channel setting on the controller, enter the following command:

        
        (Cisco Controller)> show ap config 802.11a ap-name
          
        
      • To identify channels available for mesh, enter the following command:

        
        (Cisco Controller)> show ap config 802.11a ap-name
          
        

        Look for the allowed channel list.

        
        Allowed Channel List....................... 100,104,108,112,116,120,124, ......................................... 128,132,136,140
          
        
      • To identify channels available for mesh on the AP console (or use remote debug from the controller, enter the following command:

        
        ap1520-rap # show mesh channels
          
        HW: Dot11Radio1, Channels:
        100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140
          
        

        An asterisk next to a channel indicates that radar has been seen on the channel.

      • To invoke remote debug, enter the following commands:

        
        (Cisco Controller) > debug ap enable ap-name
        (Cisco Controller) > debug ap command command ap-name
          
        
      • Debug commands to see radar detection and past radar detections on the DFS channel are as follows:

        
        show mesh dfs channel channel-number
        show mesh dfs history
          
        

        Information similar to the following appears.

        ap1520-rap # show mesh dfs channel 132

        
        Channel 132 is available
        Time elapsed since radar last detected: 0 day(s), 7 hour(s), 6 minute(s), 51 second(s).
          
        

        The RAP should be run through the channels to determine whether there is active radar on each of the channels.

        
        ap1520-rap # show mesh dfs channel 132
          
        Radar detected on channel 132, channel becomes unusable (Time Elapsed: 0 day(s), 7 hour(s), 7 minute(s), 11 second(s)).
        Channel is set to 100 (Time Elapsed: 0 day(s), 7 hour(s), 7 minute(s), 11 second(s)).
        Radar detected on channel 116, channel becomes unusable (Time Elapsed: 0 day(s), 7 hour(s), 6 minute(s), 42 second(s)).
        Channel is set to 64 (Time Elapsed: 0 day(s), 7 hour(s), 6 minute(s), 42 second(s)).
        Channel 132 becomes usable (Time Elapsed: 0 day(s), 6 hour(s), 37 minute(s), 10 second(s)).
        Channel 116 becomes usable (Time Elapsed: 0 day(s), 6 hour(s), 36 minute(s), 42 second(s)).
        

      Monitoring DFS

      The DFS history should be run every morning or more frequently to detect the radar. This information does not get erased and is stored on the mesh access point flash. Therefore, you only need to match the times.

      
      ap1520-rap # show controller dot11Radio 1
        
      

      Information similar to the following appears:

      
      interface Dot11Radio1
      Radio Hammer 5, Base Address 001c.0e6c.9c00, BBlock version 0.00, Software version 0.05.30
      Serial number: FOC11174XCW
      Number of supported simultaneous BSSID on Dot11Radio1: 16
      Carrier Set: ETSI (OFDM) (EU) (-E)
      Uniform Spreading Required: Yes
      Current Frequency: 5540 MHz Channel 108 (DFS enabled)
      Allowed Frequencies: *5500(100) *5520(104) *5540(108) *5560(112) *5580(116) *560
      0(120) *5620(124) *5640(128) *5660(132) *5680(136) *5700(140)
      * = May only be selected by Dynamic Frequency Selection (DFS)
      Listen Frequencies: 5180(36) 5200(40) 5220(44) 5240(48) 5260(52) 5280(56) 5300(6
      0) 5320(64) 5500(100) 5520(104) 5540(108) 5560(112) 5580(116) 5660(132) 5680(136
      ) 5700(140) 5745(149) 5765(153) 5785(157) 5805(161) 5825(165) 4950(20) 4955(21)
      4960(22) 4965(23) 4970(24) 4975(25) 4980(26)
        
      

      Note


      An asterisk indicates that this channel has DFS enabled.


      Frequency Planning

      Use alternate adjacent channels in adjacent sectors. If you have two RAPs deployed at the same location, you must leave one channel in between.

      Weather radars operate within the 5600- to 5650-MHz band, which means that channels 124 and 128 might be affected, but also channels 120 and 132 might suffer from weather radar activity.

      If the mesh access point does detect radar, the controller and the mesh access point both will retain the channel as the configured channel. The controller retains it in volatile memory associated with the mesh access point, and the mesh access point has it stored in its flash as configuration. After the 30 minute quiet period, the controller returns the mesh access point to the static value, regardless of whether the mesh access point has been configured with a new channel or not. In order to overcome this, configure the mesh access point with a new channel, and reboot the mesh access point.

      Once radar is reliably detected on a channel, that channel, and the two surrounding channels, should be added to the RRM exclusion list, as follows:

      
      (Cisco Controller) > config advanced 802.11a channel delete channel
        
      

      A mesh access point goes to a new channel that is picked by RRM, and it does not consider excluded channels.

      If a radar is detected on channel 124, for instance, channels 120, 124, and 128 should be added to the exclusion list. In addition, do not configure RAP to operate on those channels.

      Good Signal-to-Noise Ratios

      For European installations, the minimum recommendation is increased to 20 dB of signal-to-noise ratio (SNR). The extra dBs are used to mitigate the effects of radar interference with packet reception, which is not observed in non-DFS environments.

      Access Point Placement

      Collocated mesh access points should have a minimum of 10 feet (3.048 meters) of vertical separation or 100 (30.48 meters) feet of horizontal separation.

      Check Packet Error Rate

      Mesh access points that have an high error rate, greater than 1 percent, should have mitigation applied to them, by changing the channels for noise and interference, adding additional mesh access points in the transmission path, moving the mesh access points to different sectors, or adding additional mesh access points.

      Bridge Group Name Misconfiguration

      A mesh access point can be wrongly provisioned with a bridgegroupname and placed in a group other than it was intended. Depending on the network design, this mesh access point might or might not be able to reach out and find its correct sector or tree. If it cannot reach a compatible sector, the mesh access point can become stranded.

      To recover a stranded mesh access point, the concept of default bridgegroupname has been introduced in the software. When a mesh access point is unable to connect to any other mesh access point with its configured bridgegroupname, it attempts to connect with the bridgegroupname of default.

      The algorithm of detecting this strand condition and recovery is as follows:

      1. Passively scans and finds all neighbor nodes regardless of their bridgegroupname.

      2. The mesh access point attempts to connect to the neighbors heard with my own bridgegroupname using AWPP.

      3. If Step 2 fails, attempts to connect with default bridgegroupname using AWPP.

      4. For each failed attempt in Step 3, it adds the neighbor to an exclusion list and attempts to connect the next best neighbor.

      5. If the AP fails to connect with all neighbors in Step 4, it reboots the mesh access point.

      6. If connected with a default bridgegroupname for 15 minutes, the mesh access point goes into a scan state.

      When an mesh access point is able to connect with the default bridgegroupname, the parent node reports the mesh access point as a default child/node/neighbor entry on the controller, so that a network administrator is Cisco Prime Infrastructure. Such a mesh access point behaves as a normal (nonmesh) access point and accepts any client, other mesh nodes as its children, and it passes any data traffic through.


      Note


      Do not confuse an unassigned BGN (null value) with DEFAULT, which is a mode that the access point uses to connect when it cannot find its own BGN.


      To check the current state of a mesh access point’s BGN, enter the following command:

      
      (Cisco Controller)> show mesh path Map3:5f:ff:60
      00:0B:85:5F:FA:60 state UPDATED NEIGH PARENT DEFAULT (106B), snrUp 48, snrDown 48, linkSnr 49
      00:0B:85:5F:FB:10 state UPDATED NEIGH PARENT BEACON (86B) snrUp 72, snrDown 63, linkSrn 57
      00:0B:85:5F:FA:60 is RAP
        
      

      To check the current state of a mesh access point’s BGN, check the neighbor information for the mesh access point (GUI) as follows:

      Choose Wireless > All APs > AP Name > Neighbor info .

      Figure 6. Neighbor Information for a Child

      Figure 7. Neighbor Information for a Parent

      Misconfiguration of the Mesh Access Point IP Address

      Although most Layer 3 networks are deployed using DHCP IP address management, some network administrators might prefer the manual IP address management and allocating IP addresses statically to each mesh node. Manual mesh access point IP address management can be a nightmare for large networks, but it might make sense in small to medium size networks (such as 10 to 100 mesh nodes) because the number of mesh nodes are relatively small compared to client hosts.

      Statically configuring the IP address on a mesh node has the possibility of putting a MAP on a wrong network, such as a subnet or VLAN. This mistake could prevent a mesh access point from successfully resolving the IP gateway and failing to discover a WLAN controller. In such a scenario, the mesh access point falls back to its DHCP mechanism and automatically attempts to find a DHCP server and obtains an IP address from it. This fallback mechanism prevents a mesh node from being potentially stranded from a wrongly configured static IP address and allows it to obtain a correct address from a DHCP server on the network.

      When you are manually allocating IP addresses, we recommend that you make IP addressing changes from the furthest mesh access point child first and then work your way back to the RAP. This recommendation also applies if you relocate equipment. For example, if you uninstall a mesh access point and redeploy it in another physical location of the mesh network that has a different addressed subnet.

      Another option is to take a controller in Layer 2 mode with a RAP to the location with the misconfigured MAP. Set the bridge group name on the RAP to match the MAP that needs the configuration change. Add the MAP’s MAC address to the controller. When the misconfigured MAP comes up in the mesh access point summary detail, configure it with an IP address.

      Misconfiguration of DHCP

      Despite the DHCP fallback mechanism, there is still a possibility that a mesh access point can become stranded, if any of the following conditions exist:

      • There is no DHCP server on the network.

      • There is a DHCP server on the network, but it does not offer an IP address to the AP, or if it gives a wrong IP address to the AP (for example, on a wrong VLAN or subnet).

      These conditions can strand a mesh access point that is configured with or without a wrong static IP address or with DHCP. Therefore, you must ensure that when a mesh access point is unable to connect after exhausting all DHCP discovery attempts or DHCP retry counts or IP gateway resolution retry counts, it attempts to find a controller in Layer 2 mode. In other words, a mesh access point attempts to discover a controller in Layer 3 mode first and in this mode, attempts with both static IP (if configured) or DHCP (if possible). The AP then attempts to discover a controller in Layer 2 mode. After finishing a number of Layer 3 and Layer 2 mode attempts, the mesh access point changes its parent node and re-attempts DHCP discovery. Additionally, the software exclusion-lists notes the parent node through which it was unable to obtain the correct IP address.

      Identifying the Node Exclusion Algorithm

      Depending on the mesh network design, a node might find another node “best” according to its routing metric (even recursively true), yet it is unable to provide the node with a connection to the correct controller or correct network. It is the typical honeypot access point scenario caused by either misplacement, provisioning, design of the network, or by the dynamic nature of an RF environment exhibiting conditions that optimize the AWPP routing metric for a particular link in a persistent or transient manner. Such conditions are generally difficult to recover from in most networks and could blackhole or sinkhole a node completely, taking it out from the network. Possible symptoms include, but are not limited to the following:

      • A node connects to the honeypot but cannot resolve the IP gateway when configured with the static IP address, or cannot obtain the correct IP address from the DHCP server, or cannot connect to a WLAN controller.

      • A node ping-pongs between a few honeypots or circles between many honeypots (in worst-case scenarios).

      Cisco mesh software resolves this difficult scenario by using a sophisticated node exclusion-listing algorithm. This node exclusion-listing algorithm uses an exponential backoff and advance technique much like the TCP sliding window or 802.11 MAC.

      The basic idea relies on the following five steps:

      1. Honeypot detection—The honeypots are first detected via the following steps:

        A parent node is set by the AWPP module by:

        • A static IP attempt in CAPWAP module.

        • A DHCP attempt in the DHCP module.

        • A CAPWAP attempt to find and connect to a controller fails.

      2. Honeypot conviction—When a honeypot is detected, it is placed in a exclusion-list database with its conviction period to remain on the list. The default is 32 minutes. Other nodes are then attempted as parents in the following order, falling back to the next, upon failing the current mechanism:

        • On the same channel.

        • Across different channels (first with its own bridgegroupname and then with default).

        • Another cycle, by clearing conviction of all current exclusion-list entries.

        • Rebooting the AP.

      3. Nonhoneypot credit—It is often possible that a node is not a really a honeypot, but appears to be due to some transient back-end condition, such as the following:

        • The DHCP server is either not up-and-running yet, has failed temporarily, or requires a reboot.

        • The WLAN controller is either not up-and-running yet, has failed temporarily, or requires a reboot.

        • The Ethernet cable on the RAP was accidentally disconnected.

          Such nonhoneypots must be credited properly from their serving times so that a node can come back to them as soon as possible.

      4. Honeypot expiration—Upon expiration, an exclusion-list node must be removed from the exclusion-list database and return to a normal state for future consideration by AWPP.

      5. Honeypot reporting—Honeypots are reported to the controller via an LWAPP mesh neighbor message to the controller, which shows these on the Bridging Information page. A message is also displayed the first-time an exclusion-listed neighbor is seen. In a subsequent software release, an SNMP trap is generated on the controller for this condition so that Cisco Prime Infrastructure can record the occurrence.

        Figure 8. Excluded Neighbor

      Because many nodes might be attempting to join or rejoin the network after an expected or unexpected event, a hold-off time of 16 minutes is implemented, which means that no nodes are exclusion-listed during this period of time after system initialization.

      This exponential backoff and advance algorithm is unique and has the following properties:

      • It allows a node to correctly identify the parent nodes whether it is a true honeypot or is just experiencing temporary outage conditions.

      • It credits the good parent nodes according to the time it has enabled a node to stay connected with the network. The crediting requires less and less time to bring the exclusion-list conviction period to be very low for real transient conditions and not so low for transient to moderate outages.

      • It has a built-in hysteresis for encountering the initial condition issue where many nodes try to discover each other only to find that those nodes are not really meant to be in the same network.

      • It has a built-in memory for nodes that can appear as neighbors sporadically so they are not accidentally considered as parents if they were, or are supposed to be, on the exclusion-list database.

      The node exclusion-listing algorithm guards the mesh network against serious stranding. It integrates into AWPP in such a way that a node can quickly reconverge and find the correct network.

      Throughput Analysis

      Throughput depends on packet error rate and hop count.

      Capacity and throughput are orthogonal concepts. Throughput is one user's experience at node N and the total area capacity is calculated over the entire sector of N-nodes and is based on the number of ingress and egress RAP, assuming separate noninterfering channels.

      For example, 4 RAPs at 10 Mbps each deliver 40 Mbps total capacity. So, one user at 2 hops out, logically under each RAP, could get 5 Mbps each of TPUT, but consume 40 Mbps of the backhaul capacity.

      With the Cisco Mesh solution, the per-hop latency is less than 10 msecs, and the typical latency numbers per hop range from 1 to 3 msecs. Overall jitter is also less than 3 msecs.

      Throughput depends on the type of traffic being passed through the network: User Datagram Protocol (UDP) or Transmission Control Protocol (TCP). UDP sends a packet over Ethernet with a source and destination address and a UDP protocol header. It does not expect an acknowledgement (ACK). There is no assurance that the packet is delivered at the application layer.

      TCP is similar to UDP but it is a reliable packet delivery mechanism. There are packet acknowledgments and a sliding window technique is used to allow the sender to transmit multiple packets before waiting for an ACK. There is a maximum amount of data the client transmits (called a TCP socket buffer window) before it stops sending data. Sequence numbers track packets sent and ensure that they arrive in the correct order. TCP uses cumulative ACKs and the receiver reports how much of the current stream has been received. An ACK might cover any number of packets, up to the TCP window size.

      TCP uses slow start and multiplicative decrease to respond to network congestion or packet loss. When a packet is lost, the TCP window is cut in half and the back-off retransmission timer is increased exponentially. Wireless is subject to packet loss due to interference issues and TCP reacts to this packet loss. A slow start recovery algorithm is also used to avoid swamping a connection when recovering from packet loss. The effect of these algorithms in a lossy network environment is to lessen the overall throughput of a traffic stream.

      By default, the maximum segment size (MSS) of TCP is 1460 bytes, which results in a 1500-byte IP datagram. TCP fragments any data packet that is larger than 1460 bytes, which can cause at least a 30-percent throughput drop. In addition, the controller encapsulates IP datagrams in the 48-byte CAPWAP tunnel header as shown in Figure 1. Any data packet that is longer than 1394 bytes is also fragmented by the controller, which results in up to a 15-percent throughput decrease.

      Figure 9. CAPWAP Tunneled Packets