EttF High Availability Testing


High availability is critical to maximizing network and system uptime and, in turn, to meeting predefined SLAs. This section describes the validation methodology and the corresponding test results.

HA Test Methodology

Each layer in the EttF solution is verified for HA functionality and recovery times. Simulated Layer 3 traffic traverses the network end to end, from the cell/area zone through the manufacturing zone, and from the cell/area zone through the DMZ to the outside world. Various failures are triggered at each layer, and the convergence time is measured to characterize and quantify the impact on availability.

HA Test Topology

Figure D-1 shows the test topology.

Figure D-1 HA Test Topology

Layer 3 traffic flows from Tx1 to Tx2 and from Tx1 to Tx3. Tx2 and Tx3 inject 1000 simulated OSPF routes into the network, and Tx1 sends traffic to all of the simulated routes. The goal is to simulate TCP-based traffic as if it originated from an application server such as Historian.

HA Test Scenarios

Three test suites are explored to characterize different failure/recovery times at different layers in the EttF design. Various disruptions are initiated at the cell/area zone, manufacturing zone, and DMZ levels. With each failure, convergence time is measured using the following formula:

Convergence time (ms) = [(Tx - Rx) / PPS] * 1000

Where:

Tx = Packets transmitted

Rx = Packets received

PPS = Packet rate (10,000 packets per second)
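
The formula can be sketched in Python as follows. This is a minimal illustration, not part of the original test tooling: the function name and the packet counters in the example are hypothetical, with the counters chosen so that 19,882 packets are lost at 10,000 pps.

```python
def convergence_time_ms(tx_packets: int, rx_packets: int, pps: int = 10_000) -> float:
    """Return the convergence time in milliseconds implied by packet loss
    at a constant transmit rate: [(Tx - Rx) / PPS] * 1000."""
    if pps <= 0:
        raise ValueError("pps must be positive")
    if rx_packets > tx_packets:
        raise ValueError("received count cannot exceed transmitted count")
    lost = tx_packets - rx_packets
    # Multiply before dividing to limit floating-point rounding.
    return lost * 1000 / pps


# Hypothetical counters: 1,000,000 packets sent, 980,118 received.
print(convergence_time_ms(1_000_000, 980_118))
```

Because the rate is constant, every lost packet represents 0.1 ms of outage, so the measurement resolution of this method is 0.1 ms.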

Test Suite 1—HA in the Cell/Area Zone (Tx1 -> Tx2)

Use Case 1—Fail master stack with stack-mac persistent enabled

Use Case 2—Fail slave stack with stack-mac persistent enabled

Use Case 3—Fail master stack with HSRP subsecond timers configured

Use Case 4—Fail slave stack with HSRP subsecond timers configured


Note: HSRP with only one logical router was selected so that the end nodes would have a virtual IP address (VIP) and a virtual MAC address that remain persistent. However, the results indicate that enabling the stack-mac persistent 0 feature is the better solution.


Suite 1 Test Results

The Suite 1 test results (all in milliseconds) are provided in the following tables.

Table D-1 Use Case 1—Fail Master Stack with stack-mac persistent Enabled

Run #    Tx1 -> Tx2 (ms)
1        1988.2
2        1998.6
3        2160.3
Avg      2049.03


Table D-2 Use Case 2—Fail Slave Stack with stack-mac persistent Enabled

Run #    Tx1 -> Tx2 (ms)
1        1201.4
2        1091.4
3        1183.6
Avg      1158.80


Table D-3 Use Case 3—Fail Master Stack with HSRP Subsecond Timers Configured

Run #    Tx1 -> Tx2 (ms)
1        14994.5
2        13726.1
3        14106.9
Avg      14275.83


Table D-4 Use Case 4—Fail Slave Stack with HSRP Subsecond Timers Configured

Run #    Tx1 -> Tx2 (ms)
1        2136.9
2        2126.9
3        2086.1
Avg      2116.63
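
The averages in Tables D-1 through D-4 can be reproduced, and the two approaches compared, with a short Python sketch such as the following. The dictionary keys and variable names are illustrative labels, not terms from the test plan; the run values are copied from the tables.

```python
from statistics import mean

# Convergence times in ms, copied from Tables D-1 through D-4.
suite1_runs_ms = {
    "master failure, stack-mac persistent": [1988.2, 1998.6, 2160.3],
    "slave failure, stack-mac persistent": [1201.4, 1091.4, 1183.6],
    "master failure, HSRP subsecond timers": [14994.5, 13726.1, 14106.9],
    "slave failure, HSRP subsecond timers": [2136.9, 2126.9, 2086.1],
}

# Average the three runs of each use case.
suite1_avg_ms = {case: mean(runs) for case, runs in suite1_runs_ms.items()}

for case, avg in suite1_avg_ms.items():
    print(f"{case}: {avg:.1f} ms")

# Compare the two approaches for the master-failure case.
ratio = (
    suite1_avg_ms["master failure, HSRP subsecond timers"]
    / suite1_avg_ms["master failure, stack-mac persistent"]
)
print(f"HSRP vs. stack-mac persistent (master failure): {ratio:.1f}x")
```

With these figures, recovery from a master failure takes on average roughly seven times as long with HSRP subsecond timers as with stack-mac persistent, which is consistent with the note above.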


Test Suite 2—HA in the Manufacturing Zone (Tx1 -> Tx2)

Use Case 1—Fail physical link in EtherChannel to 4500-1

Use Case 2—Fail physical link in EtherChannel to 4500-2

Use Case 3—Supervisor failover on 4500-1

Use Case 4—Supervisor failover on 4500-2

Suite 2 Test Results

The Suite 2 test results (all in milliseconds) are provided in the following tables.

Table D-5 Use Case 1—Fail Physical Link in EtherChannel to 4500-1

Run #    Tx1 -> Tx2 (ms)
1        1.4
2        1.4
3        1.6
Avg      1.47


Table D-6 Use Case 2—Fail Physical Link in EtherChannel to 4500-2

Run #    Tx1 -> Tx2 (ms)
1        1.8
2        1.4
3        1.4
Avg      1.53


Table D-7 Use Case 3—Supervisor Failover on 4500-1

Run #    Tx1 -> Tx2 (ms)
1        16.3
2        16.1
3        16.0
Avg      16.13


Table D-8 Use Case 4—Supervisor Failover on 4500-2

Run #    Tx1 -> Tx2 (ms)
1        15.9
2        18.2
3        16.1
Avg      16.73
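
To put the Suite 2 figures in perspective, the convergence formula can be inverted to recover the number of packets lost at the 10,000 pps test rate. The following Python sketch does this for Run 1 of each table; the function name and dictionary labels are hypothetical.

```python
PPS = 10_000  # packet rate used in these tests


def packets_lost(convergence_ms: float, pps: int = PPS) -> int:
    """Invert the convergence formula: (Tx - Rx) = time_ms * pps / 1000."""
    return round(convergence_ms * pps / 1000)


# Run 1 values in ms, copied from Tables D-5 through D-8.
suite2_run1_ms = {
    "EtherChannel link failure to 4500-1": 1.4,
    "EtherChannel link failure to 4500-2": 1.8,
    "Supervisor failover on 4500-1": 16.3,
    "Supervisor failover on 4500-2": 15.9,
}

suite2_run1_lost = {case: packets_lost(ms) for case, ms in suite2_run1_ms.items()}

for case, lost in suite2_run1_lost.items():
    print(f"{case}: {lost} packets")
```

In Run 1, the EtherChannel link failures correspond to 14 and 18 lost packets, and the supervisor failovers to 163 and 159 lost packets.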


Test Suite 3—HA in the DMZ (Tx1 -> Tx3)

Test Suite 3 covers the following use cases:

1. Active ASA link failure on control network-facing interface

2. Active ASA link failure on DMZ-facing interface

3. Standby ASA link failure on control network-facing interface

4. Standby ASA link failure on DMZ-facing interface

5. Reload of active ASA

6. Reload of standby ASA

7. Run "failover active" on the active ASA

8. Run "failover active" on the standby ASA

9. 4500-1 switchover to standby supervisor (HSRP active)

10. 4500-1 chassis failure (HSRP active)

11. 4500-2 switchover to standby supervisor

12. 4500-2 chassis failure

Suite 3 Test Results

Table D-9 Suite 3 Test Results

Use Case    Tx1 -> Tx3 (ms)
1           5610.9
2           6646.3
3           0
4           0
5           7189.9
6           0
7           0
8           134.2
9           270.8
10          25759.6
11          0
12          0
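
Because Table D-9 lists use cases by number only, it can help to join the results with the use case descriptions and rank them. The following Python sketch is one way to do that; the variable names are illustrative, and the descriptions and values are copied from the Suite 3 list and Table D-9.

```python
# (use case number, description, convergence time in ms)
suite3 = [
    (1, "Active ASA link failure on control network-facing interface", 5610.9),
    (2, "Active ASA link failure on DMZ-facing interface", 6646.3),
    (3, "Standby ASA link failure on control network-facing interface", 0.0),
    (4, "Standby ASA link failure on DMZ-facing interface", 0.0),
    (5, "Reload of active ASA", 7189.9),
    (6, "Reload of standby ASA", 0.0),
    (7, 'Run "failover active" on the active ASA', 0.0),
    (8, 'Run "failover active" on the standby ASA', 134.2),
    (9, "4500-1 switchover to standby supervisor (HSRP active)", 270.8),
    (10, "4500-1 chassis failure (HSRP active)", 25759.6),
    (11, "4500-2 switchover to standby supervisor", 0.0),
    (12, "4500-2 chassis failure", 0.0),
]

# Use cases with measurable packet loss, and the single worst case.
disruptive = [row for row in suite3 if row[2] > 0]
worst = max(suite3, key=lambda row: row[2])

print(f"{len(disruptive)} of {len(suite3)} use cases caused measurable loss")
for number, description, ms in sorted(disruptive, key=lambda row: row[2], reverse=True):
    print(f"{number:>2}  {ms:>8.1f} ms  {description}")
```

Six of the twelve use cases show measurable loss. The longest outage is the 4500-1 chassis failure (use case 10), and all failures on the standby ASA and on 4500-2 show no loss.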


Test Tools

The following equipment is needed for performing these tests:

16 Cisco Catalyst C2955T-12 industrial switches

2 Cisco Catalyst WS-C3750G-24PS (stacked)

2 fully redundant Cisco Catalyst 4507R switches with Supervisor IV

1 Ixia traffic generator

Various Rockwell Automation (RA) equipment