New and changed information

Table 1 provides an overview of the significant changes to the organization and features in this guide up to the current release. The table is not an exhaustive list of all changes made to the guide or of all new features up to this release.

Table 1. New and changed information

Cisco NX-OS for ACI-Mode Switches Release

Feature

Description

14.2(6), 15.1(1)

SQL database not persistent during ungraceful reloads

The SQL database is no longer persistent during ungraceful reloads of the switches.

See Guidelines and limitations for SSD monitoring.

13.2(5)

SSD monitoring

Support was added for the SSD monitoring feature.

SSD monitoring

The solid-state drive (SSD) monitoring feature enables you to override the preconfigured thresholds for the SSD lifetime parameters. When the SSD reaches some percentage of the configured thresholds, the Cisco Application Policy Infrastructure Controller (APIC) raises fault F3525. This fault lets network operators monitor SSD wear and proactively replace a switch before it fails because the SSD has exceeded its lifetime parameter values.

Fault F3525

Fault F3525 is raised if the program-erase (P/E) cycle count increases by 21 or more (the default delta P/E threshold) in 7 days. This fault does not mean that the SSD is worn out. It indicates a high rate of writes (churn) that may eventually wear out the SSD. You must work with Cisco TAC to understand what is causing this churn and address it.

These are the details of the fault:

# fault.Inst
code : F3525
ack : no
annotation :
cause : equipment-flash-warning
changeSet : deltape (New: 21), peCycles (New: 1678), tbw (New: 32.465179), warning (New: yes)
childAction :
created : 2019-08-05T18:22:01.455-07:00
delegated : no
descr : High SSD usage observed. Please check switch activity and contact Cisco Technical Support about high SSD usage.
dn : topology/pod-1/node-206/sys/ch/supslot-1/sup/flash/fault-F3525
domain : infra
extMngdBy : undefined
highestSeverity : warning
lastTransition : 2019-08-05T18:24:02.029-07:00
lc : raised
modTs : never
occur : 1
origSeverity : warning
prevSeverity : warning
rn : fault-F3525
rule : eqpt-flash-flash-warning-alarm
severity : warning
status :
subject : flash-warning-alarm
type : operational
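If you collect this fault output as text, a small script can extract the fields of interest. The following is a minimal sketch in Python, not a Cisco-provided tool. It assumes the "key : value" layout shown above, and the function names parse_fault_inst and parse_change_set are hypothetical.

```python
import re


def parse_fault_inst(text):
    """Parse the 'key : value' lines of a fault.Inst dump into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '# fault.Inst' header line.
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def parse_change_set(change_set):
    """Turn 'deltape (New: 21), peCycles (New: 1678)' into a dict of strings."""
    return dict(re.findall(r"(\w+) \(New: ([^)]*)\)", change_set))


# Abbreviated copy of the fault details shown above.
sample = """# fault.Inst
code : F3525
cause : equipment-flash-warning
changeSet : deltape (New: 21), peCycles (New: 1678), tbw (New: 32.465179), warning (New: yes)
created : 2019-08-05T18:22:01.455-07:00
severity : warning
status :
"""

fault = parse_fault_inst(sample)
changes = parse_change_set(fault["changeSet"])
print(fault["code"], fault["severity"], changes["deltape"], changes["peCycles"])
```

In this output, the deltape value in changeSet is the P/E increase that triggered the fault, and peCycles is the current P/E cycle count.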

Guidelines and limitations for SSD monitoring

These guidelines and limitations apply to the SSD monitoring feature:

  • You cannot use the CLI to configure this feature.

  • You can use only Micron M600 64G SSDs (Micron_M600_MTFDDAT064MBF).

  • Beginning in the Cisco ACI-mode switch releases 14.2(6) and 15.1(1), the SQL database is no longer persistent during ungraceful reloads of the switches. Examples of ungraceful reload include kernel panics and forced power cycles. In the event of an ungraceful reload, the switch will reboot as stateless and must re-download its policies from the Cisco Application Policy Infrastructure Controller (APIC). Graceful reloads, such as manual reloads and hap-resets, are still stateful and the switch will maintain its database across the reload.

SSD monitoring parameters

This table provides the parameters that you can configure to determine the behavior of the SSD monitoring feature.

Table 2. SSD monitoring parameters

Parameter

Description

P/E

Overrides the SSD lifetime's default program erase cycles threshold. The possible values are between 3000 and 10000 cycles, inclusive. The default value is 5000.

The Cisco Application Policy Infrastructure Controller (APIC) raises a minor fault when the drive reaches 80% of the specified value and raises a major fault when the drive reaches 90% of the specified value.

GBB

Overrides the SSD lifetime's default grown bad block threshold. The possible values are between 4 and 15 blocks, inclusive. The default value is 5.

The Cisco APIC raises a minor fault when the drive reaches 80% of the specified value and raises a major fault when the drive reaches 90% of the specified value.

RRE

Overrides the SSD lifetime's default raw read errors threshold. The possible values are between 500 and 2000 errors, inclusive. The default value is 1000.

The Cisco APIC raises a minor fault when the drive reaches 80% of the specified value and raises a major fault when the drive reaches 90% of the specified value.

Delta P/E

Overrides the SSD lifetime's default delta of the program erase cycles threshold. The delta of the program erase cycles is equal to:

current P/E – P/E from 7 days ago

Possible values are between 21 and 40 cycles, inclusive. The default value is 21.

If the P/E cycle count increases by the specified value or more in the last 7 days, the Cisco APIC raises a warning to indicate excessive SSD writes. The window resets after 24 hours and delta P/E (as tracked by the Cisco APIC, not the parameter value) is set to 0. The warning is cleared after 24 hours.
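The threshold rules in Table 2 can be illustrated with a short Python sketch. This is an illustration of the documented rules only, not the Cisco APIC implementation. The function names are hypothetical, and the sketch assumes that "reaches" means greater than or equal to.

```python
# Allowed range and default for each parameter, taken from Table 2:
# (minimum, maximum, default)
LIMITS = {
    "pe": (3000, 10000, 5000),
    "gbb": (4, 15, 5),
    "rre": (500, 2000, 1000),
    "delta_pe": (21, 40, 21),
}


def check_threshold(name, value):
    """Return value if it lies inside the range that Table 2 allows."""
    low, high, _default = LIMITS[name]
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, inclusive")
    return value


def lifetime_severity(current, threshold):
    """Classify a P/E, GBB, or RRE reading against its configured threshold.

    Assumption: 'reaches 80%' and 'reaches 90%' are treated as >=.
    Integer arithmetic avoids floating-point rounding at the boundaries.
    """
    if current * 100 >= threshold * 90:
        return "major"
    if current * 100 >= threshold * 80:
        return "minor"
    return None


def delta_pe_warning(pe_now, pe_7_days_ago, delta_threshold=21):
    """True when the 7-day P/E increase meets or exceeds the delta threshold."""
    return (pe_now - pe_7_days_ago) >= delta_threshold


print(lifetime_severity(4000, 5000))   # 80% of the default P/E threshold
print(lifetime_severity(4500, 5000))   # 90% of the default P/E threshold
print(delta_pe_warning(1678, 1657))    # increase of 21 cycles in 7 days
```

With the default P/E threshold of 5000, a drive at 4000 cycles falls in the minor range and a drive at 4500 cycles falls in the major range. An increase of 21 cycles over 7 days meets the default delta P/E threshold.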

Configure SSD monitoring using the GUI

Follow these steps to configure SSD monitoring using the GUI.

Procedure


Step 1

Log in to the Cisco Application Policy Infrastructure Controller (APIC), if you are not already logged in.

Step 2

Configure the SSD monitoring parameters.

  1. On the menu bar, choose Fabric > Access Policies.

  2. In the Navigation pane, choose Policies > Switch > Equipment Flash Config Policies.

  3. Right-click Equipment Flash Config Policies and choose Create Equipment Flash Config Policy.

  4. In the Create Equipment Flash Config Policy form, fill in the fields as appropriate to your setup and click Submit.

    These parameters determine the behavior of SSD monitoring. For more information, see SSD monitoring parameters.

Step 3

Create an access switch policy group and associate the equipment flash configuration policy that you created to this policy group.

  1. In the Navigation pane, choose Switches > Leaf Switches > Policy Groups.

  2. Right-click Policy Groups and choose Create Access Switch Policy Group.

  3. In the Create Access Switch Policy Group form, in the Flash Config Policy drop-down list, choose the equipment flash configuration policy that you created. Fill in the other fields as appropriate to your setup and click Submit.

Step 4

Create a leaf switch profile that will use the policy group that you created.

  1. In the Navigation pane, choose Switches > Leaf Switches > Profiles.

  2. Right-click Profiles and choose Create Leaf Profile.

  3. In the Create Leaf Profile form, enter a name for the profile.

  4. In the Leaf Selectors table, click + and fill in these fields.

    Field

    Description

    Name

    Name of the leaf selector.

    Blocks

    The switches that will be used for the leaf selector. You can select multiple switches.

    Policy Group

    The access switch policy group that will be used for the leaf selector. Choose the policy group that you created.

  5. Click Next.

  6. Fill in the fields as appropriate to your setup and click Submit.


Configure SSD monitoring using the REST APIs

Follow these steps to configure SSD monitoring using the REST APIs. This example procedure performs the following actions:

  • Create a node policy named "nodepol1"

  • Create an access port policy named "accportpol1"

  • Create an access port policy group named "testPortG1002"

  • Create an access node policy group named "accnodepolgrp1" that is associated with the "testFlashConfigPol" equipment flash configuration policy

  • Create an attached entity profile named "aep1"

  • Create an equipment flash configuration policy named "testFlashConfigPol" that sets the program erase (P/E) cycles, grown bad block (GBB), raw read errors (RRE), and delta P/E values

Procedure


Use the REST API to configure SSD monitoring:

Example:

<polUni>
    <infraInfra>
        <infraNodeP name="nodepol1">
            <infraLeafS name="test" type="range">
                <infraNodeBlk name="test" from_="101" to_="101"/>
                <infraRsAccNodePGrp tDn="uni/infra/funcprof/accnodepgrp-accnodepolgrp1"/>
            </infraLeafS>
            <infraRsAccPortP tDn="uni/infra/accportprof-accportpol1"/>
        </infraNodeP>

        <infraAccPortP name="accportpol1">
            <infraHPortS name="ports1Through12" type="range" >
                <infraPortBlk name="blk1" fromCard="1" toCard="1" fromPort="2" toPort="2" />
                <infraRsAccBaseGrp tDn="uni/infra/funcprof/accportgrp-testPortG1002"/>
            </infraHPortS>
        </infraAccPortP>

        <infraFuncP>
            <infraAccPortGrp name="testPortG1002">
                <infraRsAttEntP tDn="uni/infra/attentp-aep1" />
            </infraAccPortGrp>
            <infraAccNodePGrp name="accnodepolgrp1">
                <infraRsEquipmentFlashConfigPol
                  tnEquipmentFlashConfigPolName="testFlashConfigPol"/>
            </infraAccNodePGrp>
        </infraFuncP>

        <infraAttEntityP name="aep1">
            <infraRsDomP tDn="uni/phys-mininet" />
        </infraAttEntityP>

        <equipmentFlashConfigPol name="testFlashConfigPol" peCycles="6000" gbb="6"
          readErr="600" deltaPe="22"/>

    </infraInfra>
</polUni>
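The equipment flash configuration policy portion of this payload can also be generated from a script. The following Python sketch builds only the equipmentFlashConfigPol element, using the attribute names from the example above, and prepares an HTTP request without sending it. The host name apic.example.com and the token value are placeholders. The /api/mo/uni.xml path and the APIC-cookie header are assumptions that are not stated in this guide, so verify them against the Cisco APIC REST API documentation for your release.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_flash_policy_payload(name, pe_cycles, gbb, read_err, delta_pe):
    """Build a polUni payload that contains one equipmentFlashConfigPol."""
    pol_uni = ET.Element("polUni")
    infra = ET.SubElement(pol_uni, "infraInfra")
    ET.SubElement(
        infra,
        "equipmentFlashConfigPol",
        {
            "name": name,
            "peCycles": str(pe_cycles),
            "gbb": str(gbb),
            "readErr": str(read_err),
            "deltaPe": str(delta_pe),
        },
    )
    return ET.tostring(pol_uni, encoding="unicode")


def build_post_request(apic_host, payload, token):
    """Prepare, but do not send, a POST request that carries the payload.

    Assumption: the endpoint path and cookie name below are not taken
    from this guide.
    """
    url = f"https://{apic_host}/api/mo/uni.xml"
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Cookie": f"APIC-cookie={token}",
            "Content-Type": "application/xml",
        },
    )


# Same values as the example payload above.
payload = build_flash_policy_payload("testFlashConfigPol", 6000, 6, 600, 22)
request = build_post_request("apic.example.com", payload, "PLACEHOLDER_TOKEN")
print(request.get_method(), request.full_url)
print(payload)
```

The sketch stops before sending the request so that you can inspect the payload first. To apply the policy, you would send the request after authenticating to the Cisco APIC.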