Troubleshooting the DCNM OVA Installation

Troubleshooting the OVA Installation

This section describes troubleshooting procedures for various scenarios that involve the Cisco Prime DCNM OVA installation associated with a Cisco Dynamic Fabric Automation (DFA) deployment.

Symptom Cause Resolution

The following error is displayed in the vSphere client while you are deploying the Cisco DCNM OVA:

The OVA package requires support for OVF Properties. Details: Unsupported element "Property".

The vSphere client is connected directly to the ESXi host.

Connect the vSphere client to a vCenter and not directly to the ESXi host.

The VMware vCloud Director (vCD) or Cisco Prime Network Registrar (CPNR) scripts run with connection exceptions.

The vCD, Virtual Supervisor Module (VSM), or CPNR is not reachable.

Make sure that the external entities (vCD, VSM, or CPNR) are reachable and that the vCD web interface is up and running.

Unable to log in with the default credentials after the Cisco DCNM OVA is deployed.

The administrative password entered in the Management Properties section during OVF deployment contains the '@' character.

Do not use the '@' character in the password. If it was used, redeploy the OVA with a password that meets the recommended password criteria to regain access to the Web UI.

Troubleshooting Accessibility and Connectivity Issues

This section describes troubleshooting procedures for various scenarios when you are unable to access Cisco Prime DCNM after a successful OVA deployment.

Symptom Cause Resolution

Cisco Prime DCNM and other applications in the OVA have connectivity issues; there is a noticeable drop in ping packets directed to the virtual appliances.

The datastore used by Cisco Prime DCNM virtual appliance may be almost full or full. The VM performance is compromised and unpredictable.

  1. Free up datastore space and then reboot the Cisco Prime DCNM OVA.

  2. Make sure that the IP addresses for the Cisco Prime DCNM management access and Enhanced Fabric Management networks are in different subnets.

    1. If they are not in different subnets, redeploy the OVA.

    2. Enter different subnets for the Cisco Prime DCNM management access and Enhanced Fabric Management networks.

The OVA is successfully deployed, but either the Cisco Prime DCNM management access or Enhanced Fabric Management interface is down.

The IP addresses for Cisco Prime DCNM and Enhanced Fabric Management networks are in the same subnet.

  1. Make sure that the IP addresses for the Cisco Prime DCNM management access and Enhanced Fabric Management networks are in different subnets.

    1. If they are not in different subnets, redeploy the OVA.

    2. Enter different subnets for the Cisco Prime DCNM management access and Enhanced Fabric Management networks.
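
The subnet check can be done before you deploy. The following is a minimal sketch that uses the Python standard ipaddress module to confirm that the two interface addresses fall in non-overlapping subnets. The function name and the sample addresses are hypothetical and are not part of Cisco Prime DCNM.

```python
import ipaddress

def in_different_subnets(mgmt_cidr: str, efm_cidr: str) -> bool:
    """Return True if the two interface addresses are in non-overlapping subnets.

    Each argument is an interface address in CIDR form, such as "10.0.0.5/24".
    """
    mgmt_net = ipaddress.ip_interface(mgmt_cidr).network
    efm_net = ipaddress.ip_interface(efm_cidr).network
    return not mgmt_net.overlaps(efm_net)

# Hypothetical addresses for the management access and
# Enhanced Fabric Management interfaces.
print(in_different_subnets("192.168.57.139/24", "192.168.58.139/24"))
print(in_different_subnets("192.168.57.139/24", "192.168.57.200/24"))
```

The check uses overlap rather than simple inequality, so a /24 that sits inside a larger /16 is also reported as being in the same subnet.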

Cisco Prime DCNM is not accessible from the Web UI. The following message is displayed:

System Message: DCNM(pid 22094) process running, but may not be accessible from Web UI.

Cisco Prime DCNM is starting due to a scheduled restart or due to an appliance reboot.

  1. After waiting a few minutes, use the appmgr status dcnm command to determine if Cisco DCNM is still not accessible.
  2. Use the appmgr start dcnm command to start Cisco Prime DCNM.

  3. Review the server logs at the following location: /usr/local/cisco/dcnm/jboss-4.2.2GA/server/fm/log/server.log

Users are unable to log in to the Cisco Prime DCNM Web UI.

The password does not meet security requirements.

Make sure that the administrative password created during the OVA deployment meets the password security criteria. For version 7.0.1, redeploy the OVA with a new password.

The password must meet the following requirements:
  • Eight characters in length

  • A combination of alpha and numeric characters

  • Limited to include only the following special characters

    • . (dot)

    • + (plus)

    • _ (underscore)

    • - (hyphen)
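
The requirements above can be checked before deployment. The following is a minimal sketch in Python, not the validation that Cisco Prime DCNM itself performs. It assumes that "eight characters in length" means a minimum of eight characters; the function name is hypothetical.

```python
import re

# Letters, digits, and the four permitted special characters only.
ALLOWED_CHARS = re.compile(r"[A-Za-z0-9.+_-]+")

def meets_password_criteria(password: str) -> bool:
    """Check a candidate password against the listed criteria."""
    # Assumption: the length requirement is read as "at least eight".
    if len(password) < 8:
        return False
    # Reject any character outside the permitted set, such as '@'.
    if not ALLOWED_CHARS.fullmatch(password):
        return False
    has_alpha = any(c.isalpha() for c in password)
    has_digit = any(c.isdigit() for c in password)
    return has_alpha and has_digit

print(meets_password_criteria("Fabric.123"))
print(meets_password_criteria("Fabric@123"))
```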

The Cisco DFA view in the Cisco Prime DCNM Web UI is taking too long to load.

A large number of devices are managed with a local/packaged Postgres database; significant delays can occur in load times.

  1. Verify the number of devices managed by Cisco DFA.
  2. If the number of devices managed by Cisco DFA exceeds 50, make sure that you are using an external Oracle database.

See http://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/7_x/dcnm/installation/master_files/OVA_Installation_Guide.html for instructions on configuring the Oracle database for Cisco Prime DCNM.

The IP address field does not display a fourth octet.

Your monitor resolution is not properly set.

Change your monitor display settings to a higher resolution.

DHCPD crashes if a /8 subnet is entered.

A /8 subnet is in use. We recommend that you do not use a /8 subnet; this is a problem with the DHCPD that is packaged as part of the DCNM bundle.

If DHCPD crashes, you must manually restart it by using an SSH terminal session in DCNM.

The creation of a network fails.

DHCPD may already be employed for that network. DHCPD does not allow overlapping subnets, even if they are created across different virtual routing and forwarding (VRF) instances.

Ensure that the new network does not have a subnet that conflicts with an existing network.
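
Because DHCPD rejects overlapping subnets regardless of VRF, it can help to compare a planned subnet against every existing one first. The following is a minimal sketch that uses the Python standard ipaddress module; the function name and the sample subnets are hypothetical, and the list of existing subnets must be gathered separately.

```python
import ipaddress
from typing import List

def find_conflicts(new_subnet: str, existing_subnets: List[str]) -> List[str]:
    """Return the existing subnets that overlap the planned subnet."""
    new_net = ipaddress.ip_network(new_subnet, strict=False)
    return [
        subnet
        for subnet in existing_subnets
        if new_net.overlaps(ipaddress.ip_network(subnet, strict=False))
    ]

# Hypothetical subnets already defined, possibly in different VRFs.
existing = ["10.1.2.0/24", "10.2.0.0/16"]
print(find_conflicts("10.1.0.0/16", existing))
print(find_conflicts("172.16.5.0/24", existing))
```

An empty list means that the planned subnet does not overlap any subnet in the list.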

Configuration-profile instantiation or application errors occur.

Conflicting profiles exist.

Use the following commands to debug the log file:

  • Use the show fabric database host command to check which profiles have been applied successfully.

  • Use the show fabric database host statistics command if data is missing, and check whether any statistics are reported.

sh system internal config-profile history
debug logfile fabric-autoconfig
debug fabric forwarding auto-config all
copy log:fabric-autoconfig bootflash:
debug logfile port-profile
debug port-profile all
copy log:port-profile bootflash: 

Troubleshooting Database Issues

This section describes troubleshooting procedures for various scenarios when you encounter problems with Cisco Prime DCNM and database connections in Cisco Dynamic Fabric Automation (DFA).

Symptom Cause Resolution

The Cisco Prime DCNM logs report TNSListener exceptions from the Oracle database.

Sessions, processes, and open cursors are set incorrectly.

  1. Review the Oracle database configuration.

  2. Make sure that the sessions, processes, and open cursors are set appropriately.

The Cisco DFA view in Cisco Prime DCNM is taking too long to load.

A large number of devices are managed with a local/packaged Postgres database.

  1. Verify the number of devices managed by Cisco DFA.
  2. If the number of devices managed by Cisco DFA exceeds 50, make sure that you are using an external Oracle database.

See http://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/7_x/dcnm/installation/master_files/OVA_Installation_Guide.html for instructions on configuring the Oracle database for Cisco DCNM.

Devices are not able to query the network database.

The cause can be one of the following:
  • The Lightweight Directory Access Protocol (LDAP) is not up and running.

  • There is a problem with the connection to the profile database.

  1. Log in to the DCNM web UI.

  2. On the menu bar, choose Admin > DFA Settings.

  3. Determine if the LDAP used as the network/profile database is local or external to the Cisco Prime DCNM virtual appliance.

    • If LDAP is local, use the appmgr status ldap command to verify that LDAP is up and running.

    • If LDAP is external, verify that LDAP is up and running.

  4. Log in to the device and turn on debugging using the debug adbm trace command and the debug fabric forwarding auto-config evt command.

  5. Test the connection to the network database using the test fabric database network segment 12300 command.

  6. Test the connection to the profile database using the test fabric database profile MyTestProfile command.

  7. Use the show fabric database statistics command on the device to review the primary databases that the switch is connected to and the success and failure counts.

  8. Use the show logging logfile command to display messages in the log file.

Troubleshooting XMPP Issues

This section describes troubleshooting procedures for various scenarios that involve the Extensible Messaging and Presence Protocol (XMPP) application within the Cisco Prime DCNM OVA deployment for Cisco Dynamic Fabric Automation (DFA).

Symptom Cause Resolution

The XMPP application is down.

Either XMPP has not yet come up after the OVA deployment (startup takes a few minutes) or a fully qualified domain name was not used during the OVA deployment.

  1. Check the logs at /root/postInstallApps.log.

  2. Make sure that a fully qualified domain name was entered for the hostname attribute during the OVA deployment.
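
The hostname value can be sanity-checked before deployment. The following is a minimal sketch in Python that treats a name as fully qualified when it has at least two dot-separated labels, each made of letters, digits, and hyphens. This is a syntactic heuristic only; it does not confirm that the name resolves in DNS, and the function name and sample hostnames are hypothetical.

```python
import re

# One DNS label: 1 to 63 letters, digits, or hyphens, with no leading or trailing hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def looks_fully_qualified(hostname: str) -> bool:
    """Heuristic check that a hostname includes a domain part."""
    # Allow a single trailing dot, as in "host.example.com."
    name = hostname[:-1] if hostname.endswith(".") else hostname
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False
    return all(LABEL.fullmatch(label) for label in labels)

print(looks_fully_qualified("dcnm139.example.com"))
print(looks_fully_qualified("dcnm139"))
```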

Devices are having problems connecting to XMPP.

The device user and groups have not been created.

  1. Review the log files for XMPP and Jabber.

    • Log file location for XMPP: /usr/local/cisco/dcm/fm/logs/fms_xmpp.log

    • Log file location for Jabber: /opt/jabber/xcp/var/log/jabberd.log

  2. If devices cannot connect, make sure that device users and groups have been created.

Devices are not able to join XMPP groups.

XMPP may not be up and running.

  1. Log in to the DCNM web UI.

  2. Determine if XMPP is local or external to the Cisco Prime DCNM virtual appliance.

    • If XMPP is local, use the appmgr status xmpp command to verify that XMPP is up and running.

    • If XMPP is external, verify that XMPP is up and running:

      1. Use the show fabric access group command to display the groups that a device is subscribed to, or to display a list of members existing in a particular group.

      2. Use the show fabric access group device command to list the groups that the device is currently logged in to.

      3. Use the show fabric access connection command to display the connection status of a device or a user in the fabric access network.

A device is not re-pulling from LDAP when the autoconfiguration network is updated.

The XMPP server is down.

Use the appmgr status xmpp command to make sure the XMPP server is up and running.

The device is pointed to an incorrect XMPP server.

  1. Log in to the DCNM web UI.

  2. Click the View Details link to view more information.

  3. Make sure that the device points to the correct XMPP server.

  4. Choose Admin > Settings.

  5. Make sure that Cisco Prime DCNM uses the XMPP user specified on the DFA Settings page to join the group.

The device is not accepting SVI updates.

Use the clear fabric database host vni segmentID re-apply command to verify that the device accepts the SVI update. This update is sent by Cisco Prime DCNM for device notification.

Cisco Prime DCNM displays fewer devices that have matched virtual machines (VMs) or virtual routing and forwarding (VRF) instances that are associated with them.

The VM or VRF is not associated with the device.

  1. Use the show evb host command to verify that the VM is associated with the device.

  2. Use the show vrf command to verify that the VRF is associated with the device.

The device is pointing to the incorrect XMPP server.

  1. Log in to the DCNM web UI.

  2. Click the View Details link to view more information.

  3. Make sure that the device points to the correct XMPP server.

  4. In the Cisco DCNM web UI, choose Admin > Settings.

  5. Make sure that Cisco Prime DCNM uses the XMPP users specified on the DFA Settings page.

  6. Make sure that Cisco Prime DCNM and the device joined the same XMPP group.

The device could be timed out.

  1. Log in to the DCNM web UI.

  2. On the menu bar, choose Admin > Settings.

  3. Verify if the XMPP response timeout specified on the DFA Settings page allows enough time.

The XMPP server is down.

Use the appmgr status xmpp command to ensure that the XMPP server is up and running.

Troubleshooting DHCP Issues

This section describes troubleshooting procedures for various scenarios that involve DHCP in the Cisco Prime DCNM OVA installation for Cisco Dynamic Fabric Automation (DFA).

Symptom Cause Resolution

DHCP does not come up after you use the appmgr setup ha command.

No free IP address ranges were entered for the default scope.

  1. Log in to the DCNM web UI.

  2. On the menu bar, choose Configuration > POAP > DHCP Scope.

  3. Enter the free IP address ranges for the default scope: enhanced_fabric_mgmt_scope.

Devices do not see a DHCP server or receive a DHCP response.

DHCP is not running; no lease is available.

  1. Use the appmgr status dhcp command to determine if DHCP is running.

  2. Check /var/log/messages for any error message from DHCP or to determine if no lease is available.

The IP address is not active.

Review the following file to determine whether the IP address is active (allocated), aborted (no more addresses are available), or whether the device is free or has released the allocated address after Power-on Auto Provisioning (POAP):

/var/lib/dhcpd/dhcpd.leases
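
Reading the lease file by eye is tedious when many devices are provisioned. The following is a minimal sketch in Python that summarizes the most recent binding state for each address. It assumes the standard ISC dhcpd lease-file layout (lease blocks that contain a binding state statement); the state names written in the file, such as active or free, may not match the terms used in this guide word for word. The function name and the sample data are hypothetical.

```python
import re
from typing import Dict

LEASE_START = re.compile(r"^lease\s+(\S+)\s*\{")
BINDING_STATE = re.compile(r"^binding state\s+(\w+);")

def lease_states(leases_text: str) -> Dict[str, str]:
    """Map each leased IP address to its most recent binding state."""
    states: Dict[str, str] = {}
    current_ip = None
    for raw_line in leases_text.splitlines():
        line = raw_line.strip()
        start = LEASE_START.match(line)
        if start:
            current_ip = start.group(1)
        elif line == "}":
            current_ip = None
        elif current_ip is not None:
            state = BINDING_STATE.match(line)
            if state:
                # Later entries for the same address overwrite earlier ones.
                states[current_ip] = state.group(1)
    return states

# Hypothetical sample in the ISC dhcpd lease-file layout.
SAMPLE = """
lease 10.0.0.5 {
  binding state active;
  next binding state free;
}
lease 10.0.0.6 {
  binding state free;
}
"""

for ip, state in lease_states(SAMPLE).items():
    print(ip, state)
```

To run the sketch against the appliance, read the contents of /var/lib/dhcpd/dhcpd.leases and pass the text to lease_states in place of SAMPLE.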

The DHCP packet is sent to the incorrect interface.

Use tcpdump on ports 67 and 68 to make sure that the DHCP packet is sent to the correct interface (eth1): tcpdump -i eth1 -vv "port 67 or port 68"

The DHCP client is not getting an IP address when using Cisco Prime DCNM as the DHCP server.

The backbone VLAN subnet is missing.

  1. Log in to the DCNM Web UI.

  2. On the menu bar, choose Admin > DFA > Settings > DHCP.

  3. Click the Edit scope icon to edit the scope and add a VLAN subnet.

Troubleshooting AMQP Issues

This section describes troubleshooting procedures for various scenarios that involve the AMQP application within the Cisco Prime DCNM OVA deployment for Cisco Dynamic Fabric Automation.

Symptom Cause Resolution

Cisco Prime DCNM is not sending or is missing AMQP notifications.

The AMQP server is not up or the exchange does not exist on the AMQP server.

  1. Use the appmgr status amqp command to make sure that the AMQP server is up and running.

  2. Use the rabbitmqctl list_exchanges command to make sure that the exchange specified on the DFA Settings page in DCNM exists on the AMQP server.

  3. Ensure that the Use local DHCPd for DFA option is selected.

  4. Enter the subnet address that corresponds to the iBGP or backbone VLAN employed in the DFA cluster managed by this DCNM.

  5. Make sure that a fully qualified domain name was entered for the hostname attribute during the OVA deployment.

Troubleshooting LDAP Issues

This section describes troubleshooting procedures for various scenarios that involve Lightweight Directory Access Protocol (LDAP) within the Cisco Prime DCNM OVA deployment for Cisco Dynamic Fabric Automation.

Symptom Cause Resolution

The following system error message is displayed: 500 Internal Server Error. LDAP server communication failure. Failed to add new scope, because of IP Range values already in use.

The IP address range is already in use or has overlapped with another network.

  1. Log in to the Cisco Prime DCNM web UI.

  2. On the menu bar, choose Configuration > POAP > DHCP Scope.

  3. Update the free IP address ranges for the default scope: enhanced_fabric_mgmt_scope.

A device is not re-pulling from LDAP when the autoconfiguration network is updated.

The device is pointed to an incorrect XMPP server.

  1. Use the appmgr status xmpp command to make sure that the XMPP server is up and running.

  2. Make sure that the device points to the correct XMPP server.

  3. In the Cisco Prime DCNM web UI, choose Admin > Settings.

  4. Make sure that Cisco Prime DCNM uses the XMPP user specified on the DFA Settings page to join the group.

The device is not accepting Switch Virtual Interface (SVI) updates.

Use the clear fabric database host vni segmentID re-apply command to verify that the device accepts the SVI update. This update is sent by Cisco Prime DCNM for device notification.

A device does not autoconfigure SVIs.

The device is pointed to an incorrect LDAP server.

  1. Ensure that the device points to the correct LDAP server for both the network and profile type data.

  2. Enable debugging on the device.

The device is not properly configured in LDAP.

Ensure that the network is properly configured in LDAP. All fields must be filled in.

The device does not have proper licenses installed.

Ensure the device has the proper licenses installed:
  • ENTERPRISE_PKG

  • ENHANCED_LAYER2_PKG

  • LAN_BASE_SERVICES_PKG

  • LAN_ENTERPRISE_SERVICES_PKG

The Org/Partition drop-down list is empty in DCNM.

LDAP is unreachable or the Org/Partition definition is unavailable in the LDAP.

  1. Log in to the DCNM web UI.

  2. Use the DFA Health tab to review the health of LDAP and its contents.

Troubleshooting High Availability Issues

This section describes troubleshooting procedures for various scenarios that involve a high availability (HA) environment after a Cisco Prime DCNM OVA installation for Cisco Dynamic Fabric Automation (DFA).

Symptom Cause Resolution

DHCP does not come up after you use the appmgr setup ha command.

No free IP address ranges were entered for the default scope.

  1. Log in to the Cisco Prime DCNM web UI.

  2. On the menu bar, choose Configuration > POAP > DHCP Scope.

  3. Enter the free IP address ranges for the default scope: enhanced_fabric_mgmt_scope.

It is difficult to access or bring up applications or virtual IP addresses after you set up an HA environment.

  • The HA setup has issues.

  • Listening inconsistencies have occurred.

  1. Review the following log for HA setup issues: /root/cluster/ha.log

  2. Determine any listening inconsistencies.

The following example shows that the Cisco Prime DCNM virtual IP address is listening on the HTTP port on the server on which the ipvsadm command is entered (indicated by 'Local') and that AMQP is listening on the peer server (indicated by 'Route').

[root@dcnm139 ~]# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.57.157:http wlc
  -> 192.168.57.139:http Local 1 0 0
TCP 10.77.247.157:amqp wlc
  -> dcnm155:amqp Route 1 0 0
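
To spot listening inconsistencies across many services, the ipvsadm output can be summarized programmatically. The following is a minimal sketch in Python that pairs each virtual service with its real server and forwarding method, using the output shown above as sample input. The function name is hypothetical, and the parser assumes the column layout shown in this example.

```python
from typing import List, Tuple

def parse_ipvsadm(output: str) -> List[Tuple[str, str, str]]:
    """Return (virtual service, real server, forwarding method) tuples."""
    rows = []
    virtual = None
    for raw_line in output.splitlines():
        parts = raw_line.split()
        if not parts:
            continue
        if parts[0] in ("TCP", "UDP") and len(parts) >= 2:
            # A virtual service line, such as "TCP 192.168.57.157:http wlc".
            virtual = parts[1]
        elif parts[0] == "->" and virtual is not None and len(parts) >= 3:
            # A real server line; the header "->" line precedes any service.
            rows.append((virtual, parts[1], parts[2]))
    return rows

SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.57.157:http wlc
  -> 192.168.57.139:http Local 1 0 0
TCP 10.77.247.157:amqp wlc
  -> dcnm155:amqp Route 1 0 0
"""

for virtual, real, forward in parse_ipvsadm(SAMPLE):
    where = "this server" if forward == "Local" else "the peer server"
    print(f"{virtual} is served by {real} on {where}")
```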