Troubleshooting the Cisco Nexus 9000v Platform
General Troubleshooting/Debugging
The following CLI command provides troubleshooting help for both the Nexus 9300v and Nexus 9500v platforms:
show tech-support nexus9000v
The following is an example output of this command:
switch# show tech-support nexus9000v
------------------ Virtual Chassis Manager Debugs ------------------
##############
# /cmn/pss/virt_cmgr.log
##############
[19-12-10 20:42:34.160609]: virt_cmgr_startup_init called
[19-12-10 20:42:34.161351]: virt_cmgr_validate_file returned success
[19-12-10 20:42:34.161390]: Version 1, VNIC_scheme 2
[19-12-10 20:42:34.161404]: VM sup1: Module no 26, upg_version 1, type 1, card_index 0, image loc None
…
…
…
Common Issues for All Hypervisors
Boot when VM drops into "loader >" prompt
Generally, the initial boot is successful. However, the system boot could fail and drop into the "loader >" prompt on the VGA console or serial console, depending on how you provisioned the VM.
Example:
Loader Version 5.9
Loader > dir
bootflash::
.rpmstore
nxos.9.3.2.20.bin
bootflash_sync_list
.swtam
eem_snapshots
virtual-instance
scripts
platform-sdk.cmd
loader > boot nxos.9.3.2.20.bin
To continue the boot, enter the boot nxos.9.3.2.20.bin command at the "loader >" prompt.
Prevent VM from dropping into "loader >" prompt
After you set up your Cisco Nexus 9000v and the POAP interface, configure the boot image in your system so that the VM doesn't drop to the "loader >" prompt after a reload or shutdown.
Example:
nx-osv9000-2# config t
Enter configuration commands, one per line. End with CNTL/Z.
nx-osv9000-2(config)# boot nxos bootflash:nxos.9.3.2.20.bin
Performing image verification and compatibility check, please wait....
nx-osv9000-2(config)# copy running-config startup-config
Bootup Warning Message
During bootup, you may get a warning message similar to the following:
Checking all filesystems. **Warning** : Free memory available in bootflash is
553288 bytes
need at least 2 GB space for full image installation ,run df -h
This message generally indicates that the Nexus 9000v bootflash doesn't have enough free space to hold another image. To eliminate this warning message, free up bootflash space to allow for the download of another binary image.
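As a rough illustration of the arithmetic behind this warning, the following shell sketch computes how many more bytes would have to be freed to reach the 2 GB threshold. The function name is hypothetical, and treating "2 GB" as 2 x 1024^3 bytes is an assumption; the platform may count differently.

```shell
# Hypothetical helper: print how many more bytes must be freed to reach
# the 2 GB threshold named in the warning (0 if there is already enough).
# Assumption: "2 GB" means 2 * 1024^3 bytes.
bootflash_shortfall_bytes() {
    free_bytes=$1
    required_bytes=$((2 * 1024 * 1024 * 1024))
    if [ "$free_bytes" -ge "$required_bytes" ]; then
        echo 0
    else
        echo $((required_bytes - free_bytes))
    fi
}

# Example with the value from the warning message:
#   bootflash_shortfall_bytes 553288
```

The free-space figure itself comes from the warning message or from df -h, as the message suggests.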
Nexus 9000v Mac-Encoded Mode Network Mapping Check
This check is relevant only if you explicitly entered the platform vnic scheme mac-encoded command on the Nexus 9500v platform. This command enables the vNIC mac-encoded scheme. If no data traffic passes, or vNIC-mapped interfaces show the “Link not connected” state, refer to the Nexus 9000v informational show commands to verify correct vNIC mapping.
ESXi Hypervisor Issues
Nexus 9000v boot not seen after powering on the VM
The likely cause of this issue is that EFI boot isn't set in the VM configuration. To resolve this issue, after you deploy the VM from the distributed OVA virtual artifacts, change "BIOS" to "EFI" in Edit virtual machine settings > VM Options > Boot Options, as described in the ESXi deployment guide.
Bootup logs not seen after VGA output
A common problem during ESXi bootup is that the VGA console displays output similar to the following:
Sysconf checksum failed. Using default values
console (dumb)
Booting nxos.9.3.2.6.bin...
Booting nxos.9.3.2.bin
Trying diskboot
Filesystem type is ext2fs, partition type 0x83
Image valid
Image Signature verification for Nexus9000v is not performed.
Boot Time: 12/5/2019 10:38:41
The VGA console shows no further activity after this point, which is often mistaken for a hang in the switch bootup process. To see the output of a switch bootup, connect to the provisioned serial console based on the steps provided in the ESXi hypervisor deployment guide.
If nothing happens in the serial console, or you see the "telnet: Unable to connect to remote host: Connection refused" error message, it indicates one or more of the following issues:
- The serial console provisioning is incorrect in the VM configuration. Read and follow the instructions for serial console connectivity in the ESXi deployment guide.
- ESXi 8.0 is the only supported version. Make sure that you have a valid license for ESXi vCenter and a valid UCS server license.
- Make sure that the "Security Profile" in the server has "VM serial port connected over network" enabled for both incoming and outgoing connections.
No access to "loader>" prompt after powering down the VM
This issue occurs if you power on the VM and it boots up as expected, but the serial console wasn't correctly provisioned. You then enter and save the “config t; boot nxos bootflash:nxos.9.3.2.20.bin” configuration. When you power up the VM again, it drops to the VGA console.
The following recommendations help to avoid this issue in the ESXi hypervisor.
EFI BIOS defaults all input/output to the VM console. When a VM drops to the "loader >" prompt, go to the vSphere client or VGA console to access the "loader >" prompt to boot the image in the hard disk. You can change this behavior by adding an extra configuration in the ESXi VM editing mode. Use one of the following methods:
- In the vSphere client Configuration Parameters window (Edit Settings > VM Options > Advanced > Edit Configuration), add one row that sets efi.serialconsole.enabled to TRUE.
- Add efi.serialconsole.enabled = "TRUE" to the .vmx file once the VM is created.
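For the .vmx method, the following is a minimal shell sketch, not a supported tool. The function name and the way you locate the .vmx file are assumptions, and editing the file while the VM is powered off is assumed to be the safer choice.

```shell
# Hypothetical helper: make sure a .vmx file contains
#   efi.serialconsole.enabled = "TRUE"
# Assumption: the VM is powered off while the file is edited.
enable_efi_serial_console() {
    vmx_file=$1
    key='efi.serialconsole.enabled'
    key_re='efi\.serialconsole\.enabled'
    if grep -q "^${key_re}[[:space:]]*=" "$vmx_file"; then
        # Key already present: rewrite its value in place.
        sed "s|^${key_re}[[:space:]]*=.*|${key} = \"TRUE\"|" "$vmx_file" > "$vmx_file.tmp" &&
            mv "$vmx_file.tmp" "$vmx_file"
    else
        # Key missing: append it.
        printf '%s = "TRUE"\n' "$key" >> "$vmx_file"
    fi
}

# Usage (the path is a placeholder):
#   enable_efi_serial_console /path/to/your-vm.vmx
```

Running the helper twice leaves a single entry, so it can be repeated safely after redeploying a VM.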
The vCenter or UCS server connectivity is lost as soon as the Cisco Nexus 9000v is up
Caution: When connecting a vNIC to a vSwitch or bridge, an incorrect network connection might result in losing the connectivity to your hypervisor server or vCenter on ESXi.
The Cisco Nexus 9000v uses vNICs, entered from a graphical representation on ESXi, for networking either externally or internally within a hypervisor server.
The first NIC in the Cisco Nexus 9000v VM is always the management interface. Connect it directly to your lab LAN physical switch or vSwitch (VM Network). Don't connect any data port vNIC to a physical switch in a way that conflicts with your server management connectivity.
Cisco Nexus 9000v data port isn't passing traffic in the ESXi server
To ensure a smooth operation, specific configuration settings on the vSwitch must be enabled:
- Ensure that all instances of the vSwitch connecting to the Cisco Nexus 9000v are in "Promiscuous Mode" = "Accept", and pointing to the UCS server. You can access this option through "Configuration > Properties > Edit" from the vSphere Client.
- Ensure that all instances of vSwitch pass through all VLANs. You can access this option through "Configuration > Properties > Edit" from the vSphere Client.
The ESXi 8.0 hypervisor often defaults the network adapter type to “E1000E”, which isn’t supported on the Nexus 9000v platform. After deployment, make sure that all network adapter types are “E1000”.
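One way to spot a wrong adapter type without opening the GUI is to inspect the VM's .vmx file. The sketch below assumes that adapter types are recorded as ethernetN.virtualDev entries; adapters that have no such entry are not reported, so treat an empty result as a hint rather than proof. The function name is hypothetical.

```shell
# Hypothetical helper: print every ethernetN.virtualDev line in a .vmx file
# whose adapter type is not "e1000". Prints nothing if all listed adapters
# are E1000.
# Assumption: adapter types are stored as ethernetN.virtualDev = "<type>".
list_non_e1000_adapters() {
    vmx_file=$1
    grep -i '^ethernet[0-9][0-9]*\.virtualDev' "$vmx_file" |
        grep -iv '=[[:space:]]*"e1000"[[:space:]]*$' || true
}

# Usage (the path is a placeholder):
#   list_non_e1000_adapters /path/to/your-vm.vmx
```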
KVM/QEMU Hypervisor Issues
Understanding the KVM/QEMU command line options requires a basic Linux background. To deploy the Nexus 9000v in this hypervisor, follow the deployment instructions and pay attention to the following areas:
- Make sure that you use the bios.bin file that the user guide recommends.
- If the command line supports multiple disk inputs, check that the bootable disk is set to bootindex=1 so that the VM doesn't try to boot from other devices.
- If you're attempting to implement a complicated command line, first follow the basic KVM/QEMU deployment instructions to bring up a simple switch instance and verify the user environment.
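To make the bios.bin and bootindex=1 points concrete, the following sketch prints, but does not run, a minimal QEMU command line. The image path, the device IDs, the qcow2 format, and the choice of an AHCI controller are all assumptions for illustration; take the real values, along with memory, CPU, and network options, from the deployment guide.

```shell
# Hypothetical helper: print (not run) a minimal QEMU command line that
# shows where the BIOS file and the boot index go.
# Assumptions: qcow2 disk image, AHCI controller, made-up device IDs.
print_qemu_cmd() {
    bios_file=$1     # the bios.bin recommended by the user guide
    disk_image=$2    # bootable Nexus 9000v disk image
    printf '%s ' qemu-system-x86_64 \
        -bios "$bios_file" \
        -device ahci,id=ahci0 \
        -drive "file=$disk_image,if=none,id=disk0,format=qcow2" \
        -device ide-hd,drive=disk0,bus=ahci0.0,bootindex=1
    printf '\n'
}

# Usage (both file names are placeholders):
#   print_qemu_cmd bios.bin /path/to/nexus9000v.qcow2
```

Printing the command first makes it easy to confirm that only the intended disk carries bootindex=1 before more drives are added.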
Multicast on KVM or QEMU Hypervisor
The multicast feature on the Cisco Nexus 9000v is supported as broadcast. For this feature to work properly, disable IGMP multicast snooping in this environment on all bridge interfaces.
The following example shows how to disable multicast snooping on the vxlan_br1, vxlan_br2, vxlan_br3, and vxlan_br4 bridges from the Linux prompt:
echo 0 > /sys/devices/virtual/net/vxlan_br1/bridge/multicast_snooping
echo 0 > /sys/devices/virtual/net/vxlan_br2/bridge/multicast_snooping
echo 0 > /sys/devices/virtual/net/vxlan_br3/bridge/multicast_snooping
echo 0 > /sys/devices/virtual/net/vxlan_br4/bridge/multicast_snooping
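With more bridges, the same commands can be wrapped in a loop. The following is a sketch under the assumption that the bridges live under /sys/devices/virtual/net as in the example; the function name is hypothetical, and the sysfs directory is passed as a parameter only so that the loop can be tried against a scratch directory first.

```shell
# Disable IGMP snooping on each named bridge.
# $1 is the sysfs directory that holds the bridge devices
# (normally /sys/devices/virtual/net); the remaining arguments are bridges.
disable_multicast_snooping() {
    sysfs_net=$1
    shift
    for bridge in "$@"; do
        knob="$sysfs_net/$bridge/bridge/multicast_snooping"
        if [ -w "$knob" ]; then
            echo 0 > "$knob"
        else
            echo "cannot write $knob (missing bridge or not root?)" >&2
        fi
    done
}

# Usage (as root), matching the four bridges in the example:
#   disable_multicast_snooping /sys/devices/virtual/net \
#       vxlan_br1 vxlan_br2 vxlan_br3 vxlan_br4
```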
To pass L2 packets such as LLDP and LACP, follow the Linux bridge mask setup in the KVM/QEMU deployment guide.
Vagrant/VirtualBox Issues
Networking on VirtualBox/Vagrant
To use the dataplane interfaces on VirtualBox/Vagrant, ensure the following:
- The interfaces must be in "Promiscuous" mode.
- In the VirtualBox network settings, select "Allow All" for the Promiscuous mode.
- Use the show interface mac command to verify that all instances of the Cisco Nexus 9000v in your topology have unique MAC addresses.
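Once the MAC addresses from each switch have been collected into a text file, one per line, a duplicate check can be sketched as follows. The collection step is not shown, and the function name is hypothetical.

```shell
# Hypothetical helper: read one MAC address per line on stdin and print
# each address that appears more than once. Case and the ':' / '.' / '-'
# separators are normalized first so that different notations compare equal.
find_duplicate_macs() {
    tr 'A-F' 'a-f' | tr -d ':.-' | grep -v '^[[:space:]]*$' | sort | uniq -d
}

# Usage (the file name is a placeholder):
#   find_duplicate_macs < all_macs.txt
```

An empty result means no address was seen twice; any output line is a normalized MAC address that at least two interfaces share.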
The following is an example of a normal VM bootup on VirtualBox/Vagrant:
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Clearing any previously set forwarded ports...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
==> default: Forwarding ports...
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
The configured shell (config.ssh.shell) is invalid and unable
to properly execute commands. The most common cause for this is
using a shell that is unavailable on the system. Please verify
you're using the full path to the shell and that the shell is
executable by the SSH user.
After a successful normal bootup, use the vagrant ssh command to access the Nexus 9000v switch prompt.
The following is an example of one possible VM bootup failure:
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'base'...
==> default: Matching MAC address for NAT networking...
==> default: Setting the name of the VM: n9kv31_default_1575576865720_14975
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
==> default: Forwarding ports...
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
To troubleshoot this failure, check the following:
- Ensure that enough resources, such as memory and vCPU, are available. Close all applications that consume a significant amount of memory on your PC or server. Check the available free memory.
- Power down the VM by entering the vagrant halt -f command.
- Go to the VirtualBox GUI after powering down the VM. Enable the VM serial console through "Ports" -> "Enable Serial Port" to observe the bootup process and to view possible issues.
Alternatively, use the following VBox commands to enable the guest serial console. Find your VM name:
VBoxManage list vms
"n9kv_default_1575906706055_2646" {0b3480af-b9ac-47a4-9989-2f5e3bdf263f}
Then enable the serial console:
VBoxManage modifyvm n9kv_default_1575906706055_2646 --uart1 0x3F8 4
- Power up the VM again by entering “vagrant up” from the same terminal where you entered the original “vagrant up”.
- To access the serial console, enter “telnet localhost 2023” from another terminal on your computer.
- Check the bootup issue by observing the output from the serial console.
- Turn off the serial console if the guest serial console is no longer needed. Either use the following VBox command or go to the VirtualBox GUI setting and deselect “Enable Serial Port”.
VBoxManage modifyvm n9kv_default_1575906706055_2646 --uart1 off
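When several VMs are registered, picking the VM name out of the VBoxManage list vms output by hand is error-prone. The following sketch extracts only the names; the function name is hypothetical, and it assumes each output line has the "name" {uuid} shape shown in the steps above.

```shell
# Hypothetical helper: read `VBoxManage list vms` output on stdin and print
# only the VM names (the quoted part of each line, without the UUID).
vbox_vm_names() {
    sed -n 's/^"\(.*\)" {[0-9a-fA-F-]*}$/\1/p'
}

# Usage:
#   VBoxManage list vms | vbox_vm_names
```

The printed name can then be passed to the VBoxManage modifyvm commands that enable or turn off the serial console.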