Troubleshoot NCS 1010 Setup and Upgrade

Use the procedures in this section to troubleshoot NCS 1010 bring-up and software upgrade or downgrade issues. Each topic describes the problem, its probable cause, and the solution.

The following image shows the tasks involved in finding solutions to NCS 1010 setup and upgrade issues:

This section contains the following topics:

Recover NCS 1010 From Boot Failure

If the command line interface is not accessible, you can recover the NCS 1010 from a boot failure using one of these recovery methods.

Boot the NCS 1010 Using USB Drive

Problem:

After installing the hardware, you boot the NCS 1010 after connecting to the console port and powering ON the NCS 1010. The NCS 1010 initiates the boot process using the pre-installed operating system (OS) image. But the NCS 1010 fails to boot, times out or stops responding after the boot process initializes.

Cause:

The NCS 1010 does not boot if an install image is not present on the NCS 1010 or the image is corrupt.

Solution:

Boot the NCS 1010 using a bootable USB flash drive.

The bootable USB flash drive is used to reimage the NCS 1010 during system upgrade or boot the NCS 1010 in case of boot failure. During the USB boot process, the NCS 1010 is re-imaged with the version available on the USB flash drive.

To boot the NCS 1010 using a USB flash drive, you need the following devices:

  • A local machine (Windows, Linux, or Mac) with a USB Type-A port.

  • A USB flash drive with a storage capacity between 8 GB (minimum) and 32 GB (maximum). USB 2.0 and USB 3.0 are supported.


    Note


    USB Type-C is not supported.

Procedure


Step 1

Create a bootable USB flash drive from your local machine (Windows or Mac):

  1. Connect the USB flash drive to your local machine and format it with the File Allocation Table (FAT) 32 file system using the Windows operating system or the Apple Mac Disk Utility. Formatting the USB drive to FAT creates addressable sectors that ensure that each piece of information in the file can be found by the computer.

    After formatting the USB flash drive, right-click on the USB disk and view the properties.

  2. On the Software Download page, navigate to the required Cisco IOS XR product and release. The USB boot image is available as a compressed file in the format <platform>-usb-<version>.zip. For example, the USB boot image for Cisco NCS 1010 for release 24.3.1 is the 1010-x64-usb-24.3.1.zip file.

  3. Download the compressed USB boot image from the Software Download page to your host computer.

  4. Verify that the copy operation is successful. To verify, compare the file size shown on the Software Download page with the size of the copied file on your computer. You can also verify the MD5 checksum value. A matching value indicates that the copied file is valid and untampered.

  5. Unzip the file to extract the content of the compressed boot file inside the USB flash drive. This converts the USB flash drive to a bootable drive.

    Note

     
    The content of the zipped file (EFI and boot directories) should be extracted directly into the root of the USB flash drive. If the unzipping application places the extracted files in a new folder, move the EFI and boot directories to the root folder of the USB flash drive.
  6. Remove the USB flash drive from your computer.

    The USB flash drive is ready to be used as a bootable disk to install and boot the Cisco IOS XR image.
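
The checks in substeps 4 and 5 can also be scripted on the local machine. The following is a minimal Python sketch, not a Cisco-provided tool: the function names are hypothetical, the expected MD5 value is whatever the Software Download page lists for your image, and the USB root path depends on where your operating system mounts the drive.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_md5):
    """Compare the computed digest with the published value, ignoring case."""
    return md5_of_file(path) == expected_md5.strip().lower()


def missing_boot_dirs(usb_root, required=("EFI", "boot")):
    """Return the required directories that are absent from the USB root."""
    root = Path(usb_root)
    return [name for name in required if not (root / name).is_dir()]
```

For example, call checksum_matches() with the path of the downloaded zip file and the published MD5 value before extracting it. After extraction, an empty list from missing_boot_dirs() suggests that the EFI and boot directories are in the root of the drive as the note requires.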

Step 2

Boot the NCS 1010 using the bootable USB flash drive.

  1. Use this procedure only on the active RP; the standby RP must either be powered OFF or removed from the chassis. After the active RP is installed with images from the USB drive, insert or power ON the standby RP as appropriate.

  2. Connect to the console.

  3. Insert the USB flash drive in the USB Port Type-A on the NCS 1010.

    Ensure that the NCS 1010 is powered ON. When the bootable USB drive is plugged into an operational NCS 1010, the device is detected as disk2:. Verify this using the show media location all command.

    RP/0/RP0/CPU0:ios#show media location all
    Fri Jan 27 08:29:00.808 UTC
    
    Media Info for Location: node0_RP0_CPU0
    Partition         Size           Used          Percent        Avail
    --------------------------------------------------------------------
    rootfs:           54.4G          16.5G          30%           38G
    data:             77.3G          20.5G          27%           56.8G
    disk0:            3.9G           12M            1%            3.6G
    /var/lib/docker   6.6G           17M            1%            6.2G
    disk2:            15G            6.1G           42%           8.6G
    log:              5.3G           572M           12%           4.4G
    harddisk:         61G            19G            32%           39G
  4. View the contents of the USB drive.

    Example:

    RP/0/RP0/CPU0:ios#dir disk2:
  5. Initiate the reimage from the USB bootable drive.

    Example:

    RP/0/RP0/CPU0:ios#reload bootmedia usb noprompt
    RP/0/RP0/CPU0:ios#hw-module location all bootmedia usb

    Note

     
    If the NCS 1010 was powered OFF, power ON the NCS 1010. Press the Esc key continuously to pause the boot process and get the RP to the BIOS menu. Use the arrow key and navigate to the USB Flash Memory option in the Boot Manager menu, and press the Enter key. The BIOS GRUB automatically detects the image from the USB flash drive, starts the installation, and displays the progress of the installation operation.

    After the installation is complete, the NCS 1010 reboots with the version available on the USB drive and displays the prompt to configure the root username and password.


Boot the NCS 1010 Using iPXE

Problem:

You connect to the console port and power ON the NCS 1010. The NCS 1010 initiates the boot process using the pre-installed operating system (OS) image. But the NCS 1010 fails to boot, times out or stops responding after the boot process initializes.

Cause:

The NCS 1010 does not boot if an install image is not present on the NCS 1010 or the image is corrupt.

Solution:

Boot the NCS 1010 using the image from an iPXE server.

iPXE is a pre-boot execution environment that is included in the network card of the management interfaces and works at the system firmware (UEFI) level of the NCS 1010. It enables network boot for an NCS 1010 that is offline: the bootloader downloads and installs the ISO image located on an HTTP, FTP, or TFTP server, which re-images the NCS 1010. As a boot loader, iPXE also provides the flexibility to choose the image that the system boots based on the Platform Identifier (PID), the serial number, or the management MAC address. iPXE must be defined in the DHCP server configuration file.

Procedure


Step 1

Configure the DHCP server for IPv4, IPv6, or both communication protocols before you use the iPXE boot.

  1. Create the dhcpd.conf file in the /etc/ or /etc/dhcp directory. This configuration file stores network information such as the path to the script, the location of the ISO install file, the location of the provisioning configuration file, and the serial number and MAC address of the NCS 1010. The following example shows a sample dhcpd.conf file.

    Example:

    allow bootp;
    allow booting;
    ddns-update-style interim;
    option domain-name "cisco.com";
    option time-offset -8;
    ignore client-updates;
    default-lease-time 21600;
    max-lease-time 43200;
    option domain-name-servers <ip-address-server1>, <ip-address-server2>;
    log-facility local0;
     :
    subnet <subnet> netmask <netmask> {
      option routers <ip-address>;
      option subnet-mask <subnet-mask>;
      next-server <server-addr>;
    }
      :
    host <hostname> {
      hardware ethernet e4:c7:22:be:10:ba;
      fixed-address <address>;
      filename "http://<address>/<path>/<image.bin>";
    }
  2. Test the server once the DHCP server is running. For example, for IPv4 protocol:

  • Use the MAC address of the NCS 1010:

    Note

     
    Using the host statement provides a fixed address that is used for DNS. However, verify that option 77 (user-class) is set to iPXE in the request. This option is used to provide the boot file to the system when required.
    host <platform>
    {
      hardware ethernet <ncs1010-mac-address>;
      if exists user-class and option user-class = "iPXE" {
        filename = "http://<httpserver-address>/<path-to-image>/<image>";
      }
    }
    
    Ensure that the above configuration is successful.
  • Use the serial number of the NCS 1010:
    host <platform> 
    {
    option dhcp-client-identifier "<ncs1010-serial-number>";
      filename "http://<IP-address>/<path-to-image>/<image>";
      fixed-address <IP-address>;
    }
    The serial number of the NCS 1010 is derived from the BIOS and is used as an identifier.
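
When many chassis need host entries, the MAC-based stanza shown above can be generated rather than typed by hand. The following Python sketch is only an illustration under stated assumptions: render_ipxe_host and braces_balanced are hypothetical helper names, and the hostname and image URL in any call are placeholders you supply. The brace check is a rough guard against an unclosed host block and does not replace the DHCP server's own configuration test (for ISC dhcpd, typically dhcpd -t).

```python
def render_ipxe_host(hostname, mac_address, image_url):
    """Build a dhcpd.conf host entry that serves image_url to iPXE clients only."""
    lines = [
        f"host {hostname} {{",
        f"  hardware ethernet {mac_address};",
        '  if exists user-class and option user-class = "iPXE" {',
        f'    filename = "{image_url}";',
        "  }",
        "}",
    ]
    return "\n".join(lines)


def braces_balanced(text):
    """Return True when every '{' has a matching '}' in the right order."""
    depth = 0
    for char in text:
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

Running braces_balanced() over the full dhcpd.conf text before restarting the DHCP service catches the most common editing slip, a host or subnet block that is missing its closing brace.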

Step 2

Recover the NCS 1010 using iPXE boot.

  1. Connect to the console.

  2. Power ON the NCS 1010.

  3. Press the Esc key continuously to pause the boot process and get the RP to the BIOS menu.

  4. Use the arrow key and navigate to the Built-in EFI iPXE option in the Boot Manager menu, and press the Enter key.

    Example:

    iPXE> ifstat
    net0: 00:a0:c9:00:00:00 using i350-b on PCI01:00.0 (closed)
      [Link:up, TX:0 TXE:0 RX:0 RXE:0]
    net1: 00:a0:c9:00:00:01 using i350-b on PCI01:00.1 (closed)
      [Link:up, TX:0 TXE:0 RX:0 RXE:0]
    net2: 00:a0:c9:00:00:02 using i350-b on PCI01:00.2 (closed)
      [Link:down, TX:0 TXE:0 RX:0 RXE:0]
      [Link status: Down (http://ipxe.org/38086193)]
    net3: 00:a0:c9:00:00:03 using i350-b on PCI01:00.3 (closed)
      [Link:down, TX:0 TXE:0 RX:0 RXE:0]
      [Link status: Down (http://ipxe.org/38086193)]
    net4: 00:00:00:00:00:04 using dh8900cc on PCI02:00.1 (closed)
      [Link:down, TX:0 TXE:0 RX:0 RXE:0]
      [Link status: Down (http://ipxe.org/38086193)]
    net5: 00:00:00:00:00:05 using dh8900cc on PCI02:00.2 (closed)
      [Link:down, TX:0 TXE:0 RX:0 RXE:0]
      [Link status: Down (http://ipxe.org/38086193)]
    net6: 04:62:73:08:57:86 using dh8900cc on PCI02:00.3 (closed)
      [Link:up, TX:0 TXE:0 RX:0 RXE:0]
    
    iPXE> set net6/ip 192.0.2.255
    iPXE> set net6/netmask 255.255.255.0
    iPXE> set net6/gateway 10.48.42.1
    iPXE>
    iPXE> ifopen net6
    
    iPXE> ping 10.48.42.1
    64 bytes from 10.48.42.1: seq=1
    64 bytes from 10.48.42.1: seq=2
    Finished: Operation canceled (http://ipxe.org/0b072095)
    
  5. Boot the image using one of the following options:

    • Option 1: Boot with ISO image. After the reimage is successful, add optional RPMs, bug fixes and update running configuration file.

    • Option 2: [Preferred option] Boot with Golden ISO (GISO) image that contains the ISO image, optional RPMs, bug fixes and configuration file. Booting with GISO saves time by eliminating the need to update the files individually.
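
Before typing the set net6/ip, set net6/netmask, and set net6/gateway values into the iPXE shell as in substep 4, it can help to confirm that they describe a usable host address with a gateway on the same subnet. The following Python sketch uses the standard ipaddress module; the function name is hypothetical, and the check assumes an ordinary IPv4 subnet (it is not meant for /31 or /32 prefixes). The addresses in the iPXE example are illustrative, so substitute the values for your management network.

```python
import ipaddress


def check_static_settings(ip, netmask, gateway):
    """Return a list of problems found in a static IPv4 configuration."""
    problems = []
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    network = interface.network
    # A host must not use the network or broadcast address of its subnet.
    if interface.ip in (network.network_address, network.broadcast_address):
        problems.append(f"{ip} is the network or broadcast address of {network}")
    # The gateway is only reachable directly if it sits in the same subnet.
    if ipaddress.ip_address(gateway) not in network:
        problems.append(f"gateway {gateway} is outside {network}")
    return problems
```

An empty list means the three values are consistent with each other; it does not confirm that the gateway or the iPXE server is actually reachable, which the ping in substep 4 verifies.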

You must keep the standby RP in the BIOS while installing the image on the active RP.

BIOS Ver: 09.19 Date: xx/xx/xxxx 17:02:33

Press <DEL> or <ESC> to enter boot manager.
iPXE initialising devices...ok

iPXE 1.0.0+ (5fbe7) -- Open Source Network Boot Firmware -- http://ipxe.org
Features: DNS HTTP TFTP VLAN EFI ISO9660 NBI Menu
BootMode : 1
Trying net0...
net0: 00:00:01:1c:00:00 using i350-b on PCI01:00.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 00:00:01:1c:00:00).................. ok
net0: 203.0.113.1/255.255.255.0
net0: fe80::2a0:c9ff:fe00:0/64
net1: fe80::2a0:c9ff:fe00:1/64 (inaccessible)
net2: fe80::2a0:c9ff:fe00:2/64 (inaccessible)
net3: fe80::2a0:c9ff:fe00:3/64 (inaccessible)
net4: fe80::200:ff:fe00:4/64 (inaccessible)
net5: fe80::200:ff:fe00:5/64 (inaccessible)
net6: fe80::662:73ff:fe08:1dba/64 (inaccessible)
Next server: 203.0.113.17
Filename: http://203.0.113.15/system_image.iso
http://203.0.113.15/<image>... ok

The BIOS GRUB automatically detects the image from the iPXE server, starts the installation, and displays the progress of the installation operation. After the installation is complete, the NCS 1010 reboots and enters the prompt to configure the root username and password.

You can also boot the NCS 1010 from the iPXE server by using the hw-module location all bootmedia network reload command.

RP/0/RP0/CPU0:ios# hw-module location all bootmedia network reload
Wed Dec 23 15:29:57.376 UTC
Reload hardware module ? [no,yes]

This command configures the NCS 1010 to perform a network-based boot across all modules in the NCS 1010 before a restart. Upon reload, the NCS 1010 attempts to load the operating system image from the specified iPXE server.


Recover Password

Problem:

Unable to access the NCS 1010 due to incorrect login credentials.

Cause:

A root password is used to log in to the NCS 1010. If you forget this root password, you cannot access the NCS 1010.

Solution:

If you lose your admin and root user credentials, the NCS 1010 becomes inaccessible. The system can be recovered by reimaging the NCS 1010 using iPXE or USB boot. However, this approach is not scalable.

You can use the system recovery feature to recover the lost password.

With this feature, the system is recovered without the need to reimage the NCS 1010. The system is recovered to its initial state with the current running software. The installed software and SMUs are retained after the system is recovered. The process complies with the Cisco Product Security Baseline (PSB), under which user data is securely erased before the NCS 1010 is recovered. The following data generated at run time is erased:

  • XR and admin configuration including the password data

  • Cryptographic keys on the disk

  • Data on encrypted partition

  • Generated core files

  • SNMP interface index files

  • Third-party application (TPA) software and data

  • Files created by the user

Use the following procedure to recover the password on NCS 1010.


Note


This procedure is applicable only when you have already enabled the password recovery feature on your NCS 1010 using the system recovery command:
RP/0/RP0/CPU0:ios(config)#system recovery

Procedure


Step 1

Power ON the NCS 1010, and press the Esc key on the RP console to enter the BIOS GRUB menu.

This procedure must be executed on each RP individually on a modular system.

Step 2

Boot the standby RP. Press the Esc key to enter the GRUB (bootstrap program) menu.

Step 3

On the RP0 card console, select the IOS-XR-recovery option from the GRUB menu and press Enter.

Step 4

Select the IOS-XR-recovery option from the GRUB menu and press Enter on the card console when the Initiating IOS-XR System Recovery... message is displayed on the card console.

Note

 
Do not wait until the card reaches the Enter root-system username: prompt. If you reach this prompt, the card reloads automatically and exits the BIOS GRUB menu. The card boots up as active after the recovery process.

Step 5

On the RP card, create a new root user and password. Log in to the NCS 1010 using the new root username and password.

The NCS 1010 boots with the default configuration. Proceed with configuring the NCS 1010, or load a configuration from a backup file if you had already taken a backup. We recommend that you back up data and save the configuration on an external server.

Ensure that you see this message in the RP console. If this message is not displayed, then repeat the process from step 1 to step 5 until you see the message:

RP/0/RP1/CPU0:June 10 06:13:24.551 CEST: sys_rec[1188]: %SECURITY-SYSTEM_RECOVERY-1-REPORT : 
System Recovery at 06:10:19 CEST Fri June 10 2022 was successful

RP/0/RP1/CPU0:June 10 06:15:13.967 CEST: sys_rec[1188]: %SECURITY-SYSTEM_RECOVERY-1-REPORT : 
System Recovery 

The password recovery procedure is complete.
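
If the console session is logged to a file, the check for the success message can be automated. The following Python sketch assumes only the message format shown in the sample above; the function name is hypothetical, and how you capture the console text (terminal logging, console server, and so on) is outside its scope.

```python
import re

# Matches the report line once the wrapped console text is joined into one line.
RECOVERY_OK = re.compile(
    r"%SECURITY-SYSTEM_RECOVERY-1-REPORT\s*:\s*System Recovery at .*? was successful"
)


def recovery_succeeded(console_log):
    """Return True if the captured console text reports a successful recovery."""
    # The syslog message wraps across lines on the console, so collapse
    # all runs of whitespace (including newlines) into single spaces first.
    flattened = " ".join(console_log.split())
    return RECOVERY_OK.search(flattened) is not None
```

A False result corresponds to the case described above where the message is not displayed, in which the procedure is repeated from step 1.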

The option to recover the system using console port is disabled on bootup because all the previous configurations are erased. With this configuration disabled, if you select IOS-XR-recovery option from GRUB menu to recover the system, the recovery is skipped. Enable the password recovery feature again using the system recovery command.


Rectify Insufficient Disk Space When Installing Software

Problem:

The software installation terminates with the error Error on 0/1/CPU0: Insufficient disk space to install packages.

Cause:

To install the Cisco IOS XR software, sufficient unused disk space must be available on the NCS 1010. If the required space is not available before you install the software, the installation process terminates with this error.

Solution:

Identify the required disk space using the show install log or install add command.

View the space consumed by the harddisk: location using the show media location all command.

RP/0/RP0/CPU0:ios#show media location all
Wed Jan 8 08:29:00.808 UTC

Media Info for Location: node0_RP0_CPU0
Partition                            Size     Used  Percent    Avail
--------------------------------------------------------------------
rootfs:                             54.4G    16.5G      30%      38G
data:                               77.3G    20.5G      27%    56.8G
disk0:                               3.9G      12M       1%     3.6G
/var/lib/docker                      6.6G      17M       1%     6.2G
disk2:                                15G     6.1G      42%     8.6G
log:                                 5.3G     572M      12%     4.4G
harddisk:                             61G      19G      32%      39G

Media Info for Location: node0_RP1_CPU0
Partition                            Size     Used  Percent    Avail
--------------------------------------------------------------------
rootfs:                             54.3G    16.5G      30%    37.9G
data:                               77.4G    46.1G      60%    31.4G
disk0:                               3.9G     8.5M       1%     3.6G
/var/lib/docker                      6.6G      19M       1%     6.2G
log:                                 5.3G     492M      10%     4.5G
harddisk:                             61G      44G      78%      14G

Media Info for Location: node0_0_CPU0
Partition                            Size     Used  Percent    Avail
--------------------------------------------------------------------
rootfs:                             54.4G    10.1G      18%    44.4G
data:                               77.3G     1.9G       2%    75.5G
/var/lib/docker                      6.6G      16M       1%     6.2G
disk0:                               3.9G     8.2M       1%     3.6G
harddisk:                             61G     109M       1%      57G
log:                                 5.3G     372M       8%     4.6G
          
Media Info for Location: node0_6_CPU0
Partition                            Size     Used  Percent    Avail
--------------------------------------------------------------------
rootfs:                             54.4G    10.1G      18%    44.4G
data:                               77.3G     1.9G       2%    75.4G
disk0:                               3.9G     8.3M       1%     3.6G
/var/lib/docker                      6.6G      16M       1%     6.2G
harddisk:                             61G     154M       1%      57G
log:                                 5.3G     374M       8%     4.6G
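
On a system with several locations, the available harddisk: space per location can be extracted from saved show media location all output. The following Python sketch is an assumption-laden illustration rather than a supported parser: it relies on the five-column layout shown above, the function names are hypothetical, and it treats the K, M, G, and T suffixes as binary multiples, which the output itself does not state.

```python
import re

# Conversion factors to gigabytes, assuming binary multiples.
UNITS = {"K": 1 / (1024 * 1024), "M": 1 / 1024, "G": 1.0, "T": 1024.0}


def to_gigabytes(text):
    """Convert a size such as '572M' or '56.8G' to gigabytes."""
    match = re.fullmatch(r"([\d.]+)([KMGT])", text)
    if not match:
        raise ValueError(f"unrecognized size: {text}")
    return float(match.group(1)) * UNITS[match.group(2)]


def available_space(show_media_output, partition="harddisk:"):
    """Map each location to the available space (GB) on the given partition."""
    result = {}
    location = None
    for raw_line in show_media_output.splitlines():
        line = raw_line.strip()
        fields = line.split()
        if line.startswith("Media Info for Location:"):
            location = fields[-1]
        elif location and len(fields) == 5 and fields[0] == partition:
            # Columns: Partition, Size, Used, Percent, Avail
            result[location] = to_gigabytes(fields[4])
    return result
```

Comparing each value with the space requirement reported by the install commands shows which location needs cleanup first; in the sample output, node0_RP1_CPU0 has the least room on harddisk:.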

Use the following procedure to free up the disk space to make room for the software installation.

Procedure


Step 1

Remove inactive packages from the system.

Example:

View the inactive packages:
RP/0/RP0/CPU0:ios(admin)#show install inactive
6 inactive package(s) found:
    ncs5500-xr-6.6.1
    ncs5500-k9sec-3.1.0.0-r661
    ncs5500-mpls-2.1.0.0-r661
    ncs5500-isis-2.1.0.0-r661
    ncs5500-mcast-2.1.0.0-r661
    ncs5500-mgbl-3.0.0.0-r661
Remove the inactive packages:
RP/0/RP0/CPU0:ios(admin)#install remove inactive all synchronous
   instdir[198]: %INSTALL-INSTMGR-6-INSTALL_OPERATION_STARTED : 
Install operation 8 '(admin) install remove inactive all' started by user 'user_b' 
Install operation 8 '(admin) install remove inactive all' started by user 'user_b' at
   09:25:41 UTC Fri June 10
Info:     This operation will remove the following package:
ncs5500-xr-6.6.1
    ncs5500-k9sec-3.1.0.0-r661
    ncs5500-mpls-2.1.0.0-r661
    ncs5500-isis-2.1.0.0-r661
    ncs5500-mcast-2.1.0.0-r661
    ncs5500-mgbl-3.0.0.0-r661
Proceed with removing these packages? [confirm]
The install operation will continue synchronously.

Step 2

Remove stale or unnecessary files from the harddisk: location, such as cores, debug logs, kdump, and showtech data. We recommend that you do not remove files from other partitions because these locations may contain files that are relevant to collecting debug information. Carefully inspect the files to be deleted.

Example:

RP/0/RP0/CPU0:ios#rmdir harddisk:
Remove directory filename []?newdir
Delete harddisk:/newdir[confirm]y

Use the delete command to remove specific files or directories. When a directory contains files such as images, bug fixes, or configuration files, you must remove the files before deleting the directory.

RP/0/RP0/CPU0:ios#delete harddisk:/file

Verify that the unwanted directory is removed from the harddisk.

RP/0/RP0/CPU0:ios#dir harddisk:
	Directory of harddisk:
	37146       drwx  4096        Sun Dec 14 15:30:48 2008  malloc_dump
	43030       drwx  4096        Wed Dec 24 11:20:52 2008  tracebacks
	43035       drwx  4096        Thu Jan  8 18:59:18 2009  sau
	51026       drwx  4096        Sat Dec 27 02:52:46 2008  tempA
	51027       drwx  4096        Sat Dec 27 02:04:10 2008  dir.not.del
	-430307552  -rwx  342         Fri Jan 16 10:47:38 2009  running-config
	-430305504  -rwx  39790       Mon Jan 26 23:45:56 2009  cf.dat
39929724928 bytes total (39883235328 bytes free)

Recover Frozen Console Prompt

Problem:

The console access is frozen and does not respond. In this state, no output or input characters are displayed on the console.

Cause:

The Priority Flow Control (PFC) functionality is enabled on the console by default. PFC, also referred to as Class-based Flow Control (CBFC) or Per Priority Pause (PPP), is a mechanism that prevents frame loss due to congestion. Pressing the Ctrl + S keys activates flow control, and no output is seen on the XR console until it is resumed.

Solution:

Reset the console prompt.

Procedure


Press the Ctrl + Q keys to resume the console output.