Revision History
Revision | Date | Description |
---|---|---|
A0 | December 09, 2024 | Created release notes for 1.0.0.240001. |
Overview of the Cisco UCS C885A M8 Rack Server
The Cisco UCS C885A M8 Rack Server is a dense-GPU server designed to deliver massive, scalable accelerated compute capabilities to address the most demanding AI workloads, including deep learning/Large Language Model (LLM) training, model fine-tuning, large model inferencing, and Retrieval-Augmented Generation (RAG).
To deliver massive accelerated compute performance in a single server, the server offers a choice of eight GPUs of the following types:
- NVIDIA® H100 or NVIDIA® H200 SXM (Server PCI Express Module) GPUs. SXM is a socket-based GPU interconnect method used by NVIDIA GPUs.
- AMD MI300X OCP Accelerator Module (OAM) GPUs. OAM is an Open Compute GPU interconnect standard that avoids GPU vendor lock-in.
For north-south traffic, the server also supports one NVIDIA BlueField-3 B3220 DPU to scale AI model training across a cluster of dense-GPU servers. Up to eight NVIDIA ConnectX-7 or BlueField-3 B3140H SuperNICs are supported for east-west traffic between GPUs.
Introduction
The Cisco Baseboard Management Controller (Cisco BMC) web GUI is HTML5-based and secured with SSL (HTTPS). It helps you manage the Cisco UCS C885A M8 Rack Server using the following options:
Hardware and Component Management
The Inventory feature enables administrators to record hardware devices and components on each server, such as central processing units (CPUs), memory modules, hard drives, network cards, and more.
Status and Checks
The Inventory feature also provides status and checks for hardware and software devices. This information can include device health status, temperature, voltage, connection status, and more.
Supported Platforms
The following servers are supported in release 1.0.0.240001:
- Cisco UCS C885A M8 Rack Server
Operating System and Browser Requirements
Cisco recommends the following browsers:
Recommended Browser | Version Tested | Minimum Recommended Operating System |
---|---|---|
Mozilla Firefox | 132.0.2 (AArch64) | macOS 15.1 (24B83) |
Mozilla Firefox | 132.0 (64-bit) | Ubuntu 20.04.3 LTS |
Mozilla Firefox | 132.0.2 (64-bit) | Microsoft Windows 11 Enterprise |
Apple Safari | Version 18.1 (20619.2.8.11.10) | macOS 15.1 (24B83) |
Google Chrome | 131.0.6778.71 (64-bit) | Microsoft Windows 11 Enterprise |
Microsoft Edge | 131.0.2903.51 (64-bit) | Microsoft Windows 11 Enterprise |
Default Ports
Following is a list of server ports and their default port numbers:
Port Name | Port Number |
---|---|
HTTP | 80 |
HTTPS | 443 |
SSH | 22 |
SSH (SSH based SOL) | 2200 |
IPMI | 623 |
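As a quick way to confirm that the default ports listed above are reachable from a management workstation, the following is a minimal sketch using only the Python standard library. The function names and the `DEFAULT_TCP_PORTS` mapping are illustrative and not part of any Cisco tool. IPMI (623) is left out on the assumption that IPMI over LAN uses UDP, which a plain TCP connection attempt cannot probe.

```python
import socket

# Default TCP ports taken from the table above. IPMI (623) is omitted
# because IPMI over LAN normally runs over UDP, which a TCP connect
# attempt cannot verify.
DEFAULT_TCP_PORTS = {
    "HTTP": 80,
    "HTTPS": 443,
    "SSH": 22,
    "SSH (SSH based SOL)": 2200,
}


def is_tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_default_ports(host, ports=DEFAULT_TCP_PORTS, timeout=2.0):
    """Map each port name to True/False depending on TCP reachability."""
    return {
        name: is_tcp_port_open(host, port, timeout)
        for name, port in ports.items()
    }
```

Calling `check_default_ports("192.0.2.10")` (a placeholder address; substitute the BMC management IP) returns a dictionary keyed by port name. A `False` entry only means the TCP connection did not succeed from that workstation; it does not distinguish a disabled service from a firewall rule or a changed port number.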
Firmware Files
The 1.0.0.240001 software release includes the following software files:
CCO Software Type | File Name |
---|---|
Firmware for OOB (BMC, BIOS, GPU, and FPGA) and in-band (ConnectX-7, BlueField, and OCP) components | ucs-c885a-m8-1.0.0.240001.tar.gz |
Upgrade script and Readme file | ucs-c885a-m8-upgrade-script-v1.0.tar.gz |
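Before running anything from a downloaded archive, it can help to list its contents without extracting it, for example to locate the Readme file. The following is a minimal sketch using Python's standard `tarfile` module. The helper name `list_archive_members` is illustrative, and the sketch assumes the files are gzip-compressed tar archives, as the `.tar.gz` extension suggests. The internal layout of the Cisco archives is not described in these notes.

```python
import tarfile


def list_archive_members(archive_path):
    """Return the member names of a gzip-compressed tar archive.

    The archive is only read, never extracted.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()
```

For example, `list_archive_members("ucs-c885a-m8-upgrade-script-v1.0.tar.gz")` returns the paths stored in the upgrade script archive, assuming the file has been downloaded to the current directory.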
Open Caveats
Open Caveats in Release 1.0.0.240001
The following defects are open in Release 1.0.0.240001:
Defect ID | Symptom | Workaround | First Affected Release |
---|---|---|---|
CSCwn01691 | Certain BMC configurations, such as NTP, LDAP, and Power Cap, revert to their default disabled state following a BMC update. | There is no known workaround. | 1.0.0.240001 |
CSCwn34288 | GPU firmware updates for both AMD and NVIDIA models may fail across all servers when attempted through the Cisco IMC Web UI or Redfish API. | Update GPU firmware using the host-based utility provided by the GPU manufacturer. | 1.0.0.240001 |
Known Behaviors and Limitations
Known Behaviors and Limitations in Release 1.0.0.240001
The following caveats are known limitations in release 1.0.0.240001:
Defect ID | Symptom | Workaround | First Affected Release |
---|---|---|---|
CSCwn16450 | On the H100 variant of the Cisco UCS C885A M8 Rack Server, updating the HIB FPGA firmware may sometimes result in the host failing to power on due to a power unit failure. This issue has been observed with certain firmware updates. | Perform an AC power cycle to ensure the activation of the FPGA and GPU firmware updates. | 1.0.0.240001 |