Caveated Support for VMware CPU Reservations and Distributed Resource Scheduler



    NOTE: Caveated support for VMware CPU reservations and Distributed Resource Scheduler is limited to the following applications:
    • Cisco Unified Communications Manager (CUCM) 11.5(1) or greater
    • Cisco Unified Communications Manager - IM & Presence (CUCM IM&P) 11.5(1) or greater
    • Cisco Unity Connection (CUC) 11.5(1) or greater

    NOTE: Support for a deployment leveraging this CPU Reservations policy is granted only following the successful review of a detailed design, including the submission of sizing results from the Cisco Collaboration Sizing Tool and VM placement from Cisco VM Placement Tool. For more information about these tools and the sizing of Unified CM clusters, see the Cisco Collaboration Solution Reference Design Guides. Customers who wish to pursue such a deployment must engage either their Cisco Account Team, Cisco Advanced Services, or their certified Cisco Partner to submit the design review request. CPU Reservations design reviews leverage the same team and process used for UCM megaclusters.


General Support Caveats


  • Recall that Cisco Collaboration workloads are resource-intensive and latency-sensitive because they process real-time signaling and media for voice, video, messaging and other forms of live communication. Compared to traditional workloads, this introduces restrictions on design, deployment, management and operations to ensure the real-time software performs properly and stably under load.
  • Using VMware's terminology (see vSphere Resource Management Guide in Technical References), Cisco is defining "Reservations", not "Limits" or "Shares" (which are not used or supported).
  • Caveated support for CPU Reservations is intended for virtualized Collaboration installed-base customers with datacenter-centric deployments, skilled/experienced VMware/compute admins and extensive baselining of their deployment's CPU utilization (during both steady-state and spike situations such as group bootup, BHCA, upgrades, backups and CDR writes). It is not recommended for greenfield deployments (where no baseline exists), deployments whose admins have low VMware/compute skillsets, or deployments of Business Edition 6000/7000 appliances (which are not tested with CPU Reservations, and for which any post-sales hardware changes, if ever needed, would not be supported).
  • NOTE: Cisco does NOT test VMware CPU Reservations. Cisco Business Edition 6000/7000 appliances and UC on UCS TRCs are only tested with the "1vcpu:1pcore" sizing approach as described in the Collaboration Virtualization Sizing. Customers own any required testing of CPU Reservations in their environment.
  • NOTE: Cisco does NOT provide prescriptive guidance for CPU Reservations other than the Required CPU Reservation described later in this policy. Cisco Collaboration Sizing Tool and Virtual Machine Placement Tool are still expressed in terms of fixed-configuration VMs using the "1vcpu:1pcore" sizing approach described in the Collaboration Virtualization Sizing.
  • For minimum required CPU pcore speeds, this policy affects only the supported applications listed above. The required minimum CPU pcore speed for all other Collaboration apps is UNCHANGED.
  • Caveated support for CPU Reservations does NOT relax sizing rules for other hardware such as Memory, Storage GB / latency / IOPS, network access, etc. Be careful not to use CPU Reservations to place so many VMs on a host that one of these other resources becomes overcommitted.
  • Conservative designs are encouraged. Resist the temptation to over-consolidate VMs, which makes change management more complex, creates single points of failure and thwarts redundancy.
  • When using CPU reservations with these applications, TAC support for committed real-time performance is only achieved when all VMs are at the Required CPU Reservation, Cisco Collaboration Sizing Tool rules are followed and hardware meets TRC requirements. If ANY are not met, the deployment is treated as Specs-based (see TAC support clarifications in TAC TechNote #115955).


Required Physical Hardware


  • Physical CPU models must be on the supported specs-based list and are restricted to Intel Xeon E5v1 or later and Intel Xeon E7v2 or later on that list.
  • Physical CPU speed ("Base Frequency") must be 2.50 GHz or higher. For UCM and IM&P only, if a slower Base Frequency is desired, the customer must work with their channel partner, advanced services team, or account team to model their deployment in Cisco Collaboration Sizing Tool. If the "Call Processing Capacity Utilized per Call Processing VM" on the "Solution Sizing Summary" is ≤ 40%, then for supported application VMs only, the Base Frequency may be as low as 2.00 GHz. Customers who choose to run production on a slower base frequency should frequently re-assess their "as-built" deployment vs. "Call Processing Capacity Utilized per Call Processing VM" (a small checking sketch follows this list).

  • NOTE: Cisco TAC has seen variance in physical CPU speeds displayed on ESXi and/or reported to guest OS down to as low as 99.75% of Intel's advertised frequency value. E.g. a CPU with advertised 2.40 GHz base frequency may appear in admin GUIs as 2.394 GHz per pcore. This is normal and expected behavior, not a hardware problem.
  • Hyperthreading and Logical Cores continue to NOT be supported for these applications. Recall hyperthreading does not increase usable capacity (see also VMware Resource Management Guide in Technical References).
  • CPU Affinity continues to NOT be supported for these applications, as it can interfere with CPU Reservations operations (see also VMware Resource Management Guide in Technical References).
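  The following is a minimal, hypothetical sketch (in Python; not Cisco or VMware tooling) of how a deployment team might pre-check reported pcore base frequencies against the minimums above while tolerating the ~99.75% reporting variance noted by TAC. The function and parameter names are illustrative assumptions, not part of any Cisco tool.

      from typing import Optional

      REPORTING_VARIANCE = 0.9975   # ESXi/guest OS may report as low as 99.75% of the advertised frequency
      STANDARD_MIN_GHZ = 2.50       # default minimum base frequency under this policy
      REDUCED_MIN_GHZ = 2.00        # UCM/IM&P only, when the Sizing Tool shows <= 40% per call-processing VM

      def meets_min_base_frequency(reported_ghz: float,
                                   app: str,
                                   capacity_utilized_pct: Optional[float] = None) -> bool:
          """Return True if a reported pcore base frequency satisfies the policy minimum."""
          minimum = STANDARD_MIN_GHZ
          # UCM / IM&P may use the reduced minimum only with a vetted Sizing Tool result.
          if app in ("UCM", "IM&P") and capacity_utilized_pct is not None \
                  and capacity_utilized_pct <= 40.0:
              minimum = REDUCED_MIN_GHZ
          # Tolerate the normal downward reporting variance (e.g. 2.40 GHz shown as 2.394 GHz).
          return reported_ghz >= minimum * REPORTING_VARIANCE

      # A 2.394 GHz reading (advertised 2.40 GHz) fails the 2.50 GHz default,
      # but passes the reduced 2.00 GHz minimum for a lightly loaded UCM deployment.
      print(meets_min_base_frequency(2.394, "UCM"))                              # False
      print(meets_min_base_frequency(2.394, "UCM", capacity_utilized_pct=38.0))  # True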


Maximum GHz capacity of an ESXi host / physical server


  • To identify the usable GHz capacity of a physical server acting as an ESXi host, look in the vSphere ESXi Resource Allocation screen, CPU section, Total Capacity, and take 90% of that number (reflecting the VMware recommendation to leave 10% per host unreserved - see vSphere ESXi 6.0 Resource Management Guide in Technical References). E.g. a physical server with either no VMs deployed or all VMs powered off that reports 46.852 GHz of Total Capacity has 90% = 42.166 GHz usable for VMs (a short calculation sketch follows the table below).

  • NOTE: Total Capacity is hardware and ESXi version dependent, and can vary by physical hardware specs, by ESXi version and between physical servers of otherwise identical configurations. CPU Reservations are therefore not recommended for greenfield deployments where no baseline of the supported applications' CPU utilization exists. For brownfield / installed base deployments, if you plan to upgrade ESXi version or change hardware, be aware that Total Capacity could reduce, and this cannot be predicted, controlled or prevented by Cisco. Conservative designs are encouraged.
  • Here are some Total Capacity examples from Cisco's labs (Total Capacity in your deployment will likely vary from below, even if using identical hardware).

    Hardware Configuration (with Physical CPU Configuration and Intel Advertised Base Frequency) | VMware vSphere ESXi version | Total Capacity before adding any VMs | Notes
    Hyperflex cluster of five HX240c M4SX nodes, each with 2S / 14C / 2.60 GHz | 6.0 (per HX cluster node) | 325.57 GHz (for entire HX cluster of 5 HX nodes) | All VMs powered off except HXDP Controller VMs (1 per node).
    UCS C240 M4SX with 2S / 10C / 2.60 GHz | 5.5.2.2 | 46.852 GHz | All VMs powered off.
    UCS C220 M4S with 2S / 14C / 2.0 GHz | 6.0.1.1 | 50.134 GHz | All VMs powered off.
    UCS C220 M4S with 2S / 8C / 2.40 GHz | 5.5.0 | 34.92 GHz | All VMs powered off.
    UCS C220 M4S with 1S / 8C / 2.40 GHz | 6.0.1 | 16.303 GHz | All VMs powered off.
    UCS C220 M4S with 1S / 8C / 2.40 GHz | 5.5.0 | 16.248 GHz | All VMs powered off.
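  Below is a minimal sketch (Python, illustrative only) of the usable-capacity calculation described above: 90% of the Total Capacity reported by the ESXi host, using the UCS C240 M4SX lab figure from the table as the input.

      UNRESERVED_FRACTION = 0.10  # leave 10% of Total Capacity unreserved per host (VMware recommendation)

      def usable_ghz(total_capacity_ghz: float) -> float:
          """Usable GHz for CPU Reservations = 90% of the host's reported Total Capacity."""
          return total_capacity_ghz * (1.0 - UNRESERVED_FRACTION)

      print(usable_ghz(46.852))   # ~42.1668 GHz, quoted as 42.166 GHz in the example above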


Supported Applications and Required CPU Reservations


  • The Required CPU Reservation is intended to be as close as possible to what Cisco actually tests for VM placements on TRC hardware with 2.50+ GHz pcores using 1 vcpu to 1 pcore and other rules at Collaboration Virtualization Sizing. The closer the deployment is to TRC hardware and VM placement based on 1vcpu to 1 pcore of a minimum base frequency, the lower the risk of unexpected real-time performance issues requiring resolution via hardware change, VM placement change or CPU reservation change.
  • Required minimum versions: UCM must be release 11.5(1) or greater. IM&P must be release 11.5(1) or greater. Unity Connection must be release 11.5(1) or greater. VMware vSphere ESXi must be 5.5 or greater.
  • Deployments must continue to use the fixed VM configurations deployed from the Cisco-provided OVAs for supported apps. Creating VM configurations "from scratch" for supported apps continues to be NOT supported. Note VMs from the Cisco-provided OVAs define small default CPU reservation values that are only used to ensure VMs will boot up.
  • See the table below for supported VM configurations:

    VM Configuration | Requirement for "1vcpu:1pcore" | Supported for CPU Reservations?
    UCM 11.5, 150 user VM | 2vcpu, 2C / 2.0 GHz | Not supported
    UCM 11.5, 1K user VM | 2vcpu, 2C / 2.0 GHz | Supported
    UCM 11.5, 2.5K user VM | 1vcpu, 1C / 2.5 GHz | Not supported
    UCM 11.5, 7.5K user VM | 2vcpu, 2C / 2.5 GHz | Supported
    UCM 11.5, 10K user VM | 4vcpu, 4C / 2.5 GHz | Supported

    IM&P 11.5, 150 user VM | 1vcpu, 1C / 2.0 GHz | Not supported
    IM&P 11.5, 1K user VM | 1vcpu, 1C / 2.0 GHz | Supported
    IM&P 11.5, 5K user VM | 2vcpu, 2C / 2.5 GHz | Supported
    IM&P 11.5, 15K user VM | 4vcpu, 4C / 2.5 GHz | Supported
    IM&P 11.5, 25K user VM | 6vcpu, 6C / 2.8 GHz | Supported

    CUC 11.5, 200 user VM | 1vcpu, 1C / 1.8 GHz | Not supported
    CUC 11.5, 1K user VM | 1vcpu, 1C / 2.0 GHz | Supported
    CUC 11.5, 1K user VM | 2vcpu, 2C / 2.0 GHz | Supported
    CUC 11.5, 5K user VM | 2vcpu, 2C / 2.5 GHz | Supported
    CUC 11.5, 10K user VM | 4vcpu, 4C / 2.5 GHz | Supported
    CUC 11.5, 20K user VM | 7vcpu, 7C / 2.5 GHz | Supported

  • For VMs of applications that permit this CPU reservation policy, the Required CPU Reservation can be calculated as follows (a small calculation sketch follows this list):
      (VM's vcpu count) * ((Customer-observed Total Capacity * 90%) / (pcore count))
      Where pcore count is the total number of pcores on the physical server. E.g. for an ESXi host with 2S / 10C / 2.50 GHz, the pcore count would be 20.
  • For unsupported VMs, or applications that do not permit this CPU reservation policy, CPU Reservation values in the Cisco-provided OVA are only used to ensure VMs will boot up; deployments with these VMs may only use the "1vcpu:1pcore" sizing approach described in Collaboration Virtualization Sizing.
  • Increasing CPU Reservation beyond the Required CPU Reservation is allowed, but note that solely increasing CPU Reservation for a VM of a supported app does NOT increase application capacity (i.e. density per VM or scale per UCM cluster) beyond what is rated for that VM configuration. E.g. Cisco Collaboration Sizing Tool limits for a UCM 7.5K user VM would not change with an increased CPU reservation, but would change with a migration to a UCM 10K user VM and a revised CPU reservation.
  • For UCM and IM&P only, decreasing CPU Reservation below the Required CPU Reservation is allowed, but with these caveats:
    • Customers should only do this if they have good historical baselines of CPU utilization for supported applications during both steady-state and spike situations (such as group bootup, BHCA, upgrades, backups and CDR writes). If baselines are incomplete, decreasing CPU Reservation below the Required value is not recommended.
    • A slower physical CPU core speed may be required - see policy rules at Required Physical Hardware and Max GHz capacity of an ESXi host / physical server.
    • TAC support for committed real-time performance is affected - see General Caveats.
    • To debug or fix a real-time performance or other issue, TAC may require problem reproduction at Required CPU Reservation. Issues resolved by changing the reservation will not be treated as bugs.
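  A minimal sketch (Python; names are illustrative) of the Required CPU Reservation formula above. Plug in the Total Capacity that your own ESXi host reports (it varies by hardware and ESXi version) and the host's total pcore count.

      def required_cpu_reservation_ghz(vcpu_count: int,
                                       observed_total_capacity_ghz: float,
                                       host_pcore_count: int) -> float:
          """Required CPU Reservation = vcpus * (90% of observed Total Capacity / host pcore count)."""
          per_pcore_ghz = (observed_total_capacity_ghz * 0.90) / host_pcore_count
          return vcpu_count * per_pcore_ghz

      # E.g. a 2-vcpu VM of a supported app on a host reporting 26.93 GHz Total Capacity
      # across 12 pcores (the values used in the Examples section later in this policy):
      print(round(required_cpu_reservation_ghz(2, 26.93, 12), 4))   # 4.0395 GHz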


Capacity Planning


  • All VMs on a host, Collaboration and non-Collaboration, must use the same CPU sizing approach: either all VMware CPU Reservations or all "1vcpu:1pcore" as defined in Collaboration Virtualization Sizing. The two approaches cannot be mixed.
  • For an ESXi host running latency-sensitive workloads such as Cisco Collaboration, VMware recommends:
    • Overprovisioning physical CPUs to provide extra pcores (see "Best Practices" in Deploying Extremely Latency-Sensitive Applications in vSphere 5.5 in Technical References).
    • Keeping total vcpu count as 1 less than the ESXi host's total pcore count to minimize contention (see "VM Settings" in Best Practices for Performance Tuning of Latency-Sensitive Workloads in vSphere VMs in Technical References).
    • Leaving 10% of Total Capacity GHz unreserved when adding VMs (see vSphere ESXi 6.0 Resource Management Guide in Technical References).
  • Deployments must never overcommit a host's usable GHz capacity. I.e. sum of CPU Reservations must never exceed the host's Total Capacity.
  • Cisco recommends NOT overcommitting physical CPUs - the total vcpu count should not exceed the ESXi host's total pcore count even when using CPU Reservations. As the ratio of vcpus to pcores increases (particularly for small pcore counts), the chance of contention and performance problems (causing symptoms like jitter or instability) increases whenever one or more VMs experience high application load. Regardless of CPU Reservations vs. Total Capacity reported by ESXi (a small planning-check sketch follows this list):
    • For UCM and IM&P VMs only, deployments with a vcpu count of more than 1x but less than 3x the pcore count may be considered, but are marginal and not recommended by Cisco or VMware. If problems are encountered, TAC may require problem reproduction where vcpu count is less than or equal to pcore count, and issues thereby resolved will not be treated as bugs.
    • Deployments with vcpu count of 3x the pcore count or higher are not supported regardless of CPU reservation value.
  • Support policy for use of the "Latency Sensitivity" setting in ESXi 5.5+ with Collaboration applications is unchanged when using CPU Reservations (see Collaboration Virtualization Sizing for which Collaboration workloads must use vs. must not use this setting).
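  The following is a hypothetical planning check (Python; not a Cisco tool) combining the rules above: classify a host's vcpu:pcore ratio and confirm the sum of CPU Reservations stays within the host's usable GHz (90% of Total Capacity, per the headroom recommendation; the hard limit is the full Total Capacity). All names are illustrative assumptions.

      def vcpu_ratio_status(total_vcpus: int, host_pcores: int) -> str:
          """'recommended' (<= 1:1), 'marginal' (UCM/IM&P only, < 3:1) or 'unsupported' (>= 3:1)."""
          ratio = total_vcpus / host_pcores
          if ratio <= 1.0:
              return "recommended"
          if ratio < 3.0:
              return "marginal (UCM/IM&P only, not recommended)"
          return "unsupported"

      def reservations_fit(reservations_ghz: list, total_capacity_ghz: float) -> bool:
          """Check the sum of CPU Reservations against 90% of the host's Total Capacity."""
          return sum(reservations_ghz) <= total_capacity_ghz * 0.90

      # Example host: 12 pcores, 26.93 GHz Total Capacity, three 2-vcpu VMs at 4.0395 GHz each.
      print(vcpu_ratio_status(6, 12))               # recommended
      print(reservations_fit([4.0395] * 3, 26.93))  # True (12.1185 <= 24.237)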

Examples

  • Example of not using CPU Reservations: Here is an (oversimplified) example of a deployment of supported applications using today's "1vcpu to 1 pcore" approach from Collaboration Virtualization Sizing for 5000 phones, a few thousand mailboxes and 5000 Jabber softclients (using the CVD of the Preferred Architecture for Enterprise Collaboration 11.6):

  • Example 1 - Using Required CPU Reservations: Here is the same deployment model as above, using the Required CPU Reservation for each VM (instead of the "1vcpu to 1 pcore" rule); a short arithmetic check of these examples follows Example 6 below.
    • Physical CPUs are E5v1+/E7v2+ from "Full UC Performance CPUs", with pcores of 2.50+ GHz.
    • #vcpu ≤ #pcores
    • Using Required CPU Reservation for each VM = (2vcpu) * (26.93 GHz * 90%) / (12 pcores) = 4.0395 GHz.

  • Example 2 - Using Reduced CPU Reservations: Here is the same deployment as above, using a Reduced CPU Reservation for each UCM and IM&P VM based on the customer's baseline of their CPU utilization. Unity Connection VMs do not permit Reduced CPU Reservations, so they must run on different ESXi hosts than those shown.
    • Physical CPUs are E5v1+/E7v2+ from "Full UC Performance CPUs", with pcores of 2.50+ GHz.
    • #vcpu ≤ #pcores
    • Using CPU reservations. Required CPU Reservation for each VM = (2vcpu) * (26.93 GHz * 90%) / (12 pcores) = 4.0395 GHz. But customer consults their baseline and decides to instead use a Reduced CPU Reservation of 3.78 GHz.

  • Example 3 - Using Slower pcores and Reduced CPU Reservations: Here is the same deployment as above, using a slower (2.30 GHz) pcore in addition to a Reduced CPU Reservation for each UCM and IM&P VM (based on the customer's baseline of their CPU utilization). Unity Connection VMs do not permit slower pcores or Reduced CPU Reservations, so they must run on different ESXi hosts than those shown.
    • Physical CPUs are E5v1+/E7v2+ from "Restricted UC Performance CPUs", with pcores of 2.30 GHz.
    • #vcpu ≤ #pcores
    • Using CPU reservations. Customer's account team, partner or advanced services consults Cisco Collaboration Sizing Tool to verify deployment's CPU load is low enough for a 2.30 GHz pcore.
    • If the 2.30 GHz pcore is acceptable, calculate the Reduced CPU Reservation = (2vcpu) * (24.78 GHz * 90%) / (12 pcores) = 3.717 GHz. The customer consults their baseline to double-check that this Reduced CPU Reservation makes sense.

  • Example 4 - Marginal Deployment using Overcommitted pcores and Reduced CPU Reservation: Here is the same deployment as above, using a Reduced CPU Reservation for each UCM and IM&P VM but not following the recommendation of keeping #vcpu ≤ #pcores. Unity Connection VMs do not permit overcommitted pcores or Reduced CPU Reservations, so they must run on different ESXi hosts than those shown.
    • Physical CPUs are E5v1+/E7v2+ from "Full UC Performance CPUs", with pcores of 3.40 GHz (faster than required).
    • Moderately overcommitted pcores (8 vcpus to 6 pcores is a 1.33 ratio: greater than 1:1 but less than 3:1, so marginal but allowed).
    • Using CPU Reservation from customer's baseline (3.71 GHz) with deployment's CPU load previously vetted on Collaboration Sizing Tool.

  • Example 5 - Unsupported Deployment using Overcommitted pcores: Here is the same deployment as above but with overcommitting pcores beyond Marginal (and not following recommendation of redundant physical servers).
    • Physical CPUs are E5v1+/E7v2+ from "Full UC Performance CPUs", with pcores of 3.50 GHz (faster than required).
    • Excessively overcommitted pcores (14 vcpus to 4 pcores is a 3.5 ratio, which is 3:1 or higher, so not supported regardless of hardware or CPU reservations used).

  • Example 6 - Hypothetical situations that would not be supported.
    • 2vcpu VM running on one pcore whose base frequency = CPU Reservation
    • 1vcpu VM running on two pcores with base frequency = 50% of CPU Reservation
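  As a quick sanity check of the arithmetic in the Examples above (Python, reusing the illustrative formula sketched earlier and restated here so the snippet runs on its own):

      def required_cpu_reservation_ghz(vcpus, total_capacity_ghz, host_pcores):
          return vcpus * (total_capacity_ghz * 0.90) / host_pcores

      # Example 1: host reports 26.93 GHz over 12 pcores, 2-vcpu VM -> 4.0395 GHz
      print(round(required_cpu_reservation_ghz(2, 26.93, 12), 4))
      # Example 3: slower 2.30 GHz pcores, host reports 24.78 GHz over 12 pcores -> 3.717 GHz
      print(round(required_cpu_reservation_ghz(2, 24.78, 12), 4))
      # Example 4 vs. Example 5: vcpu:pcore ratios of 1.33 (marginal) and 3.5 (unsupported)
      print(round(8 / 6, 2), round(14 / 4, 2))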


Caveated Support for VMware Distributed Resource Scheduler


General Caveats

  • NOTE: Unless specifically indicated, "DRS" in this document refers to VMware Distributed Resource Scheduler and not the Cisco Disaster Recovery System.
  • Cisco does NOT test VMware DRS with its applications. Cisco Business Edition 6000/7000 appliances and UC on UCS TRCs are only tested with static VM placement (see VM Placement Tool and Collaboration Virtualization Sizing). Customers own any required testing of DRS in their environment.
  • Use of DRS with supported application VMs has the same caveats as using vMotion with the same VMs because DRS's live migration also temporarily freezes the live VM to move it to a new host. Freezing a live real-time communications VM can have variable, unpredictable and unpreventable adverse effects, so use of DRS for live migration is NOT advised during business hours (particularly during your busy-hour when BHCA occurs).

Use of VMware DRS for Caveated Applications Virtual Machines

  • Supported application VMs must use CPU Reservations, and all caveats and pre-requisites for supported applications' CPU Reservations also apply to use of DRS with those applications. This includes source and destination ESXi hosts following the usable capacity rules in Max GHz capacity of an ESXi host / physical server.
  • VMware licensing must entitle the DRS feature. NOTE: Collaboration embedded OEM virtualization licenses do NOT entitle DRS.
  • Use of DRS must meet all of VMware's DRS+HA+vMotion pre-requisites (see vSphere Resource Management Guide in Technical References).
  • Customer owns their DRS configuration. Cisco is not responsible for consulting on or debugging the DRS configuration with respect to Collaboration applications. Cisco TAC is not obligated to root-cause application issues that are isolated to DRS.
  • VM migration via DRS must NOT result in the following:
  • Allowed DRS settings (a small settings-check sketch follows this list):
    • DRS Mode / Automation Level (at both host and VM levels) = Manual and Partially Automated for all supported applications. Fully Automated for UCM and IM&P only.
    • DRS Migration Threshold = all options allowed but "Conservative" is recommended.
    • Dynamic Power Mgmt = not supported (frequently implies CPU throttling to conserve power).
    • DRS Affinity & Anti-affinity rules = allowed to help customers comply with SRND rules, but customer's responsibility to setup and debug.
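  The sketch below (Python; a hypothetical helper, not a VMware or Cisco API) shows one way to sanity-check a cluster's DRS-related settings against the allowances above before placing supported Collaboration VMs on it. The dictionary keys and application names are illustrative assumptions.

      ALLOWED_FOR_ALL_APPS = {"manual", "partiallyAutomated"}

      def drs_settings_problems(settings: dict, apps_on_cluster: set) -> list:
          """Return a list of policy violations for the given DRS settings (empty list = OK)."""
          problems = []
          level = settings.get("automation_level")
          if level == "fullyAutomated":
              # Fully Automated is permitted only when the cluster hosts UCM / IM&P exclusively.
              if not apps_on_cluster <= {"UCM", "IM&P"}:
                  problems.append("Fully Automated DRS is only allowed for UCM and IM&P VMs")
          elif level not in ALLOWED_FOR_ALL_APPS:
              problems.append("Unrecognized DRS automation level: %r" % level)
          if settings.get("dpm_enabled"):
              problems.append("Dynamic Power Management is not supported (implies CPU throttling)")
          return problems

      # Example: a cluster hosting UCM, IM&P and CUC with Fully Automated DRS and DPM enabled.
      print(drs_settings_problems({"automation_level": "fullyAutomated", "dpm_enabled": True},
                                  {"UCM", "IM&P", "CUC"}))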


Technical References from VMware
