This section describes the following IPAM enhancements.
IPAM Quarantine Timer
The IP quarantine logic enhancements are as follows:
- The maximum quarantine configuration is increased to 1 hour (range: 4 to 3600 seconds).
- If the configured quarantine time is <= 15 minutes, an additional buffer of 60 seconds is added to the configured quarantine time.
- If the configured quarantine time is > 15 minutes, an additional buffer of 5 minutes is added to the configured quarantine time.
- The default quarantine-time processing thread interval is changed from 5 seconds to 60 seconds.
- The IP is moved to the free-list after approximately (configured quarantine time + buffer + delay from quarantine-thread processing).
- Upon Node Manager pod restart, the quarantine time of all older IPs in the quarantine time-queue is reset and restarts from the beginning.
- After Node Manager pod restart, all IPs released as part of reconciliation are moved to the quarantine-queue before moving to the free-bitmap (this includes pre-reserved IPs).
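The timing rules above can be sketched as a small Python helper. This is an illustrative approximation only, not the actual IPAM implementation; the function name and signature are assumptions.

```python
def effective_quarantine_seconds(configured: int, thread_interval: int = 60) -> int:
    """Approximate worst-case delay before a released IP reaches the free-list.

    configured: quarantine time in seconds (valid range: 4 to 3600).
    thread_interval: quarantine processing thread interval (default 60 seconds).
    """
    if not 4 <= configured <= 3600:
        raise ValueError("quarantine time must be between 4 and 3600 seconds")
    # <= 15 minutes gets a 60-second buffer; > 15 minutes gets a 5-minute buffer.
    buffer = 60 if configured <= 15 * 60 else 5 * 60
    # The processing thread wakes every `thread_interval` seconds, so the move
    # to the free-list can lag by up to one additional interval.
    return configured + buffer + thread_interval
```

For example, a 10-minute quarantine can take up to about 12 minutes (600 + 60 + 60 seconds) before the IP is actually free again.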
Address-Range Level Quarantine
If an address-range is removed from the UPF after all of its IPs are released properly (that is, each released IP went through its quarantine time), the address-range is moved directly to the free-list.
If an address-range is removed from the UPF due to a UPF-release while some of its addresses are still allocated, the complete address-range is put under quarantine for the configured time and then moved to the free-list.
The show ipam pool command displays quarantine-chunks with a special ‘alloc-context’.
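The two release paths above reduce to a simple decision, sketched here as a hypothetical helper (the function and return labels are illustrative, not part of the product):

```python
def address_range_destination(all_ips_served_quarantine: bool) -> str:
    """Decide where a removed address-range goes on UPF release.

    all_ips_served_quarantine: True if every IP in the range was released
    properly, i.e. each went through its individual quarantine time.
    """
    if all_ips_served_quarantine:
        # Every IP already served quarantine individually: skip range quarantine.
        return "free-list"
    # Some IPs were still allocated at UPF-release: quarantine the whole range.
    return "quarantine-queue"
```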
Pool and UPF Threshold Monitoring
The UPF threshold monitoring enhancements are as follows:
- Upper threshold: Default = 80%, configurable. This is used to add new chunks to the pool or UPF.
- SafeCutOff: Default = (upper-threshold - 5%), not configurable. After the upper threshold is hit, new chunks are allocated to the pool or UPF to bring the current utilization down to the safe-cutoff level, that is, upper-threshold - 5%.
- Lower threshold: Default = 90% of the upper threshold, not configurable. This is used to remove a chunk from the pool or UPF.
Each Node Manager runs pool-level threshold monitoring. When a chunk is assigned to the UPF, the Node Manager checks whether the pool threshold has been hit and reserves additional chunks from the cache-pod for future use.
For the pool threshold calculation, the total number of IPs left in free chunks is considered, not the actual number of allocated IPs in an assigned chunk. That is, after a chunk is assigned to the UPF, it is treated as fully used for pool-threshold monitoring purposes. A completely free address-range can be released back to the cache-pod based on the lower-threshold calculation.
For UPF threshold monitoring, the actual numbers of total and allocated IPs are considered; more chunks are reserved for the UPF when the upper threshold is hit. The Node Manager adds a route to the UPF whenever a new chunk is assigned to it due to a threshold hit. For performance reasons, the route is not deleted if it was added within the last minute.
The upper threshold is configurable (default = 80%). When this threshold is hit, new chunks are added until the current utilization falls back to the safe-cutoff level; that is, 75% is the safe cutoff if the upper threshold is 80%.
The lower threshold is 90% of the upper threshold. That is, if the upper threshold is 80%, the lower threshold is 72%; a chunk can be removed from the UPF only when the current utilization is below 72%. Otherwise, the chunk remains in the UPF's assigned list. This logic avoids frequent route-add and route-delete operations around the boundary condition. UPF threshold monitoring is triggered during events such as address-allocate, address-release, and config-change. On an idle system, the behavior may differ; however, in a running system, the threshold calculation occurs regularly.
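The threshold arithmetic and the hysteresis it creates can be sketched in Python. This is an illustrative model of the rules described above, under the assumption that "utilization" is a simple percentage; names and signatures are not from the product.

```python
def threshold_levels(upper_pct: float = 80.0) -> tuple[float, float]:
    """Derive the safe-cutoff and lower threshold from the upper threshold.

    Returns (safe_cutoff, lower_threshold):
      safe_cutoff     = upper - 5 percentage points (not configurable)
      lower_threshold = 90% of upper (not configurable)
    """
    return upper_pct - 5.0, 0.9 * upper_pct

def needs_more_chunks(utilization_pct: float, upper_pct: float = 80.0) -> bool:
    """Chunks are added once utilization reaches the upper threshold."""
    return utilization_pct >= upper_pct

def can_remove_chunk(utilization_pct: float, upper_pct: float = 80.0) -> bool:
    """A chunk may be removed only below the lower threshold (hysteresis).

    The gap between 80% (add) and 72% (remove) prevents frequent route-add
    and route-delete flapping around the boundary.
    """
    _, lower = threshold_levels(upper_pct)
    return utilization_pct < lower
```

With the default upper threshold of 80%, utilization of 75% triggers neither chunk addition nor chunk removal, which is exactly the stable band the hysteresis is meant to provide.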
Marking a pool or address-range as offline overrides the lower-threshold logic. That is, if an offline chunk is completely
free, it is removed from the UPF irrespective of the lower-threshold calculation.
Multiple Replica Handling
IPAM is part of the Node Manager (nodemgr) pod. A maximum of two nodemgr pods are supported per BNG cluster.
During UPF-registration, one of the nodemgr pods receives all the static-pool-routes for the UPF, along with all the dynamic-pool-routes from both nodemgr pods if anything was allocated earlier, and programs them.
During IP-allocation, the IPC request goes to one of the nodemgr pods. If no routes were assigned earlier, a new route is assigned, and if successful, an IP is returned to FSOL. Even if one nodemgr pod goes down, the other nodemgr can handle IP-allocations, provided enough IPs are available. Chunks reserved by one nodemgr cannot be used by the other nodemgr for address-allocations.
During IP-release, the IPC request should go to the IP-owner nodemgr on a best-effort basis. If the IPC fails, the IP becomes stale on the IPAM. During nodemgr bring-up, CDL reconciliation occurs, which recovers the stale IPs. In addition, a new CLI command, reconcile-ipam, is added to manually trigger IPAM-CDL reconciliation on a need basis. This command should be executed only during maintenance because it is a heavy operation.
During the UPF release, the N4 release comes to one of the nodemgrs. It sends an internal IPC to the other nodemgr, and both clean up all the routes assigned to the respective UPF. If one of the nodemgrs is down at that time, the other nodemgr takes over and releases the chunks on behalf of its peer.