Introduction
This document describes a scenario in which a Diameter route is marked as failed on StarOS (Aggregation Services Router (ASR) 5500, QvPC-SI, and QvPC-DI products).
Contributed by Jean Smetz and Dennis Lanov, Cisco TAC Engineers.
Problem
These logs are reported on the MME:
[mme-app 147036 error] [3/1/11383 <sessmgr:30> e_attach_proc.c:3390] [callid 0000000] [context: mme, contextID: 2]
[software internal user syslog] imsi 012341234567890, procedure MME Attach procedure , Error sending HSS-S6a message.
Solution
Whenever a failure occurs on the selected route (for example, a Tx-timeout), the failure count for that route is incremented. Once the count reaches the configured "route-failure threshold <>", the route is considered FAILED.
- A failed route is never ignored during route lookup for a message. “Available” routes are given higher priority than “failed” ones. If there is no “available” route, a “failed” route is selected.
- A failed route remains failed until its deadtime expires. Use “route-failure deadtime <>” in the endpoint configuration to set the time after which a dead route recovers. By default, this is set to 60 seconds.
- A failed route can be reset manually with the command diameter reset route failure.
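The behaviour described above can be modelled with a short Python sketch. This is an illustration only, not the StarOS implementation: the names Route, record_failure, status, reset and select_route are hypothetical, the threshold of 3 is an arbitrary example value, and clearing the failure counter when the deadtime expires is an assumption. Only the 60-second default deadtime comes from this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    """Hypothetical model of one Diameter route; not StarOS code."""
    peer: str
    failure_threshold: int = 3      # example value, not a documented default
    deadtime: float = 60.0          # seconds; default stated in this document
    failures: int = 0
    failed_at: Optional[float] = None

    def record_failure(self, now: float) -> None:
        # Count the failure; mark the route FAILED once the threshold is reached.
        self.failures += 1
        if self.failures >= self.failure_threshold and self.failed_at is None:
            self.failed_at = now

    def status(self, now: float) -> str:
        # When the deadtime has expired, the route becomes AVAILABLE again.
        # Clearing the counter at that point is an assumption of this sketch.
        if self.failed_at is not None and now - self.failed_at >= self.deadtime:
            self.reset()
        return "FAILED" if self.failed_at is not None else "AVAILABLE"

    def reset(self) -> None:
        # Models a manual reset of the failed route.
        self.failures = 0
        self.failed_at = None


def select_route(routes: List[Route], now: float) -> Optional[Route]:
    # AVAILABLE routes are preferred; a FAILED route is still used
    # when no AVAILABLE route exists.
    for route in routes:
        if route.status(now) == "AVAILABLE":
            return route
    return routes[0] if routes else None


# Example: the primary route fails three times, so the backup is preferred.
primary = Route("hss-primary")
backup = Route("hss-backup")
for t in (0.0, 1.0, 2.0):
    primary.record_failure(now=t)
chosen = select_route([primary, backup], now=3.0)
```

In this model, a longer deadtime keeps a route in the FAILED state longer, while a manual reset returns it to AVAILABLE immediately.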
It is recommended to configure a route-failure deadtime value in the Diameter endpoint configuration so that failed routes to the Diameter peers are cleared automatically after some time (the value is configured in seconds). The 'route-failure deadtime' command configures the duration for which a route keeps the FAILED status. When this time expires, the status changes to AVAILABLE. The following example sets the deadtime to 86400 seconds (24 hours):
configure
context <context_name>
diameter endpoint <endpoint_name>
route-failure deadtime 86400
Refer to the Command Line Interface Reference for details on these commands.