This document describes how to troubleshoot overwritedb failures in Cisco Unity Connection (CUC) and lists the possible solutions.
Cisco recommends that you have knowledge of this topic:
The information in this document is based on these software and hardware versions:
The overwritedb failure will end with this statement.
Cluster overwritedb failed.
The overwritedb log cuc-cluster-overwritedb_yyyy-mm-dd_hh.mm.ss.log can be found in the installation log location through the CLI or Real-Time Monitoring Tool (RTMT).
In order to get this log:
From the CLI (a Secure FTP (SFTP) server is required in order to transfer the log file),
file get install cuc-cluster-overwritedb_yyyy-mm-dd_hh.mm.ss.log
Or
From RTMT,
Choose Trace & Log Central > Collect Install Logs > Select the Node > Proceed.
In some scenarios, the last 10 lines of the log contain the error message, so it can be viewed directly on the CLI with this command:
file tail install cuc-cluster-overwritedb_yyyy-mm-dd_hh.mm.ss.log
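Once the log has been transferred off the server, the same last-lines check can be repeated locally. This is a minimal Python sketch, not a Cisco tool. The function names log_timestamp and last_lines are hypothetical, and the sketch assumes the file name follows the cuc-cluster-overwritedb_yyyy-mm-dd_hh.mm.ss.log pattern described in this document.

```python
import re
from datetime import datetime

# Assumed file-name pattern, taken from the log name format quoted in this document.
LOG_NAME = re.compile(
    r"^cuc-cluster-overwritedb_(\d{4}-\d{2}-\d{2}_\d{2}\.\d{2}\.\d{2})\.log$"
)


def log_timestamp(filename):
    """Return the datetime embedded in an overwritedb log name, or None."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d_%H.%M.%S")


def last_lines(text, count=10):
    """Return the last `count` lines of the log text (10 by default)."""
    if count <= 0:
        return []
    return text.splitlines()[-count:]
```

When several overwritedb attempts have been made, the timestamp helps to pick the most recent log before the last lines are read.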
This section provides various scenarios in order to troubleshoot the overwritedb issues.
Problem: Scenario 1
Overwritedb fails at the first step, which tries to establish a connection to the remote server.
Logs
+ sudo -u cucluster ssh cuc01 ' sh -lc '\''source /usr/local/cm/db/informix/local/ids.env && t=$(mktemp);ontape -r -v -t STDIO > $t 2>&1; rc=$?; cat $t; exit $rc'\'''
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the Rivest-Shamir-Adleman (RSA) host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
b0:f8:19:75:63:f7:30:aa:e4:ec:3b:dc:20:4a:d9:92.
Please contact the system administrator.
Add correct host key in /home/cucluster/.ssh/known_hosts to get rid of this message.
Offending key in /home/cucluster/.ssh/known_hosts:5
RSA host key for 10.1.1.100 has changed and you have requested strict checking.
Host key verification failed.
Physical restore failed - function read archive backup failed code 1 errno 0
Program over.
TERM environment variable not set.
+ ontape_rc=1
+ [[ 1 -eq 0 ]]
+ echo ontape returned 1.
ontape returned 1.
+ [[ 1 -ne 0 ]]
+ echo Failed to restore database on cuc01. Ontape returned 1.
Failed to restore database on cuc01. Ontape returned 1.
+ exit 1
++ error
++ echo 'Overwritedb failed.'
++ echo 'The overwritedb log cuc-cluster-overwritedb_2014-01-22_20.20.44.log can be found in the installation log location through the CLI or RTMT.'
++ exit
Solution
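Contact Cisco TAC. The log shows that the host key stored for the remote server in /home/cucluster/.ssh/known_hosts no longer matches the key that the server presents, and the correction requires root access.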
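Before the case is opened, it can help to pull the relevant details out of the log. This is a minimal Python sketch under the assumption that the log contains the SSH warning text shown in this scenario. The function name host_key_mismatch is hypothetical, and the sketch only reads the log; it does not touch the known_hosts file.

```python
import re

# Patterns copied from the SSH warning text in the Scenario 1 log excerpt.
OFFENDING = re.compile(r"Offending key in (\S+?):(\d+)")
HOST = re.compile(r"host key for (\S+) has changed")


def host_key_mismatch(log_text):
    """Return (known_hosts_path, line_number, host), or None if no mismatch is logged."""
    if "Host key verification failed." not in log_text:
        return None
    offending = OFFENDING.search(log_text)
    host = HOST.search(log_text)
    return (
        offending.group(1) if offending else None,
        int(offending.group(2)) if offending else None,
        host.group(1) if host else None,
    )
```

The returned path, line number, and host address identify the stale entry that TAC needs to correct.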
Problem: Scenario 2
Overwritedb fails with the error:
Failed to restore database on <server_name>
Overwritedb failed
The overwritedb log cuc-cluster-overwritedb_yyyy-mm-dd_hh.mm.ss.log can be found in the installation log location through the CLI or RTMT
Solution
The server might be affected by Cisco bug ID CSCto87784. Check the fixed versions of this defect. If the current version does not include the fix, run the utils cuc cluster overwritedb command again.
Problem: Scenario 3
Overwritedb fails with this error at this step:
yy/mm/dd hh:mm:ss Synchronize Unity Connection databases...
command failed -- Enterprise Replication already defined (92)
Logs
+ local primary_server=g_ciscounity_na_cucn01
+ sudo_informix cdr define server -A /var/opt/cisco/connection/spool/ats/ -c g_ciscounity_na_cucn01 -I g_ciscounity_na_cucn01
+ [[ cucluster != \i\n\f\o\r\m\i\x ]]
+ sudo -u informix cdr define server -A /var/opt/cisco/connection/spool/ats/ -c g_ciscounity_na_cucn01 -I g_ciscounity_na_cucn01
command failed -- Enterprise Replication already defined (92)
++ error
++ echo 'Overwritedb failed.'
++ echo 'The overwritedb log cuc-cluster-overwritedb_2012-11-16_02.32.09.log can be found in the installation logging location through the CLI or RTMT.'
++ exit 1
Solution
Here are a few options to fix this issue.
Option 1:
Run these commands one at a time. Do not proceed to the next command until the current one completes.
Option 2:
This issue occurs because the Enterprise Replication queue is full. In order to resolve this issue, you can restart the Publisher server, wait approximately 30 minutes in order to ensure services are started, and restart the Subscriber server. When the services are up on the Subscriber, overwritedb should complete successfully.
Here is the Server Role Manager log that points to this issue:
SRM,3,<CM> Command: /opt/cisco/connection/bin/db-replication-control status cuc02 execution completed abnormally. Error number: 6|
SRM,3,<Timer-0> Replication queue size: 90.0 has exceeded the maximum threshold value. Stopping replication.|
SRM,5,<evt> [PUB_PRIMARY] [replication_failed] ignored|
Option 3: If the issue still exists, contact Cisco TAC.
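In order to confirm that Option 2 applies, the Server Role Manager log can be scanned for the queue threshold message. This is a minimal Python sketch that assumes the message format matches the SRM excerpt in this scenario. The function name replication_queue_overflows is hypothetical.

```python
import re

# Pattern copied from the SRM log line quoted in this document.
QUEUE = re.compile(
    r"Replication queue size: (\d+(?:\.\d+)?) has exceeded the maximum threshold value"
)


def replication_queue_overflows(srm_log):
    """Return every queue size reported in a threshold-exceeded SRM line."""
    return [float(match.group(1)) for match in QUEUE.finditer(srm_log)]
```

An empty list means that the SRM log does not show the full queue condition, and one of the other options is more likely to apply.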
Problem: Scenario 4
Overwritedb fails with this error at this step:
yy/mm/dd hh:mm:ss Synchronizing Unity Connection databases...
Overwritedb failed
Logs
sudo -u cucluster ssh cuc02 ' sh -lc '\''source /usr/local/cm/db/informix/local/ids.env && dbaccess' 'unitydyndb'\'''
329: Database not found or no system permission.
Solution
Contact Cisco TAC for a workaround, which requires root access.
Problem: Scenario 5
A failure occurs in any of these scenarios:
The CLI command "utils cuc cluster overwritedb" fails on either the Publisher (PUB) or the Subscriber (SUB).
The CLI command "utils cuc cluster renegotiate" fails on the Subscriber.
The upgrade fails on the Subscriber server.
The root cause is that the attempt to establish replication fails at the define server step.
Logs
For Cluster renegotiation / OverwriteDB failure,
+ sudo -u informix cdr define server -A /var/opt/cisco/connection/spool/ats/ -cg_ciscounity_sub1 -I g_ciscounity_sub1 -S g_ciscounity_pub
command failed -- fatal server error (100)
++ error
++ '[' 0 -eq 1 ']'
++ echo 'Cluster renegotiation failed.'
Or the same errors, with this as the last line:
++ echo 'Cluster overwritedb failed.'
For Subscriber install failure,
Thu Oct 17 06:09:47 GMT+2 2013 + sudo -u informix cdr define server -A /var/opt/cisco/connection/spool/ats/ -c g_ciscounity_pub -I g_ciscounity_pub
Thu Oct 17 06:13:07 GMT+2 2013 command failed -- fatal server error (100)
Thu Oct 17 06:13:07 GMT+2 2013 + LOADDBRC=100
Thu Oct 17 06:13:07 GMT+2 2013 + '[' 100 -ne 0 ']'
Thu Oct 17 06:13:07 GMT+2 2013 + echo 'loaddb.sh return code was 100'
Thu Oct 17 06:13:07 GMT+2 2013 loaddb.sh return code was 100
Thu Oct 17 06:13:07 GMT+2 2013 + exit 1
Thu Oct 17 06:13:07 GMT+2 2013 /opt/cisco/connection/lib/install/post.d/06_load-database had an exit code of 1
error: %post(cuc-9.1.1.10000-32.i386) scriptlet failed, exit status 1
Solution
The server is affected by Cisco bug ID CSCue78730. Contact Cisco TAC for a workaround. Alternatively, upgrade the server to a version that contains the fix for this defect.
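Several scenarios in this document end with a cdr failure line in the form "command failed -- message (code)". This is a minimal Python sketch that extracts the message and the numeric code from a log, on the assumption that the failure lines follow the format of the excerpts quoted here. The function name cdr_failures is hypothetical.

```python
import re

# Matches the "command failed -- <message> (<code>)" lines quoted in this document.
CDR_FAILURE = re.compile(r"command failed -- (.+?)\s*\((\d+)\)")


def cdr_failures(log_text):
    """Return a (message, code) pair for every cdr failure found in the log text."""
    return [
        (match.group(1), int(match.group(2)))
        for match in CDR_FAILURE.finditer(log_text)
    ]
```

For the logs in this scenario, the sketch reports the fatal server error message with code 100, which is the detail to quote when TAC is contacted.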
Problem: Scenario 6
Overwritedb fails with this error at this step:
yy/mm/dd hh:mm:ss Syncronizing platform and LDAP database...
Overwritedb failed.
Logs
+ sudo -u informix cdr delete server -f -c g_ciscounity_pub g_ciscounity_pub
connect to g_ciscounity_pub failed Attempt to connect to database server (g_ciscounity_pub) failed.(-908)
command failed -- unable to connect to server specified (5)
+ true
Solution
Contact Cisco TAC. The issue is most likely caused by incorrect or corrupt entries in the SQL hosts file. It is also seen after an IP address or hostname change that is not reflected in the SQL hosts file.
Problem: Scenario 7
This problem is seen when there is a hostname/IP address change on the server.
Overwritedb fails at this step:
yy/mm/dd hh:mm:ss Syncronizing platform and LDAP database...
Overwritedb failed.
The utils service list command shows this service as down:
A Cisco DB[NOTRUNNING] Component is not running
Logs
ssh: connect to host 192.168.1.2 port 22: No route to host
Physical restore failed - function read archive backup failed code 1 errno 0
Program over.
+ ontape_rc=1
+ [[ 1 -eq 0 ]]
+ echo ontape returned 1.
ontape returned 1.
+ [[ 1 -ne 0 ]]
+ echo Failed to restore database on cuc02. Ontape returned 1.
Failed to restore database on cuc02. Ontape returned
Solution
Contact Cisco TAC. TAC checks the vmsserver table entries and the hosts files with root access. Also ensure that the A Cisco DB service is up before you run overwritedb on the Subscriber.
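In order to check the service state before overwritedb is run again, the output of utils service list can be captured and scanned. This is a minimal Python sketch that assumes each service line has the form name[STATE] description, as in the excerpt in this scenario. The function name services_not_started is hypothetical, and the STARTED value used for a running service is an assumption that is not taken from this document.

```python
import re

# Assumed line format: "<service name>[<STATE>] <optional description>".
SERVICE_LINE = re.compile(r"^\s*(.+?)\[([A-Za-z]+)\]\s*(.*)$")


def services_not_started(service_list_output, running_state="STARTED"):
    """Return (name, state, description) for each service not in the running state."""
    down = []
    for line in service_list_output.splitlines():
        match = SERVICE_LINE.match(line)
        if match and match.group(2).upper() != running_state:
            down.append(
                (match.group(1).strip(), match.group(2), match.group(3).strip())
            )
    return down
```

If A Cisco DB appears in the result, resolve that condition first, because overwritedb cannot succeed while the database service is down.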
Problem: Scenario 8
In this scenario the failure is due to NTP issues.
Logs
+ sudo -u informix cdr define server -A /var/opt/cisco/connection/spool/ats/ -cg_ciscounity_sub1 -I g_ciscounity_sub1 -S g_ciscounity_pub
command failed -- System clocks difference is too large. (90)
++ error
++ echo 'Overwritedb failed.'
Solution
In order to resolve this issue, fix any Network Time Protocol (NTP) issues and assign an NTP server with a good stratum value. For Unity Connection, a stratum 1 or 2 source is preferred.
Problem: Scenario 9
In this scenario the server cannot access the remote server due to permission issues.
Logs
+ sudo -u cucluster ssh cuc01 ' sh -lc '\''source /usr/local/cm/db/informix/local/ids.env && onstat' '-'\'''
Permission denied (publickey,password).
+ return -1
+ exit 255
++ error
++ echo 'Overwritedb failed.'
Or
+ sudo -u cucluster ssh cuc01 ' sh -lc '\''source /usr/local/cm/db/informix/local/ids.env && t=$(mktemp); ontape -r -v -t STDIO > $t 2>&1; rc=$?; cat $t; exit $rc'\'''
Permission denied (publickey,password).
Physical restore failed - function read archive backup failed code 1 errno 0
Program over.
TERM environment variable not set.
Solution
Contact Cisco TAC in order to synchronize the passwords, which requires root access.
Problem: Scenario 10
In this scenario, the failure is due to a missing DNS or Domain Name entry on the server, or it occurs because the Subscriber server has not been defined on the Publisher server.
Logs
connect to g_ciscounity_sub1 failed Incorrect password or user g_ciscounity_sub1 is not known on the database server. (-951)
command failed -- unable to connect to server specified (5)
Solution
Ensure the Subscriber server IP address/hostname details are provided on the System Settings > Cluster page.
Ensure DNS and Domain Name information is correct for both servers if configured.
If the issue still exists, contact Cisco TAC. TAC checks the SQL hosts file with root access.
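As a quick check of the name resolution, the server hostnames can be resolved from an administrator workstation. This is a minimal Python sketch with a hypothetical function name, resolve_hosts. It tests forward resolution only, and the workstation might use different DNS servers than the Unity Connection nodes, so a successful result does not prove that the nodes themselves resolve the names.

```python
import socket


def resolve_hosts(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError.
            results[name] = None
    return results
```

A call such as resolve_hosts(["cuc01", "cuc02"]), with the hostnames replaced by the real Publisher and Subscriber names, shows at a glance which name does not resolve.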
Problem: Scenario 11
OverwriteDB fails with this error:
SSH trust renegotiation failed.
The security password on the publisher and subscriber servers do not match. Run the the CLI command "set password user security" on one or both servers to update the security password, then re-run "utils cuc cluster overwritedb".
Overwritedb failed.
Solution
Run the set password user security command on one or both servers to update the security password.
This error is also seen when the Subscriber's IP address/hostname is not entered in the Publisher's System Settings > Cluster page.
If the issue still exists, contact Cisco TAC.
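The log signatures from the scenarios in this document can be collected into a small lookup for first-pass triage. This is a minimal Python sketch, not a Cisco tool. The function name candidate_scenarios is hypothetical, the signature strings are copied from the log excerpts in this document, and a match is only a hint about which scenario to read first.

```python
# Signature strings copied from the log excerpts in this document.
SIGNATURES = [
    ("Host key verification failed", 1),
    ("Enterprise Replication already defined (92)", 3),
    ("329: Database not found", 4),
    ("fatal server error (100)", 5),
    ("(-908)", 6),
    ("No route to host", 7),
    ("System clocks difference is too large", 8),
    ("Permission denied (publickey,password)", 9),
    ("(-951)", 10),
    ("SSH trust renegotiation failed", 11),
]

# Scenario 2 shows only the generic restore failure, so it is used as a fallback.
FALLBACK_SIGNATURE = "Failed to restore database on"
FALLBACK_SCENARIO = 2


def candidate_scenarios(log_text):
    """Return the scenario numbers whose signature appears in the log text."""
    matches = [number for signature, number in SIGNATURES if signature in log_text]
    if not matches and FALLBACK_SIGNATURE in log_text:
        matches.append(FALLBACK_SCENARIO)
    return matches
```

An empty result means that none of the signatures listed here were found, in which case the full log should be reviewed or sent to Cisco TAC.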
Revision | Publish Date | Comments |
---|---|---|
1.0 | 21-Apr-2014 | Initial Release |