Resizing the Broadworks Datastore (DSN)

Posted: 19th October 2009 by Mark in BroadWorks, SIP, Unix

As the database grows on the Broadworks Application and Network servers, the memory allocation for the TimesTen datastore will eventually need to be increased. The Maintenance Guide does not contain all the required steps. As a rule of thumb, the allocated “perm” size should not exceed 25% of total system memory, and the “temp” size should be 25% of the perm size.

The following example assumes 8GB of memory on both AS1 and AS2.
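
The rule of thumb above is simple arithmetic, so it can be sketched as a small shell function. This is only an illustration: dsn_sizes is a hypothetical helper name, not a Broadworks or TimesTen tool, and it assumes the sizes are expressed in MB (which is consistent with 8GB yielding perm=2048 and temp=512 in the steps below).

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadworks): apply the rule of thumb
# "perm <= 25% of system memory, temp = 25% of perm".
# Argument: total system memory in MB (assumed unit).
dsn_sizes() {
    total_mb=$1
    perm_mb=$((total_mb / 4))   # perm: 25% of total system memory
    temp_mb=$((perm_mb / 4))    # temp: 25% of perm
    echo "perm=${perm_mb} temp=${temp_mb}"
}

dsn_sizes 8192   # prints: perm=2048 temp=512
```

The result is only a starting point; the values still have to be entered by hand when resizeDSN prompts for them.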

1. SSH to AS1 as bwadmin
2. stopbw
3. repctl stop
4. su (switch to root)
5. cd /usr/local/broadworks/bw_base/bin
6. timesten.pl unload
7. ./resizeDSN (when prompted, enter perm=2048; temp=512)
8. exit (return to bwadmin)
9. repctl start
10. startbw

– Wait 10 minutes for buffered replication changes from AS2 –

1. SSH to AS2 as bwadmin
2. stopbw
3. repctl stop
4. su (switch to root)
5. cd /usr/local/broadworks/bw_base/bin
6. timesten.pl unload
7. ./resizeDSN (when prompted, enter perm=2048; temp=512)
8. exit (return to bwadmin)
9. importdb.pl AppServer as1 AppServer (replace as1 with your primary AS hostname or IP)
10. repctl start
11. startbw

If everything went smoothly, you should be able to run synchcheck_basic.pl -a on AS2 and the database should show as synchronized. If the importdb.pl command in step 9 was unable to import the database, you will need to perform the backup and restore procedure manually.

1. On AS1: bwBackup.pl AppServer dbBackup.db
2. scp the file to AS2: scp dbBackup.db bwadmin@as2:dbBackup.db
3. On AS2: stopbw
4. repctl stop
5. bwRestore.pl AppServer dbBackup.db
6. repctl start
7. startbw

On another occasion, AS1 would not start replication after resizing the DSN, reporting an error that AS2 was on a different patch version than AS1. The two nodes were patched identically, but the patch tool was not responding on AS2, so AS1 could not verify the patch level and reported the error. The solution was as simple as restarting the patch tool on AS2. However, the Maintenance Guide does not explain how to do this, so I spent more time trying to find the procedure than it took to execute the commands.

as2$ stoppt.pl
as2$ startpt.pl

  1. Dan says:

    Thanks for posting this as we hit the same problem with our slightly old and creaking Broadworks platform.

    Here are a few comments from our experience:

    1. We run Broadworks on Solaris so checking the available memory can be achieved using:

    bash-2.05$ prtconf | grep Mem
    Memory size: 2048 Megabytes

    So our servers have 2GB of memory. Of course your guide does state to check but someone else might gain some use for this if they also run under Solaris.

    2. Step 6 on both as1 and as2 was the following for us:

    ./timesten.pl unload

    I would put this down to OS choice, etc.

    3. On step 7 we found the following to be better by itself:

    ./resizeDSN

    This then provides for checking what the allocations are, making interactive changes, and checking before implementing the change.

    We also had issues running the replication so also had to use the manual backup and restore procedure.

    Finally, when completing the changes on as1 we received errors regarding the “execution server not running” and the “replication service not running”. It seems this was down to the speed at which we went through the commands, and the server eventually ‘fixed itself’, since Broadworks sent out broadcast messages that it was starting itself.

    We learnt that:

    healthmon -l

    … is your friend and to take your time.

    Again, thanks for posting this since it proved to be a lifesaver for us.

  2. Mark says:

    Dan,

    Thank you for sharing your experience. It appears the process for Solaris is very much like Linux. To clarify Step 7, the only input is ./resizeDSN. The values in parentheses ( ) are just a reference for the values to enter when prompted after the WARNING message. It’s true that once you start Broadworks and it says all services have been started, it still takes time for the server to become functional. Typical complaints from Healthmon are that the SNMP service is not running (server not reachable on 127.0.0.1), that the server is locked and needs to be unlocked, or that the execution server is not running. Running Healthmon too soon after startup can cause these warnings to be reported. Best practice is to wait 5 minutes before doing anything.
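
The advice above to take your time and re-check with healthmon can be sketched as a generic retry loop. This is a minimal sketch, not a Broadworks procedure: retry_until_ok is a hypothetical helper, and using it with healthmon assumes that healthmon -l returns a non-zero exit status while it is still reporting problems, which should be verified on your own system first.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX_TRIES times, sleeping
# DELAY_SECONDS between failed attempts. Returns 0 on the first success,
# 1 if every attempt failed.
# usage: retry_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is still to come.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}

# Example (assumption: healthmon exits non-zero while problems remain):
# retry_until_ok 5 60 healthmon -l
```

If the exit status turns out not to reflect the reported state, simply waiting the 5 minutes and reading the healthmon output by eye remains the safer approach.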