Tips and Tricks site for advanced HP-UX Engineers

04 Oct 18 Glance/midaemon won't start

Troubleshooting steps:

1- Remove the /var/opt/perf/ttd.pid file and try to start Glance again

 

# rm /var/opt/perf/ttd.pid

# glance

 

2- If that does not fix it, stop Glance, start midaemon by hand, and restart as follows

 

# mwa stop

# midaemon -smdvss 4M -kths 1000 -pids 5000 -p

# ps -ef | grep midaemon

Make sure the midaemon is running, then:

# mwa start

 

Modify the MWA_START_COMMAND variable in /etc/rc.config.d/ovpa as follows so the change persists across system reboots.

 

# grep MWA_START_COMMAND /etc/rc.config.d/ovpa

MWA_START_COMMAND="/opt/perf/bin/midaemon -smdvss 4M -kths 1000 -pids 5000 -p ; /opt/perf/bin/mwa start"
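
Before deleting the pid file in step 1, it can be worth confirming that it really is stale. The following is a minimal sketch, not part of the original procedure: it assumes ttd.pid holds a single numeric process ID on its first line and that the check runs as root. The function name pidfile_is_stale is made up for illustration.

```shell
#!/bin/sh
# pidfile_is_stale FILE (hypothetical helper)
# Returns 0 when FILE exists but names no live process, 1 otherwise.
# Assumption: the first line of the file is a bare numeric PID.
pidfile_is_stale() {
    [ -f "$1" ] || return 1                 # no file, nothing to clean up
    pid=$(head -n 1 "$1" | tr -d ' \t\r')
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;            # empty or non-numeric: treat as stale
    esac
    if kill -0 "$pid" 2>/dev/null; then     # signal 0 only probes for existence
        return 1
    fi
    return 0
}

pidfile=/var/opt/perf/ttd.pid
if pidfile_is_stale "$pidfile"; then
    echo "$pidfile is stale; it can be removed"
else
    echo "$pidfile is absent or still points at a live process"
fi
```

The sketch only reports; the rm itself is left to the administrator. When run as a non-root user, kill -0 can fail for a live process owned by someone else, so the result is only meaningful as root.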

 


30 Nov 17 No downtime migration of mounted filesystem to new storage type

The task is a no-downtime storage migration on an LVM 1.0 volume group,
 from Hitachi to Pure solid state storage. MirrorDisk/UX is required.
 The disks differ in size (16 GB Hitachi, 10 GB Pure), but the target is large enough for the 4 GB logical volume:

dbrestore:root > diskinfo /dev/rdisk/disk42
 SCSI describe of /dev/rdisk/disk42:
 vendor: HITACHI
 product id: OPEN-V
 type: direct access
 size: 16777216 Kbytes
 bytes per sector: 512
 dbrestore:root > diskinfo /dev/rdisk/disk52
 SCSI describe of /dev/rdisk/disk52:
 vendor: PURE
 product id: FlashArray
 type: direct access
 size: 10485760 Kbytes
 bytes per sector: 512

pvcreate /dev/rdisk/disk52
 vgextend /dev/vgtest /dev/disk/disk52
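
The extent counts in the vgdisplay output that follows can be sanity-checked from the diskinfo sizes. This is a rough sketch of the arithmetic only: the helper name raw_extents is made up, and the result is an upper bound, since LVM keeps some space for its own metadata and a 1.0 volume group caps each PV at Max PE per PV.

```shell
#!/bin/sh
# raw_extents KBYTES PE_MB (hypothetical helper)
# Upper bound on physical extents: disk size in KB divided by extent size in KB.
raw_extents() {
    echo $(( $1 / ($2 * 1024) ))
}

pe_mb=4            # "PE Size (Mbytes)" from vgdisplay
lv_extents=1024    # "Current LE" of lvtest

src=$(raw_extents 16777216 "$pe_mb")   # disk42, HITACHI OPEN-V
dst=$(raw_extents 10485760 "$pe_mb")   # disk52, PURE FlashArray

echo "disk42 raw extents: $src"
echo "disk52 raw extents: $dst"

if [ "$dst" -gt "$lv_extents" ]; then
    echo "lvtest ($lv_extents extents) fits on disk52"
else
    echo "lvtest ($lv_extents extents) does NOT fit on disk52"
fi
```

The raw figures come out at 4096 and 2560, against the 4095 and 2559 Total PE that vgdisplay reports, which is consistent with the per-PV cap and the metadata overhead. Either way the 1024 extents of lvtest fit comfortably on the smaller disk.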

Before state:
 dbrestore:root > vgdisplay -v vgtest
 --- Volume groups ---
 VG Name /dev/vgtest
 VG Write Access read/write
 VG Status available
 Max LV 255
 Cur LV 1
 Open LV 1
 Max PV 16
 Cur PV 2
 Act PV 2
 Max PE per PV 4095
 VGDA 4
 PE Size (Mbytes) 4
 Total PE 6654
 Alloc PE 1024
 Free PE 5630
 Total PVG 0
 Total Spare PVs 0
 Total Spare PVs in use 0
 VG Version 1.0
 VG Max Size 262080m
 VG Max Extents 65520

--- Logical volumes ---
 LV Name /dev/vgtest/lvtest
 LV Status available/syncd
 LV Size (Mbytes) 4096
 Current LE 1024
 Allocated PE 1024
 Used PV 1

--- Physical volumes ---
 PV Name /dev/disk/disk42
 PV Status available
 Total PE 4095
 Free PE 4095
 Autoswitch On
 Proactive Polling On

PV Name /dev/disk/disk52
 PV Status available
 Total PE 2559
 Free PE 1535
 Autoswitch On
 Proactive Polling On

dbrestore:root > ioscan -NfnCdisk /dev/disk/disk42
 Class I H/W Path Driver S/W State H/W Type Description
 ===================================================================
 disk 42 64000/0xfa00/0x21 esdisk CLAIMED DEVICE HITACHI OPEN-V
 /dev/disk/disk42 /dev/rdisk/disk42
 dbrestore:root > ioscan -NfnCdisk /dev/disk/disk52
 Class I H/W Path Driver S/W State H/W Type Description
 ===================================================================
 disk 52 64000/0xfa00/0x35 esdisk CLAIMED DEVICE PURE FlashArray
 /dev/disk/disk52 /dev/rdisk/disk52
 dbrestore:root > bdf | grep test
 /dev/vgtest/lvtest 4194304 19544 3913845 0% /test
 dbrestore:root > lvdisplay -v /dev/vgtest/lvtest
 --- Logical volumes ---
 LV Name /dev/vgtest/lvtest
 VG Name /dev/vgtest
 LV Permission read/write
 LV Status available/syncd
 Mirror copies 0
 Consistency Recovery MWC
 Schedule parallel
 LV Size (Mbytes) 4096
 Current LE 1024
 Allocated PE 1024
 Stripes 0
 Stripe Size (Kbytes) 0
 Bad block on
 Allocation strict
 IO Timeout (Seconds) default

--- Distribution of logical volume ---
 PV Name LE on PV PE on PV
 /dev/disk/disk42 1024 1024

--- Logical extents ---
 LE PV1 PE1 Status 1
 00000 /dev/disk/disk42 00000 current
 00001 /dev/disk/disk42 00001 current
 00002 /dev/disk/disk42 00002 current
 ...
 01022 /dev/disk/disk42 01022 current
 01023 /dev/disk/disk42 01023 current

dbrestore:root > lvextend -m 1 /dev/vgtest/lvtest /dev/disk/disk52
 The newly allocated mirrors are now being synchronized.This operation will
 take some time. Please wait ....
 Logical volume "/dev/vgtest/lvtest" has been successfully extended.
 Volume Group configuration for /dev/vgtest has been saved in /etc/lvmconf/vgtest.conf
 dbrestore:root > lvdisplay -v /dev/vgtest/lvtest
 --- Logical volumes ---
 LV Name /dev/vgtest/lvtest
 VG Name /dev/vgtest
 LV Permission read/write
 LV Status available/syncd
 Mirror copies 1
 Consistency Recovery MWC
 Schedule parallel
 LV Size (Mbytes) 4096
 Current LE 1024
 Allocated PE 2048
 Stripes 0
 Stripe Size (Kbytes) 0
 Bad block on
 Allocation strict
 IO Timeout (Seconds) default

--- Distribution of logical volume ---
 PV Name LE on PV PE on PV
 /dev/disk/disk42 1024 1024
 /dev/disk/disk52 1024 1024

--- Logical extents ---
 LE PV1 PE1 Status 1 PV2 PE2 Status 2
 00000 /dev/disk/disk42 00000 current /dev/disk/disk52 00000 current
 00001 /dev/disk/disk42 00001 current /dev/disk/disk52 00001 current
 00002 /dev/disk/disk42 00002 current /dev/disk/disk52 00002 current
 ...
 01023 /dev/disk/disk42 01023 current /dev/disk/disk52 01023 current

dbrestore:root > bdf | grep test
 /dev/vgtest/lvtest 4194304 19544 3913845 0% /test
 dbrestore:root > lvreduce -m 0 /dev/vgtest/lvtest /dev/disk/disk42
 Logical volume "/dev/vgtest/lvtest" has been successfully reduced.
 Volume Group configuration for /dev/vgtest has been saved in /etc/lvmconf/vgtest.conf
 dbrestore:root > bdf | grep test
 /dev/vgtest/lvtest 4194304 19544 3913845 0% /test
 dbrestore:root > lvdisplay -v /dev/vgtest/lvtest
 --- Logical volumes ---
 LV Name /dev/vgtest/lvtest
 VG Name /dev/vgtest
 LV Permission read/write
 LV Status available/syncd
 Mirror copies 0
 Consistency Recovery MWC
 Schedule parallel
 LV Size (Mbytes) 4096
 Current LE 1024
 Allocated PE 1024
 Stripes 0
 Stripe Size (Kbytes) 0
 Bad block on
 Allocation strict
 IO Timeout (Seconds) default

--- Distribution of logical volume ---
 PV Name LE on PV PE on PV
 /dev/disk/disk52 1024 1024

--- Logical extents ---
 LE PV1 PE1 Status 1
 00000 /dev/disk/disk52 00000 current
 00001 /dev/disk/disk52 00001 current

...
 01023 /dev/disk/disk52 01023 current

dbrestore:root > bdf | grep test
 /dev/vgtest/lvtest 4194304 19544 3913845 0% /test
 dbrestore:root >
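
Between the lvextend and the lvreduce it is worth confirming that every extent of the new mirror copy is current. The following is a small sketch of that check; the function name count_stale is made up, and it assumes the extent rows look like the lvdisplay -v output above, with out-of-sync copies marked stale.

```shell
#!/bin/sh
# count_stale (hypothetical helper)
# Reads `lvdisplay -v` text on stdin and prints how many logical-extent rows
# (rows whose first field is the numeric LE index) contain a stale copy.
count_stale() {
    awk '$1 ~ /^[0-9]+$/ && /stale/ { n++ } END { print n + 0 }'
}

# Illustrative input in the layout shown above. The stale status on the
# second row is hypothetical, to show what an unfinished sync looks like.
sample='00000 /dev/disk/disk42 00000 current /dev/disk/disk52 00000 current
00001 /dev/disk/disk42 00001 current /dev/disk/disk52 00001 stale'

stale=$(printf '%s\n' "$sample" | count_stale)
if [ "$stale" -eq 0 ]; then
    echo "mirror fully synced; safe to run lvreduce"
else
    echo "$stale extent(s) still stale; wait before running lvreduce"
fi
```

On a live system the input would come from lvdisplay -v /dev/vgtest/lvtest piped into count_stale instead of the sample text.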


16 May 16 NFS with lots of small files: pump up performance

Three kernel parameters that might pump up NFS throughput. Your mileage may vary.

kctune nfs_enable_write_behind=1
kctune nfs_enable_ufc_threshold=1
kctune nfs3_ufc_threshold_percentage=50


03 Mar 15 Making sure MWA is running properly

What follows is a health check script that checks whether HP Operations Agent is installed and whether the two mwa daemons that measure performance (scopeux and midaemon) are running.

When run with the -y parameter, the script attempts a fix: it installs HP Operations Agent if it is missing, or starts the daemons if it is installed.

If you want the script, please email me via the site's response form. Cutting and pasting from this site can be done, but may be a very frustrating endeavor.

I have added commentary to the script, which may introduce run errors if screen scraped.

myserva:root > cat 247_mwarun
#!/bin/ksh
############################################################################
# Make sure scopeux and midaemon are running. With -y: start them if they
# are stopped, or install the agent first if it is not installed.

# Load common environment (expected to set ${hn} and ${sn})
. /var/adm/bin/.scriptenv
echo ". Checking for mwa software installed and running on ${hn}."

is=myserva
if [ "${hn}" = "myserva" ]; then is="myservb"; fi

ps -ef >/tmp/plist.txt

# Count matching process lines and installed bundles
srun=$(awk '/scopeux/{print $NF}' /tmp/plist.txt | wc -l)
mrun=$(awk '/midaemon/{print $NF}' /tmp/plist.txt | wc -l)
swlist -l bundle TC097EA > /tmp/swlist.txt
mwainst=$(awk '/TC097EA/{ print $NF}' /tmp/swlist.txt | wc -l)

#echo "scopeux procs running: $srun mwa installed: $mwainst"
# Default to report-only so the (( )) test below never sees an empty value
CHANGES=0
if [ "$1" = "-y" ]; then
    CHANGES=1
fi

if [ ${srun} -eq 0 ] || [ ${mrun} -eq 0 ]; then
    if (($CHANGES)); then
        if [ ${mwainst} -ne 1 ]
        then
            ### depot server location is in variable ${is}. This is an ignite depot server.
            swinstall -x mount_all_filesystems=false -s ${is}:/Depots/B.11.31/2014midyear_depot TC097EA
            rc=$?
            # Report the real swinstall result instead of assuming success
            if [ ${rc} -eq 0 ]; then
                echo "mwa TC097EA install succeeded checking sd on ${hn}..."
            else
                echo "mwa TC097EA install returned ${rc} checking sd on ${hn}..."
            fi
            swlist -l bundle TC097EA > /tmp/swlist.txt
            mwainst=$(awk '/TC097EA/{ print $NF}' /tmp/swlist.txt | wc -l)
            if [ ${mwainst} -eq 1 ]; then echo " pass - mwa NOW installed."; fi
            optstat=$(/var/adm/bin/bdfmegs "/opt " | awk '!/File-System/{print $5}')
            echo "${hn} /opt is ${optstat} full remediate if above 85% ..."
        else
            mwa start all
        fi
    else
        echo " NOTICE - mwa not installed or scopeux/midaemon is not running on ${hn} .(-y will fix)."
    fi
else
    echo " pass - mwa installed. scopeux/midaemon is running on ${hn}."
fi
optstat=$(/var/adm/bin/bdfmegs "/opt " | awk '!/File-System/{print $5}')
echo "${hn} /opt is ${optstat} full remediate if above 85% ..."
rm -f /tmp/plist.txt
rm -f /tmp/swlist.txt
echo "#### end report $0 ${sn} ####"

The script depends on Bill Hassell’s bdfmegs script. Plain bdf can be made to work with minor changes.
Typical output is:

myserv0:root > ./247_mwarun
Executing HP-UX specific environment parameters...
. Checking for mwa software installed and running on myserv0.
pass - mwa installed. scopeux/midaemon is running on myserv0.
myserv0 /opt is 68% full remediate if above 85% ...
#### end report ./247_mwarun myserv0 ####
myserv0:root > mwa stop all

Shutting down Perf Agent collection software
Shutting down scopeux, pid(s) 28345
The Perf Agent collector, scopeux has been shut down successfully.
NOTE: The ARM registration daemon ttd will be left running.

OVOA is running. Not shutting down coda
myserv0:root > ./247_mwarun
Executing HP-UX specific environment parameters...
. Checking for mwa software installed and running on myserv0.
NOTICE - mwa not installed or scopeux/midaemon is not running on myserv0 .(-y will fix).
myserv0 /opt is 68% full remediate if above 85% ...
#### end report ./247_mwarun myserv0 ####
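
For readers without bdfmegs, the /opt figure can be pulled from plain bdf output instead. The following is a sketch under assumptions: the function name pct_used is made up, and it relies on the percentage being the second-to-last field and the mount point the last field of the row, which also holds when bdf wraps a long device name onto its own line. The numbers in the /opt sample row are invented.

```shell
#!/bin/sh
# pct_used MOUNTPOINT (hypothetical helper)
# Reads bdf-style text on stdin and prints the %used column for MOUNTPOINT.
pct_used() {
    awk -v m="$1" '$NF == m { print $(NF - 1) }'
}

# Sample input: one single-line row and one wrapped row (values illustrative).
sample='Filesystem          kbytes    used   avail %used Mounted on
/dev/vgtest/lvtest 4194304   19544 3913845    0% /test
/dev/vg00/lvol6
                   8388608 5704253 2684355   68% /opt'

optstat=$(printf '%s\n' "$sample" | pct_used /opt)
echo "/opt is ${optstat} full remediate if above 85% ..."
```

In the script itself the two optstat lines would then read bdf /opt | pct_used /opt rather than calling bdfmegs.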


03 Mar 15 scopeux and midaemon don’t want to run

midaemon and scopeux work together to collect performance data on HP-UX.

Both need to be running for data to be collected properly.

They are part of a depot called measureware, which is part of the base OS.

To see if it is installed:
swlist -l bundle TC097EA
myserv0:root > swlist -l bundle TC097EA
# Initializing...
# Contacting target "myserv0"...
#
# Target: myserv0:/
#

TC097EA 11.20.000 HP Operations Agent

If it is not installed, HP Operations Agent can be downloaded from HP if you have a software contract with HP.

It is also delivered as part of OpenView, which is a separately licensed product.

I recently implemented performance data collection on a fleet of 100+ servers where I work.

On three of the servers, the daemons refused to run normally.

The following error was recorded in the file /var/opt/perf/status.mi:
Unable to find newly enabled CPU.
Please use -prealloc to allocate bufsets for all CPUs.

Here are the steps to fix it.
mwa stop all
/opt/perf/bin/ovpa stop
/opt/perf/bin/pctl stop
perfstat

Gently kill any processes that the perfstat output still shows as running.

Edit the file /etc/rc.config.d/ovpa
MIPARMS="-prealloc=2 -pids 10000 -kths 10000 -smdvss 512M"
export MIPARMS

The 2 in -prealloc=2 is the number of physical CPUs in the box.
If the file /var/opt/perf/datafiles/RUN is present, delete it.


mwa start all
perfstat

Check back after one hour and again after one day to confirm that midaemon and scopeux are still running.
Check /var/opt/perf/datafiles for updated log files.


20 Mar 14 Performance Measurement using measureware

Measureware Extract Documentation

 

 

 

Necessary Processes:

root@myserver:/tmp/fog> ps -ef | grep scopeux

    root  3246     1  0  Mar 19  ?         0:40 /opt/perf/bin/scopeux

List of possible reporting parameters:

/var/opt/perf/reptfile

 

The running datafiles live here:

root@myserver:/root> ll /var/opt/perf/datafiles
total 154528
-rw-r--r--   1 root       sys             31 Feb 21 08:50 RUN
-rw-r--r--   1 root       root           105 Sep 27  2012 agdb
-rw-r--r--   1 root       root             0 Sep 27  2012 agdb.lk
-rw-rw-rw-   1 root       root           168 Feb 21 18:28 classinfo.db
-rw-r--r--   1 root       root       4206652 Feb 21 18:20 logappl
-rw-r--r--   1 root       root       24054172 Feb 21 18:25 logdev
-rw-r--r--   1 root       root       6464936 Feb 21 18:25 logglob
-rw-r--r--   1 root       root        352232 Feb 21 10:55 logindx
-rw-r--r--   1 root       root            15 Sep 27  2012 logpcmd0
-rw-r--r--   1 root       root       32673802 Feb 21 18:28 logproc
-rw-r--r--   1 root       root       9740096 Feb 21 18:25 logtran
drwxr-xr-x   2 root       root            96 Sep 27  2012 lost+found
-rw-r--r--   1 root       root       1504540 Oct 31  2012 mikslp.db

 

Here is a typical template to generate data into a spreadsheet.

cat mwatemplate

REPORT "MWA Export on !SYSTEM_ID"
FORMAT ASCII
HEADINGS ON
SEPARATOR="|"
SUMMARY=60
MISSING=0
DATA TYPE GLOBAL
YEAR
DATE
TIME
GBL_CPU_TOTAL_UTIL
GBL_CPU_SYS_MODE_UTIL
GBL_CPU_USER_MODE_UTIL
GBL_CPU_SYSCALL_UTIL
GBL_CPU_INTERRUPT_UTIL
GBL_PRI_QUEUE
GBL_CPU_CSWITCH_UTIL
GBL_SWAP_SPACE_UTIL
GBL_DISK_UTIL_PEAK
GBL_DISK_SUBSYSTEM_QUEUE
GBL_MEM_UTIL
GBL_MEM_CACHE_HIT_PCT
GBL_MEM_PAGEIN_RATE
GBL_MEM_PAGEOUT_RATE
GBL_MEM_SWAPIN_RATE
GBL_MEM_SWAPOUT_RATE
GBL_MEM_QUEUE
GBL_NET_PACKET_RATE
GBL_NET_OUTQUEUE
GBL_NETWORK_SUBSYSTEM_QUEUE

 

Here is a script to process the measureware output and generate a spreadsheet using the above template file:

 

#################### Begin Sample Extract Script ####################
#!/usr/bin/ksh
#
# Extract to spreadsheet midnight to 6 am
/opt/perf/bin/extract -xp -r /root/mwatemplate -g -b today 0:00 -e today 06:00 -f testfile.txt
####################  End Sample Extract Script  ####################

 

It is simple but effective. The command above looks at data between midnight and 6 am today. A look at the man page for extract will show you how to look at different data sets. There are an endless number of options. Choose template options based on the nature of the problem you are facing.
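
Once the export exists, a quick summary can be had without opening a spreadsheet. The following is a sketch only: the function name avg_field is made up, the sample rows are invented, and it assumes the export is pipe-separated in the template's column order, so that column 4 is GBL_CPU_TOTAL_UTIL. Heading rows are skipped because their fourth field is not numeric.

```shell
#!/bin/sh
# avg_field COLUMN (hypothetical helper)
# Reads pipe-separated rows on stdin and prints the mean of the given
# 1-based column, ignoring rows where that column is not a plain number.
avg_field() {
    awk -F'|' -v col="$1" '
        $col ~ /^[ \t]*[0-9]+(\.[0-9]+)?[ \t]*$/ { sum += $col; n++ }
        END { if (n) printf "%.2f\n", sum / n; else print "no data" }'
}

# Invented sample in the assumed layout: YEAR|DATE|TIME|GBL_CPU_TOTAL_UTIL
sample='Year|Date|Time|CPU %
2014|03/20/14|00:00|10.0
2014|03/20/14|01:00|20.0'

printf '%s\n' "$sample" | avg_field 4
```

Against a real export the call would be avg_field 4 < testfile.txt, with the column number adjusted to whichever metric in the template is of interest.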

 

 

 


06 Nov 13 Getting EMC disk IDs

We want the storage team to check performance on three possibly problematic LUNs.

To do that we need the 4-character LUN IDs of three disks:

disk82, disk83 and disk123

/usr/bin/inq -nodots -sym_wwn | egrep "disk82|disk83|disk123" | awk '{print $3}' | awk '{ print substr( $0, length($0) - 3, length($0) ) }'

Output:

5422

5423

5826

HP-UX 11.31 September 2011 OE.
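
The awk at the end of that pipeline simply keeps the last four characters of the third inq column. The following is a stand-alone sketch of just that step; the function name last4 and the input strings are made up and are not real inq output.

```shell
#!/bin/sh
# last4 (hypothetical helper)
# Prints the last four characters of every input line. awk strings are
# 1-indexed, so the substring starts at length - 3.
last4() {
    awk '{ print substr($0, length($0) - 3) }'
}

# Made-up identifiers that merely end in the IDs shown above.
printf '%s\n' 'EXAMPLEWWN0005422' 'EXAMPLEWWN0005423' | last4
```

Omitting the third argument to substr gives the same result as the original, because asking for more characters than remain just returns the rest of the string.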

A good day is an awkful day.


29 Oct 10 HP-UX APA help guide

HP APA Commands using lanadmin and nwmgr

Task | Legacy Command | nwmgr Command
Display command help | lanadmin -X -H 900 | nwmgr --help -S apa
View link aggregate status | lanadmin -x -v 900 | nwmgr -c lan900
Create a MANUAL mode link aggregate | lanadmin -X -a 1 2 900 | nwmgr -a -A links=1,2 -A mode=MANUAL -I 900 -S apa
Create a failover group | lanapplyconf | nwmgr -a -A links=1,2 -A mode=LAN_MONITOR -I 900 -S apa
Remove all ports from a link aggregate | lanadmin -X -c 900 | nwmgr -d -A links=all -I 900 -S apa
Remove all ports from a failover group | landeleteconf -g lan900 | nwmgr -d -A links=all -c lan900
Remove specific ports from a link aggregate | lanadmin -X -d 1 2 900 | nwmgr -d -A links=1,2 -I 900 -S apa
Update the load balancing algorithm and group capability for a link aggregate | lanadmin -X -l LB_MAC 900 ; lanadmin -X -g 900 900 900 | nwmgr -s -A lb=LB_MAC, gc=900 -I 900 -S apa
Update the group capability and configuration mode for a port | lanadmin -X -p 3 900 900 ; lanadmin -X -p 3 FEC_AUTO 900 | nwmgr -s -A gc=900, mode=FEC_AUTO -I 3 -S apa
Update the group capability for a link aggregate | lanadmin -X -g 900 900 900 | nwmgr -s -A gc=900 -I 900 -S apa
Update the administrative key and load balancing for a link aggregate | lanadmin -X -k 900 900 900 ; lanadmin -X -l LB_IP 900 | nwmgr -s -A key=900 -A lb=LB_IP -I 900 -S apa
Update the administrative key and configuration mode for a port | lanadmin -X -k 4 900 900 ; lanadmin -X -p 4 LACP_AUTO 900 | nwmgr -s -A key=900 -A mode=LACP_AUTO -I 4 -S apa
Update the administrative key for a port | lanadmin -X -k 4 900 900 | nwmgr -s -A key=900 -I 4 -S apa
Update the load balancing | lanadmin -X -l LB_IP 900 | nwmgr -s -A lb=LB_IP -I 900 -S apa
Set the configuration mode on a port | lanadmin -X -p 5 MANUAL 900 | nwmgr -s -A mode=MANUAL -I 5 -S apa
Set the system priority on a port | lanadmin -X -s 5 10 900 | nwmgr -s -A sys_pri=10 -I 5 -S apa
Display the MAC address | lanadmin -a 900 | nwmgr -A mac -c lan900
Display the speed | lanadmin -s 900 | nwmgr -A speed -c lan900
Display the MTU, MAC address, and speed | lanadmin -m -a -s 900 | nwmgr -A mtu,mac,speed -c lan900 ; nwmgr -A all -c lan900
Display group capability | lanadmin -x -g 5 900 | nwmgr -A gc -I 5 -S apa
Display aggregate port status | lanadmin -x -i 900 | nwmgr -A all -c lan900
Display administrative key | lanadmin -x -k 5 900 | nwmgr -A key -I 5 -S apa
Display load balancing algorithm | lanadmin -x -l 900 | nwmgr -A lb -c lan900 -S apa
Display port status | lanadmin -x -p 5 900 | nwmgr -A mode -I 5 -S apa
Display system priority | lanadmin -x -s 5 900 | nwmgr -A sys_pri -I 5 -S apa
Display current port priority | lanadmin -x -t 5 900 | nwmgr -A port_pri -I 5 -S apa
Display aggregate status | lanadmin -x -v 900 | nwmgr -v -c lan900
Check network connectivity | linkloop -i 900 0xaabbccddeeff | nwmgr --diag -A dest=0xaabbccddeeff -c lan900
Get statistics | lanadmin -g 900 | nwmgr --st -c lan900
Monitoring statistics | apa-monitor -p 5 | nwmgr --st monitor -S apa -I 900
Reset an APA interface | lanadmin -r 900 | nwmgr -r -c lan900
Reset statistics | lanadmin -c 900 | nwmgr -r --st -c lan900
View basic help | lanadmin -x -h 900 | nwmgr -h -S apa
View verbose help | lanadmin -X -H 900 | nwmgr -h -v -S apa
Clear data flows on a link aggregate | lanadmin -X -o 900 | nwmgr -r -q data_flow -c lan900
List all interfaces on the system | lanscan | nwmgr
List all APA interfaces | lanscan -q | nwmgr -S apa

nwmgr -s -f -c lan1 -A mtu=1500 --cu

### change mtu on lan1 to 1500 (lanadmin -M 1500 1)

Found some really useful information on APA. So good I won’t risk it disappearing. Pretty much here for my own reference.


09 Sep 09 Case Study: Capacity & Migration planning for a small organization

This is our first case study. The events leading up to it occurred between 1998 and 2002. It is a real life case study based on my experience. For legal reasons, I cannot identify the organization. It is a charity that now raises around $100 million; 92% of funds raised go to actual charitable work and 8% is overhead. IT infrastructure is overhead, even though it is critical to actually raising funds.

From 1991 to 2005 I worked at this charity in IT, first as a programmer analyst, then as a DBA, finally becoming the backup Unix Admin in 1998 and the full time Unix Admin in 2000. The organization ran its legacy fund raising systems on a pair of D class HP-UX systems. The back end database was Software AG Adabas. The fund raising user community wanted an SQL-like ability to look into the database and run queries. They wanted flexible use of strategic data. An attempt was made in early 1997 to install an SQL front end, but it did not provide acceptable results.

An internal study was done and it was decided in late 1997 to migrate legacy systems to a web based front end, with Oracle as the back end database, Oracle Application Server using forms and reports to build applications. Initially no plan was made to migrate to stronger hardware, due to the assurance from Oracle that their software would run on the existing infrastructure.

By 2000 it was obvious that this was not true. Though the database server itself ran acceptably, there was not sufficient memory or disk capacity to run the application server. So I was asked to prepare a plan to migrate legacy systems. Here were the guidelines:

  • To run three environments, to be described below, each with a database server, an application server and forms and reports development tools on them.
  • Sandbox was to be used to test OS patches, Oracle patches, and tools upgrades. It was to belong to the systems administrator who was permitted to restart this system on short or no notice.
  • The development environment was to be where the developers were to develop code. It needed to be stable and available 100% during normal development hours, 8 a.m. to 6 p.m. Any changes made to this system were first to be vetted on the sandbox system.
  • The production system had the same uptime requirements, except that all changes needed to be vetted first on the other two systems.
  • The hardware was to be the same model for all the systems. This was defined to avoid hardware surprises. Only the production system needed to be at full capacity. The other systems were to be the same to permit realistic load testing.
  • Databases would be hosted on SAN disk with an HBA fiber channel connection. Systems were to boot locally.

Overall, I thought this was a solid foundation. Some of the points were made by management, some were suggested by me.

The following basic technical requirements were developed:

  • Overall database needed to be approximately 5 GB per server. Actual use hit 15 GB by 2005. This growth was planned for.
  • Oracle Server: one instance had to run on each server.
  • Oracle Application Server: one instance had to run on each server.
  • Legacy applications (Natural/Software AG Adabas) needed to run on each server.
  • Server configuration needed to be managed and tracked responsibly.
  • HP-UX bi-annual updates needed to be installed on a timely basis after quality assurance.
  • The replacement cycle on hardware would be 3-5 years to maintain cost savings provided by being under warranty (first three years).

Deployment Diagram

[Diagram: Server Deployment]

Other Relevant facts on the decision making process.

  • HP Hardware and Software agreements were running over $30,000 per year on existing infrastructure.
  • Much of the cost was hardware support due to the age and near obsolescence of the hardware.
  • Significant savings could be obtained by using current hardware that was under warranty.
  • Systems would be configured and used to provide a disaster recovery solution.

Three vendors were picked to provide proposals. All ended up recommending HP-9000 L2000 (later renamed rp5450) servers. Here are the highlights:

  • rp5450 systems with 2 GB system memory.
  • Dual 146 GB disks to serve as boot disks with software mirroring.
  • 2 CPUs would be installed per server.
  • Memory capacity and purchase were planned to enable an upgrade to 8 GB without replacing existing memory.
  • Two HBA Fiber channel cards provided per machine to provide redundancy and fail over.
  • A capital budget request was made showing that support cost savings would, over the course of 4 years, completely recover the cost of the systems.
  • Systems would each have an Ultrium tape drive for locally provided backups and Ignite-UX make_tape_recovery backups as part of the DR plan.
  • Systems had two Gigabit Network Interface cards.
  • Systems would have a private network for use in Ignite backup, recovery and system replication.
  • Systems were to be delivered with HP-UX 11.11.
  • HP provided rack, UPS and PDU were specified.

How it went:

  • Systems were delivered in May of 2002.
  • Initial OS install began immediately. Systems were initially delivered with HP-UX 11.00. We delayed start of installation until correct media was provided.
  • All three systems were installed with a base OS to ensure that hardware was working.
  • OS patch requirements for Oracle, security and bi-annual updates were installed on the sand box. It was decided that Ignite Golden Image would be used to replicate the sand box configuration, once a stable configuration was found.
  • Significant problems were encountered with the Oracle and Oracle Application Server installations. The version was changed twice. Several major Oracle patch sets had to be installed to deal with “show stopper” bugs that were encountered.
  • After the September 11 attacks in New York City in 2001, a security review was conducted and the deployment plan was modified to include improved security. Several rounds of patching and tools testing occurred on the OS level.
  • In December of 2002, the application development team notified us that they were satisfied with the sandbox and asked that an Ignite image be made and transferred to the development system.
  • In January-February of 2003 imaging was done and the system was replicated. There were OS problems with the Ignite replication that took several weeks to work out.
  • Several changes were requested by the development staff. They were tested on the sand box and then deployed on the development system.
  • An Ignite central server was built on the sand box to handle images which were shared on NFS and available for use after booting of the sandbox Ignite configuration.
  • In June of 2003 after several change cycles the configuration was approved for deployment.
  • Ignite replication was completed on the production environment using the sandbox, which had been frozen for this purpose, as the image template.
  • In August of 2003 all legacy systems were cut over to the rp5450 systems. HR would be migrated 18 months later due to integration issues.
  • In early 2004, due to performance and memory use issues, all systems were upgraded to 8 GB of system RAM.
  • For the year 2004 there was no downtime in production systems during normal business hours.
  • Weekly Ignite tape backups were taken on all systems and network based backup to shared NFS was used as a secondary DR method.
  • In February of 2004 a DR test was run at the HP Performance center and we successfully migrated a sandbox image to an rp5470 server in the HP infrastructure. Legacy systems were tested and approved as functional.

Note: This document was designed entirely using the WordPress interface and a Linux system. The diagram was created with a free Linux alternative to Visio called dia. The tool is in evaluation, and might be replaced. Still a pretty good start. Cost to produce this environment in licensing fees? Zero dollars.


01 Oct 07 Memory leak detector

Memory leak detector. Capable of running on HP-UX, Linux and SunOS.

We will give credit and weblinks to authors that give us modules for other operating systems. That will, however, obligate you to support your code.

The utility
The pdf man page
The html (loads faster) version of the man page

This should be considered late beta code. We need input to find bugs. We really want to add other operating systems to the list.

Future Plans: We hope to build an analyzer that parses the log file and presents candidates as possible memory leaks.
