Tips and Tricks site for advanced HP-UX Engineers

17 Jun 20 HP-UX Serviceguard Missing node

If a node drops out of an HP-UX Serviceguard cluster because of a hardware fault or other calamity, you lose the ability to make incremental changes to the cluster until that node comes back.

If you need to make a change to a cluster in this state and you don’t want to bring down the cluster, you have to do all your changes with one gigantic command line.

Let’s say you have a 4-node cluster with nodes named cnode1, cnode2, cnode3, and cnode4.

cnode4 suffers a hardware fault and your packages fail over to cnode1-3. But your usage has grown, one package is beating the hardware down, and you want to move it from cnode2 to cnode1.

Well, you can’t do it incrementally. You have to do it all at once. I recently ran into a situation where I had to modify 37 cluster environment files and the cluster configuration to remove node cnode4.

That requires you to correctly type a command line that could easily be in excess of 4000 characters. Anybody who knows my typing skills knows this is beyond my abilities on my best day.

So I wrote a little assistant program.

It consists of three files, two of which are scripts.

pkg-mod-list is a list of all the package configuration files, with full paths, that need to be modified. How you handle the editing is your choice; we used Ansible last night when we did it in a DR cluster.

Contents …

/etc/cmcluster/nc-package-name/nc-package-name.env

/etc/cmcluster/sc-package-name/sc-package-name.env

Then we have helper scripts which put the command line together.

myclusterV6_Prod.conf is the main cluster configuration file with the references to node cnode4 commented out.

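cat missing-node-checkconf
MAIN="cmcheckconf -C /etc/cmcluster/configs/myclusterV6_Prod.conf"
PCMD=""
# Feed the list in by redirection rather than a pipe so that PCMD
# keeps its value after the loop in any POSIX shell
while read -r pfile
do
[ -n "${pfile}" ] || continue
PCMD="${PCMD} -P ${pfile}"
### echo "$PCMD"
done < pkg-mod-list
MYCMD="${MAIN} ${PCMD}"
echo ${MYCMD}

exec ${MYCMD}

The second script is identical except that it calls cmapplyconf:

MAIN="cmapplyconf -C /etc/cmcluster/configs/myclusterV6_Prod.conf"
PCMD=""
while read -r pfile
do
[ -n "${pfile}" ] || continue
PCMD="${PCMD} -P ${pfile}"
### echo "$PCMD"
done < pkg-mod-list
MYCMD="${MAIN} ${PCMD}"
echo ${MYCMD}

exec ${MYCMD}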
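
As a rough illustration of what the two helper scripts do, here is a minimal sketch that builds the same kind of argument string from a list file but only prints the resulting command instead of running it. The function name build_sg_cmd and the dry-run approach are my own assumptions for illustration; they are not part of the scripts above.

```shell
#!/bin/sh
# build_sg_cmd VERB CLUSTER_CONF LIST_FILE   (hypothetical helper)
# Prints the full command line without executing it, so it can be
# reviewed before anything touches the cluster.
build_sg_cmd() {
    verb=$1
    conf=$2
    list=$3
    args=""
    while read -r pfile; do
        [ -n "$pfile" ] || continue      # skip blank lines in the list
        args="$args -P $pfile"
    done < "$list"
    echo "$verb -C $conf$args"
}
```

Called as build_sg_cmd cmcheckconf /etc/cmcluster/configs/myclusterV6_Prod.conf pkg-mod-list, it prints one long line that can be eyeballed, or saved to a file, before the real run.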

13 Nov 19 Using File Descriptors other than stdin/stdout/stderr in Shell Scripting

For a longish time, I had to jump through major hoops to script around the issue of my standard input getting clobbered inside a loop that was iterating over something coming from stdin. I don’t have an exact example that recreates the issue, but something like this would generate lots of headaches:

cat <some file> | awk '{<some fancy awk-type thingys>}' | \
  while read entry; do
    bla; bla; bla
    some_command_here_that_would_whack_stdin
    bla; bla; bla
  done

Sorry I cannot provide an actual snippet of code to recreate the issue, but what I would find in these situations is that my loop would end after one iteration (when I knew there should have been a lot more) and I couldn’t figure out why, which made me haz a sad. The usual culprit is a command inside the loop that itself reads stdin and swallows the rest of the input meant for read.

Well, I found a way to steer clear of all that by assigning a <some file> to a file descriptor different from stdin. To wit:

exec 3< /path/to/file
while read entry <&3; do
   bla; bla; bla
   some_command_here_that_would_whack_stdin
   bla; bla; bla
done

There are no doubt other ways to solve this conundrum but this is the way I have avoided it for quite some time now. Of course, make sure you do not associate a file with file descriptors 0,1,2 (unless you are quite sure that is what you want to do!).
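
For anyone who wants to see the effect, here is a small self-contained sketch of both behaviours. The greedy function is only a stand-in I made up for "some command that whacks stdin"; the temporary file and variable names are likewise illustrative.

```shell
#!/bin/sh
# greedy stands in for any command that reads stdin (hypothetical).
greedy() { cat > /dev/null; }

list=$(mktemp)
printf 'one\ntwo\nthree\n' > "$list"

# Loop fed on stdin: greedy swallows the remaining lines,
# so the loop ends after its first pass.
piped=0
while read -r entry; do
    piped=$((piped + 1))
    greedy
done < "$list"

# Loop fed on descriptor 3: read pulls from fd 3 while greedy
# only sees the loop's own stdin.
fd3=0
exec 3< "$list"
while read -r entry <&3; do
    fd3=$((fd3 + 1))
    greedy
done < /dev/null      # demo only: keeps greedy from waiting on a terminal
exec 3<&-             # close the descriptor when finished

rm -f "$list"
echo "stdin loop iterations: $piped"
echo "fd 3 loop iterations:  $fd3"
```

The first loop stops early because the file is drained behind its back; the second one sees every line.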

10 Nov 19 How to be a yes man

Learning something new is great. Joe Geiger taught me something cool that I should have learned years ago.

Have you ever wanted to script a Serviceguard cluster change such as a node add?

cmapplyconf -v -P <package file>

That command ends with a y/n prompt asking whether you want to apply the change. Normally that requires input. Not with the yes command:

cmcheckconf -v -P <package file>
rc=$?

# Check return code; if not zero, stop
if [ ${rc} -ne 0 ]
then
  echo "Checkconf error ${rc}"
  exit ${rc}
fi

yes | cmapplyconf -v -P <package file>

# Check return code here as well
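
To see how yes feeds a prompt without touching a real cluster, here is a tiny sketch. The function ask_then_apply is a made-up stand-in for a tool that asks before acting (it is not a Serviceguard command), and run_unattended is an equally hypothetical wrapper.

```shell
#!/bin/sh
# ask_then_apply: hypothetical stand-in for a command that prompts
# before making a change.
ask_then_apply() {
    printf 'Apply the change ([y]/n)? '
    read -r answer
    case $answer in
        y|Y) echo "applied"; return 0 ;;
        *)   echo "aborted"; return 1 ;;
    esac
}

# run_unattended CMD [ARGS...]: answer every prompt of CMD with "y".
# yes keeps writing "y" lines until the reader goes away.
run_unattended() {
    yes | "$@"
}
```

Running run_unattended ask_then_apply answers the prompt with y and returns the exit status of the prompted command, which is why checking the return code after the pipeline still makes sense.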


15 Aug 19 APA network pairings: How to find out fast what they are

Script for detecting APA network bonded pairs. It is already built into the cinam21t drd image. It will save you 3-5 hours of guesswork on future builds.

Networking was changed to protect the innocent.

Here is an example:

[root@cinam21t]:/home/root # ./apanetwork_discover 142.18.1.26 142.18.1.96
---------------------------------------------------------
-This script figures out which NIC cards are APA paired.-
-It has two inputs:.....................................-
-1- The assigned IP address of the APA Group lan90#.....-
-2- The known network address of an HP-UX server on net.-
-ex ./apanetwork_discover 142.18.1.26 142.18.1.96 ......-
- These are cinam21t and stlam31t.......................-
- The system must be OFF network for this to work ......-
- Instruction: .........................................-
- /sbin/init.d/net stop ................................-
- /sbin/init.d/vlan stop ...............................-
- /sbin/init.d/hplm stop ...............................-
- /sbin/init.d/hpapa stop (You may need to ctrl-break...-
- netstat -rn (ifconfig lan# down then unplumb any lans.-
- Wash,rinse and repeat for lan901,lan902,lan903 .......-
---------------------------------------------------------
The LAN is lan0
Success lan0 as 142.18.1.26 was able to ping 142.18.1.96
The LAN is lan8
NO JOY lan8 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan16
NO JOY lan16 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan19
NO JOY lan19 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan2
NO JOY lan2 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan49
NO JOY lan49 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan52
NO JOY lan52 as 142.18.1.26 was able NOT to ping 142.18.1.96
The LAN is lan56
Success lan56 as 142.18.1.26 was able to ping 142.18.1.96
[root@cinam21t]:/home/root #

In this case lan0 and lan56 are the bonded pair (lan900).

Capture the nwmgr output before bringing the network down, and run the script from the console only.

Here is the script code:

/root/build # cat apanetwork_discover
#!/bin/ksh
#
echo "---------------------------------------------------------"
echo "-This script figures out which NIC cards are APA paired.-"
echo "-It has two inputs:.....................................-"
echo "-1- The assigned IP address of the APA Group lan90#.....-"
echo "-2- The known network address of an HP-UX server on net.-"
echo "-ex ./apanetwork_discover 172.19.1.26 172.19.1.96 ......-"
echo "- These are stlam34t and stlam31t.......................-"
echo "- The system must be OFF network for this to work ......-"
echo "- Instruction: .........................................-"
echo "- /sbin/init.d/net stop ................................-"
echo "- /sbin/init.d/vlan stop ...............................-"
echo "- /sbin/init.d/hplm stop ...............................-"
echo "- /sbin/init.d/hpapa stop (You may need to ctrl-break...-"
echo "- netstat -rn (ifconfig lan# down then unplumb any lans.-"
echo "- Wash,rinse and repeat for lan901,lan902,lan903 .......-"
echo "---------------------------------------------------------"
IP2=$2
IPADDY=$1

# Walk every non-APA interface that nwmgr reports as UP
nwmgr | awk '!/hp_apa/{ printf "%s %s\n", $1,$2 }' | awk '/UP/{print $1}' | while read -r LN
do

 sleep 1
 echo "The LAN is ${LN}"
 # Plumb the APA address on this interface and try one ping
 ifconfig ${LN} ${IPADDY} netmask 255.255.255.0 up > /dev/null
 ping ${IP2} -n 1 -m 5 > /dev/null
 rc=$?
 if [ $rc -eq 0 ]
 then
   echo "Success $LN as $IPADDY was able to ping $IP2"
 else
   echo "NO JOY $LN as $IPADDY was able NOT to ping $IP2"
 fi
 # Take the interface back down before moving to the next one
 ifconfig ${LN} down
 ifconfig ${LN} unplumb

done
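
Because the script plumbs whatever it is given as $1 onto every interface in turn, a typo in an address is worth catching before the loop starts. The following is only a sketch of one way to do that; the function name valid_ipv4 is hypothetical and nothing like it exists in the script above.

```shell
#!/bin/sh
# valid_ipv4 ADDR   (hypothetical helper)
# Succeeds only when ADDR is four dot-separated numbers, each 0-255.
valid_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        {
            for (i = 1; i <= 4; i++)
                if ($i !~ /^[0-9]+$/ || $i + 0 > 255)
                    exit 1
        }'
}
```

A guard such as valid_ipv4 "$1" || exit 1 near the top of a script would stop the run before any interface is touched.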

28 Nov 17 An xpinfo that works in hpvm guests and on non Hitachi storage

Hitachi shops faced annoyance times two:
1. xpinfo does not work on non-Hitachi storage, for example Pure Storage.
2. xpinfo does not work on hpvm guests, depending on how the storage is passed through from the hpvm host.

I now present xpinfonew which, though raw and unfinished, handles both cases.
The output:

myserv0:root > ./xpinfonew
Device path ldev
==========================================================================
/dev/rdisk/disk111 =:=
/dev/rdisk/disk12 30:86
/dev/rdisk/disk172 03:f3
/dev/rdisk/disk215 46:2c
/dev/rdisk/disk216 46:30
/dev/rdisk/disk217 46:34
/dev/rdisk/disk218 46:38
/dev/rdisk/disk219 46:28
/dev/rdisk/disk220 46:25
/dev/rdisk/disk221 46:27
/dev/rdisk/disk222 46:2a
/dev/rdisk/disk223 46:2e
/dev/rdisk/disk224 46:32
/dev/rdisk/disk225 46:2b
/dev/rdisk/disk226 46:2f
/dev/rdisk/disk227 46:33
/dev/rdisk/disk237 46:37
/dev/rdisk/disk238 46:36
/dev/rdisk/disk239 46:26
/dev/rdisk/disk240 46:29
/dev/rdisk/disk241 46:2d
/dev/rdisk/disk242 46:31
/dev/rdisk/disk243 46:35
/dev/rdisk/disk244 46:39
/dev/rdisk/disk4 aa:bf
/dev/rdisk/disk5 8b:c3
/dev/rdisk/disk6 03:a6
/dev/rdisk/disk9 01:00

myserv0:root > ./xpinfonew raw
Device path ldev
==========================================================================
/dev/rdisk/disk111 =
/dev/rdisk/disk12 3086
/dev/rdisk/disk172 03f3
/dev/rdisk/disk215 462c
/dev/rdisk/disk216 4630
/dev/rdisk/disk217 4634
/dev/rdisk/disk218 4638
/dev/rdisk/disk219 4628
/dev/rdisk/disk220 4625
/dev/rdisk/disk221 4627
/dev/rdisk/disk222 462a
/dev/rdisk/disk223 462e
/dev/rdisk/disk224 4632
/dev/rdisk/disk225 462b
/dev/rdisk/disk226 462f
/dev/rdisk/disk227 4633
/dev/rdisk/disk237 4637
/dev/rdisk/disk238 4636
/dev/rdisk/disk239 4626
/dev/rdisk/disk240 4629
/dev/rdisk/disk241 462d
/dev/rdisk/disk242 4631
/dev/rdisk/disk243 4635
/dev/rdisk/disk244 4639
/dev/rdisk/disk4 aabf
/dev/rdisk/disk5 8bc3
/dev/rdisk/disk6 03a6
/dev/rdisk/disk9 0100

cat xpinfonew
#!/bin/ksh
# Get ldev from any disk regardless of storage provider
#
# 10/26/2017 Steven "Shmuel" Protter steven.protter@hcl.com
#

echo "Device path \t\t ldev "
echo "=========================================================================="

ioscan -NfnCdisk | awk '/rdisk/{ print $(NF) }' | awk -F_ '{ print $1 }' | sort -u |while read -r dv
do
ldev=$(/var/adm/bin/getldev.ksh ${dv} ${1} );
echo "${dv} \t ${ldev}"
done

The code:
cat /var/adm/bin/getldev.ksh
#!/bin/ksh
# Get ldev from any disk regardless of storage provider
#
# 10/26/2017 Steven "Shmuel" Protter steven.protter@hcl.com
#
argies=$#
if [ $argies -eq 0 ]
then
echo "------------ 1 argument required device path ex: /dev/rdisk/disk101 -------------"
exit 1
fi
dv=$1
fmt=$2
## /usr/sbin/scsimgr lun_map -D ${dv} | awk '/World Wide Identifier/{ print $(NF) }'
# Keep the last four characters of the World Wide Identifier
rldev=$(/usr/sbin/scsimgr lun_map -D ${dv} | awk '/World Wide Identifier/{ print substr ( $NF, length($NF) - 3, length($NF) ) }');

# Split those four characters into two pairs
l1=$(echo ${rldev} | awk '{ print substr ( $NF, length($NF) - 3, 2 ) }');
l2=$(echo ${rldev} | awk '{ print substr ( $NF, length($NF) - 1, length($NF) ) }');

### echo "raw: ${rldev} l1: ${l1} l2: ${l2} ..."
if [ "$fmt" = "raw" ]
then
echo ${rldev}
else
echo "${l1}:${l2}"
fi

It should work on any SAN-based storage.
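
The string handling in getldev.ksh can be tried without scsimgr. The sketch below mirrors it on a plain string; the function name ldev_from_wwid is hypothetical, and the identifier used in the usage note is made up purely for the demonstration.

```shell
#!/bin/sh
# ldev_from_wwid WWID [raw]   (hypothetical helper)
# Takes the last four characters of WWID and prints them as two
# colon-separated pairs, or unsplit when the second argument is "raw".
ldev_from_wwid() {
    printf '%s\n' "$1" | awk -v fmt="$2" '{
        tail = substr($NF, length($NF) - 3)     # last four characters
        if (fmt == "raw")
            print tail
        else
            print substr(tail, 1, 2) ":" substr(tail, 3, 2)
    }'
}
```

With an invented identifier ending in 03f3, ldev_from_wwid 0x60060e8007c3a2000030c3a2000003f3 prints the pair form and adding raw as a second argument prints the four characters unsplit.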


09 Feb 17 A Script to identify entries for a particular user

Starting a series on automation scripting.

This one is meant to be run from a master of the universe host, e.g. a host with root public keys placed on all work servers.

cat searchforid.ksh

#!/usr/bin/ksh
#
# test script
#
. ./.scriptenv
# provides standardization for example SSH_CMD="ssh -q -f -o ConnectionAttempts=3 -o ConnectTimeout=10 -o PasswordAuthentication=no -o BatchMode=yes"

LF="${LOGDIR}/${0}.logfile.txt"
> ${LF}
sc=0

uid=$1
date >> ${LF}

awk '{ print $1 }' $serverlist | while read -r hn
do
echo "################### ${hn} searching for user ${uid} ######################"
echo "################### ${hn} searching for user ${uid} ######################" >> ${LF}
if [ "${hn}" != "mygush0" ]
then
  # Remote host: show the matches on screen, then log them
  ${SSH_CMD} ${hn} "grep ${uid} /opt/iexpress/sudo/etc/sudoers;grep ${uid} /etc/passwd"
  sleep 5
  ${SSH_CMD} ${hn} "grep ${uid} /opt/iexpress/sudo/etc/sudoers;grep ${uid} /etc/passwd" >> ${LF}

else
  # Local host: no ssh needed
  grep ${uid} /opt/iexpress/sudo/etc/sudoers;grep ${uid} /etc/passwd
  grep ${uid} /opt/iexpress/sudo/etc/sudoers >> ${LF};grep ${uid} /etc/passwd >> ${LF}
  echo "#######################################################################################################"
  echo "#######################################################################################################" >> ${LF}

fi
done
echo "Success count: ${sc} " >> ${LF}
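
One thing to keep in mind with a plain grep is that it matches substrings, so a search for bob also reports bobby. As a hedged alternative for the passwd half of the search, the sketch below matches the login field exactly; the function name passwd_entry is hypothetical and is not part of the script above.

```shell
#!/bin/sh
# passwd_entry USER [FILE]   (hypothetical helper)
# Prints only the line whose first colon-separated field is exactly
# USER and returns non-zero when no such line exists.
# FILE defaults to /etc/passwd.
passwd_entry() {
    awk -F: -v u="$1" '
        $1 == u { print; found = 1 }
        END     { exit !found }' "${2:-/etc/passwd}"
}
```

The non-zero return makes it easy to count hits and misses per host, should you want the success counter in the log to mean something.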


15 Jan 16 Automated setboot check and correction

When you use drd to patch and update systems offline to reduce downtime there is an unintended impact: setboot issues.

Even when you follow HP best practices, after you boot the new image the setboot -a (alternate) and -p (primary) settings are often the same.

Below is an audit and correction script that helps you track the issue and limit manual intervention and the human error it can introduce:
myserv0:root > cat 349_bootconf
#!/bin/ksh
#########################################################################
# default_umask

#HPUX_SCRIPTS=/opt/depots/scripts/system_build/HPUX
#COMMON=/opt/depots/scripts/system_build/COMMON
# Load common environment
. /var/adm/bin/.scriptenv

#
# The point here is there should be a primary boot disk
# and an alternate boot disk and they need to be different
#
pboot=$(/usr/sbin/setboot | grep ^Primary | awk '{ print $NF }' | awk -F\/ '{print $NF}' |
awk -F\) '{print $1}');
aboot=$(/usr/sbin/setboot | grep ^Alternate |awk '{ print $NF }'|awk -F\/ '{print $NF}' |
awk -F\) '{print $1}');

if [ "$aboot" = "$pboot" ]
then
echo "NOTICE - ${hn} The primary boot disk ${pboot} is the same as the alternate boot disk ${aboot}"
else
echo "pass - The primary boot disk ${pboot} is the different than the alternate boot disk ${aboot}"

fi

if [ "$1" = "-y" ];then
echo "This may need to be remediated manually."

#
# attempt to figure this out in an automated fashion
#
#
# Determine what the boot dg is.
# Try to use DRD configuration to determine the alt. boot disk and set it.
> /tmp/drdstatus.tfile.txt
/opt/drd/bin/drd status -x logfile=/tmp/drdstatus.tfile.txt
CLONE_DISK=$(awk '/Clone Disk: /{ print $NF}' /tmp/drdstatus.tfile.txt | awk -F\/ '{ print $4 }' | awk -F\) '{ print $1 }');
echo "Clone disk is ${CLONE_DISK}"
setboot -a /dev/rdisk/${CLONE_DISK}
fi

echo "#### end report $0 ${sn} ####"
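
The disk-name extraction that the script does with a chain of grep and awk calls can be tried offline against saved setboot output. This is only a sketch under that assumption; the function name boot_disk is hypothetical.

```shell
#!/bin/sh
# boot_disk LABEL   (hypothetical helper)
# Reads setboot output on stdin and prints the disk name from the line
# that starts with LABEL ("Primary" or "Alternate"). Lines with no
# device in parentheses print nothing.
boot_disk() {
    awk -v label="$1" 'index($0, label) == 1 && $NF ~ /^\(/ {
        name = $NF                  # e.g. (/dev/rdisk/disk2490)
        sub(/^.*\//, "", name)      # drop everything up to the last slash
        sub(/\)$/, "", name)        # drop the closing parenthesis
        print name
    }'
}
```

Fed the output of setboot, boot_disk Primary and boot_disk Alternate give the two names to compare; the "HA Alternate bootpath" line is ignored because it does not start with either label.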

Run as an audit, the script reports:

myserver0:root > /var/adm/bin/audit/349_bootconf
Executing HP-UX specific environment parameters...
NOTICE - The primary boot disk disk1972 is the same as the alternate boot disk disk1972
#### end report /var/adm/bin/audit/349_bootconf myserv0 ####

mysys03:root > setboot
Primary bootpath : 2/0/2/1/0/4/1.0x50060e80166f4202.0x4001000000000000 (/dev/rdisk/disk2490)
HA Alternate bootpath :
Alternate bootpath : 2/0/2/1/0/4/0.0x50060e80166f4212.0x4001000000000000 (/dev/rdisk/disk2490)

Autoboot is ON (enabled)
Hyperthreading : ON
: ON (next boot)

This is wrong, but it is a known issue that results from my patch methodology.

The first step in fixing it is to confirm the currently booted disk and the drd details:

mysys03:root > lvlnboot -v
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
/dev/disk/disk2490_p2 -- Boot Disk
Boot: lvol1 on: /dev/disk/disk2490_p2
Root: lvol3 on: /dev/disk/disk2490_p2
Swap: lvol2 on: /dev/disk/disk2490_p2
Dump: lvol2 on: /dev/disk/disk2490_p2, 0

lvlnboot: Volume group not activated.
Cannot display volume group "/dev/vgAP1".
lvlnboot: Volume group not activated.
Cannot display volume group "/dev/vgsapAP1".
mysys03:root > cat /var/adm/bin/drd_data
DISK1=/dev/disk/disk2490
DISK2=/dev/disk/disk1951
mysys03:root > drd status

======= 01/14/16 14:15:17 PST BEGIN Displaying DRD Clone Image Information
(user=root) (jobid=mysys03)

* Clone Disk: /dev/disk/disk1951
* Clone EFI Partition: AUTO file present, Boot loader present
* Clone Rehost Status: SYSINFO.TXT not present
* Clone Creation Date: 01/07/16 15:00:27 PST
* Last Sync Date: None
* Clone Mirror Disk: None
* Mirror EFI Partition: None
* Original Disk: /dev/disk/disk2490
* Original EFI Partition: AUTO file present, Boot loader present
* Original Rehost Status: SYSINFO.TXT not present
* Booted Disk: Original Disk (/dev/disk/disk2490)
* Activated Disk: Original Disk (/dev/disk/disk2490)

======= 01/14/16 14:15:40 PST END Displaying DRD Clone Image Information
succeeded. (user=root) (jobid=mysys03)
The fix is currently manual:

mysys03:root > setboot -a /dev/rdisk/disk1951
Alternate boot path set to 2/0/2/1/0/4/0.0x50060e80166f4212.0x4000000000000000 (/dev/rdisk/disk1951)
mysys03:root > setboot
Primary bootpath : 2/0/2/1/0/4/1.0x50060e80166f4202.0x4001000000000000 (/dev/rdisk/disk2490)
HA Alternate bootpath :
Alternate bootpath : 2/0/2/1/0/4/0.0x50060e80166f4212.0x4000000000000000 (/dev/rdisk/disk1951)

Autoboot is ON (enabled)
Hyperthreading : ON
: ON (next boot)

Possible automated fix (needs to be verified manually on first use):

mysys00:root > ./349_bootconf -y
Executing HP-UX specific environment parameters...
NOTICE - The primary boot disk disk1972 is the same as the alternate boot disk disk1972
This may need to be remediated manually.

======= 01/14/16 14:43:22 PST BEGIN Displaying DRD Clone Image Information
(user=root) (jobid=aappch0)

* Clone Disk: /dev/disk/disk2236
* Clone EFI Partition: AUTO file present, Boot loader present
* Clone Rehost Status: SYSINFO.TXT not present
* Clone Creation Date: 01/14/16 14:00:36 PST
* Last Sync Date: None
* Clone Mirror Disk: None
* Mirror EFI Partition: None
* Original Disk: /dev/disk/disk1972
* Original EFI Partition: AUTO file present, Boot loader present
* Original Rehost Status: SYSINFO.TXT not present
* Booted Disk: Original Disk (/dev/disk/disk1972)
* Activated Disk: Original Disk (/dev/disk/disk1972)

======= 01/14/16 14:43:45 PST END Displaying DRD Clone Image Information
succeeded. (user=root) (jobid=aappch0)

Clone disk is disk2236
Alternate boot path set to 3/0/4/0/0/0/0/4/0/0/1.0x50060e80166f4273.0x4001000000000000 (/dev/rdisk/disk2236)
#### end report ./349_bootconf aappch0 ####
myserv0:root > ./349_bootconf
Executing HP-UX specific environment parameters...
Pass - The primary boot disk disk1972 is the different than the alternate boot disk disk2236
#### end report ./349_bootconf aappch0 ####
myserv0:root > setboot
Primary bootpath : 3/0/6/0/0/0/0/4/0/0/0.0x50060e80166f4213.0x4000000000000000 (/dev/rdisk/disk1972)
HA Alternate bootpath :
Alternate bootpath : 3/0/4/0/0/0/0/4/0/0/1.0x50060e80166f4273.0x4001000000000000 (/dev/rdisk/disk2236)

Autoboot is ON (enabled)
Hyperthreading : ON
: ON (next boot)


28 Oct 15 Keeping track of san disks

HP-UX does not make it easy to keep track of SAN presented disks. HBA switch ports are in short supply in many data centers. It is important for performance and reliability to be able to account for how many disks are presented to what HBA WWN ports.

This article outlines a generic method of doing so. It is better than raw fcmsutil output but is based only on tools provided with the OS (with one small exception).

To make sharing easier, I will provide links to scripts. It is up to you to perform due diligence. The scripts write no data and do not change your system. They are provided without warranty under US Law by ISN Corporation.

I recommend against cutting and pasting scripts from this web page; errors are introduced. They are written for the Korn shell and do not work with bash. They probably work in the POSIX shell but were not tested there. They are specific to HP-UX B.11.31 but can, if you wish, be adapted to older versions of the OS.

Links to scripts are at the bottom of the post which is quite long.

Script names accurately describe their functionality.
fcdisplaydev.ksh is a utility script designed to provide fcmsutil output to the other two scripts.

myserv1:root > cat pathcount_byhbaport.ksh
#!/bin/ksh
#
# Disk inventory by wwn
#
# 11.31 agile only
#
# Whole system version

## build an array to hold wwn info
#
ap=0
zerod=0
idevice=$1

ls /dev/fc*| while read -r dv
do
/opt/fcms/bin/fcmsutil $dv | awk '/N_Port Port World Wide Name/{ print $(NF) }' | while read -r wwpn
do
### echo " array count ${ap} .."
wwnarray[${ap}]=${wwpn}
wwncount[${ap}]=${zerod}
(( ap = ap + 1 ))
done
done
##echo ${#wwnarray[*]}
##echo ${#wwncount[*]}

calc_path()
{
### function to click counter of disks to port wwn
## set -x
## echo "calc_path $1 >>>"
fwwn=$1
fp=0
## NOTE: -le also visits index ap, one past the last slot filled above;
## -lt may be what is intended (untested)
while [ ${fp} -le ${ap} ]
do
wwnport=${wwnarray[$fp]}
wwnportc=${wwncount[$fp]}
if [ "${fwwn}" = "${wwnport}" ]
then
##echo "updating wwn count ${fwwn} .."
(( wwnportc = wwnportc + 1 ))
wwncount[${fp}]=${wwnportc}
fi
(( fp = fp + 1 ))
done
##set +x
}

if [ ! -z "$idevice" ]
then
dv="/dev/rdisk/${idevice}"
scsimgr -p lun_map -D ${dv} | awk -F: '{ print $3 }' | awk -F. '{ print $1 }' | while read -r hp
do
##/var/adm/bin/fcdisplaydev.ksh ${hp}
wwnfound=$(/var/adm/bin/fcdisplaydev.ksh ${hp});
calc_path ${wwnfound}
done
else
ioscan -NfnCdisk | grep rdisk | grep -v p | awk '{ print $(NF) }' | while read -r dv
do
### echo "checking hba path disk ... ${dv} "
### scsimgr -p lun_map -D ${dv}
### scsimgr -p lun_map -D ${dv} | awk -F. '{ print $2 }'
scsimgr -p lun_map -D ${dv} | awk -F: '{ print $3 }' | awk -F. '{ print $1 }' | while read -r hp
do
##/var/adm/bin/fcdisplaydev.ksh ${hp}
wwnfound=$(/var/adm/bin/fcdisplaydev.ksh ${hp});
calc_path ${wwnfound}
done

done
fi
fp=0
echo "==========================================="
echo "= World wide port name: count "
if [ ! -z "$idevice" ]
then
echo "= Individual device /dev/rdisk/${idevice} "
fi
echo "==========================================="
while [ ${fp} -lt ${ap} ]
do
dv1=${wwnarray[$fp]}
dv2=${wwncount[$fp]}
echo "| ${dv1} : ${dv2} |"

(( fp = fp + 1 ))
done
echo "==========================================="

myserv1:root > cat fcdisplaydev.ksh
#!/bin/ksh
hwp=$1
ls /dev/fc*| while read -r dv
do
foundfc=$(/opt/fcms/bin/fcmsutil $dv | awk '/Hardware Path is/{ print $(NF) }' | grep ${hwp} |wc -l);
if [ ${foundfc} -eq 1 ]
then
# echo "$dv to be checked."
/opt/fcms/bin/fcmsutil $dv |awk '/N_Port Port World Wide Name/{ print $(NF)}'
fi
done

cat pathforalldisks.ksh
#!/bin/ksh

xpinfo -i > /tmp/xpinfo.txt

ioscan -NfnCdisk|grep rdisk | awk '{ print $NF }' | grep -v _p | awk -F\/ '{ print $NF}' | while read -r dsk
do
/var/adm/bin/pathcount_byhbaport.ksh $dsk
ldev=$(grep "${dsk} " /tmp/xpinfo.txt | awk '{ print $6 }' );
echo "LDEV: ${ldev}"

done

Script output. Modified to protect the security of the test systems.

fcmsutil

fcmsutil /dev/fcd0

Vendor ID is = 0x1077
Device ID is = 0x2422
PCI Sub-system Vendor ID is = 0x103C
PCI Sub-system ID is = 0x12DF
PCI Mode = PCI-X 133 MHz
ISP Code version = 5.6.5
ISP Chip version = 3
Topology = PTTOPT_FABRIC
Link Speed = 4Gb
Local N_Port_id is = 0x018100
Previous N_Port_id is = None
N_Port Node World Wide Name = 0x500143800117ef3d
N_Port Port World Wide Name = 0x500143800117ef3c
Switch Port World Wide Name = 0x20810027f8a27cd4
Switch Node World Wide Name = 0x10000027f8a27cd4
N_Port Symbolic Port Name = myserv0_fcd0
N_Port Symbolic Node Name = myserv0_HP-UX_B.11.31
Driver state = ONLINE
Hardware Path is = 0/2/1/0/4/0
Maximum Frame Size = 2048
Driver-Firmware Dump Available = NO
Driver-Firmware Dump Timestamp = N/A
TYPE = PFC
NPIV Supported = YES
Driver Version = @(#) fcd B.11.31.1403 Dec 4 2013

There is a slight error (the bad number message below). If you have a fix, please share.
./pathcount_byhbaport.ksh
calc_path[14]: wwnportc = wwnportc + 1 : bad number
===========================================
= World wide port name: count
===========================================
| 0x500143800117ef3c : 24 |
| 0x500143800117ef3e : 0 |
| 0x500143800117ef40 : 24 |
| 0x500143800117ef42 : 0 |
===========================================
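
One possible explanation for the bad number message, offered as an untested guess: the loop in calc_path runs while fp is less than or equal to ap, so it also visits index ap, which was never populated. If fcdisplaydev.ksh returns nothing for a hardware path, the empty WWN would match that empty slot and the arithmetic would be attempted on an empty counter. The sketch below sidesteps the arrays altogether and tallies WWNs with awk. It is a swapped-in technique, not the method used by the scripts above, and the function name count_by_wwn is hypothetical.

```shell
#!/bin/sh
# count_by_wwn   (hypothetical helper)
# Reads one port WWN per line on stdin (one line per disk path),
# ignores empty lines, and prints "wwn : count" sorted by WWN.
count_by_wwn() {
    awk 'NF { n[$1]++ }
         END { for (w in n) print w " : " n[w] }' | sort
}
```

Unlike the array version, this only lists ports that carry at least one path; ports with a count of zero would have to be added from the fcmsutil list if you want them in the report.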

LDEV info is specific to the Hitachi VSP xpinfo utility. You will have to adapt that code to other storage providers.

./pathforalldisks.ksh
calc_path[14]: wwnportc = wwnportc + 1 : bad number
===========================================
= World wide port name: count
= Individual device /dev/rdisk/disk172
===========================================
| 0x500143800117ef3c : 1 |
| 0x500143800117ef3e : 0 |
| 0x500143800117ef40 : 1 |
| 0x500143800117ef42 : 0 |
===========================================
LDEV: 03:f3

Link to http://www.hpux.ws/scripts/pathcount_byhbaport.ksh

Link to http://www.hpux.ws/scripts/pathforalldisks.ksh

Link to fcdisplaydev.ksh

All scripts are provided with no warranty. Use them at your own risk.


03 Mar 15 Making sure MWA is running properly

What follows is a health check script that checks the installation status of HP Operations Agent and the run status of the two mwa daemons that measure performance.

When run with the -y parameter the script will attempt to correct installed status of HP Operations Agent.

If you want the script, please email me via the site’s response form. Cutting and pasting from this site can be done, but may be a very frustrating endeavor.

I have added commentary to the script, which may introduce run errors if screen scraped.

myserva:root > cat 247_mwarun
#!/bin/ksh
############################################################################
# make sure scopeux is running; if not, run it; if not installed, install it

# Load common environment
. /var/adm/bin/.scriptenv
echo ". Checking for mwa software installed and running on ${hn}."

is=myserva
if [ "${hn}" = "myserva" ]; then is="myservb";fi

ps -ef >/tmp/plist.txt

srun=$(awk '/scopeux/{print $NF}' /tmp/plist.txt | wc -l);
mrun=$(awk '/midaemon/{print $NF}' /tmp/plist.txt | wc -l);
swlist -l bundle TC097EA > /tmp/swlist.txt
mwainst=$(awk '/TC097EA/{ print $NF}' /tmp/swlist.txt| wc -l);

#echo "scopeux procs running: $srun mwa installed: $mwainst"
if [ "$1" = "-y" ];then
CHANGES=1
fi

if [ ${srun} -eq 0 ] || [ ${mrun} -eq 0 ] ;then
if (($CHANGES));then
if [ ${mwainst} -ne 1 ]
then
### depot server location is in variable ${is}. This is an ignite depot server.
swinstall -x mount_all_filesystems=false -s ${is}:/Depots/B.11.31/2014midyear_depot TC097EA
rc=$?
echo "mwa TC097EA install succeeded checking sd on ${hn}..."
swlist -l bundle TC097EA > /tmp/swlist.txt
mwainst=$(awk '/TC097EA/{ print $NF}' /tmp/swlist.txt| wc -l);
if [ ${mwainst} -eq 1 ];then echo " pass - mwa NOW installed." ;fi
optstat=$(/var/adm/bin/bdfmegs "/opt " |awk '!/File-System/{print $5}');
echo "${hn} /opt is ${optstat} full remediate if above 85% ..."
else
mwa start all
fi
else
echo " NOTICE - mwa not installed or scopeux/midaemon is not running on ${hn} .(-y will fix)."
fi
else
echo " pass - mwa installed. scopeux/midaemon is running on ${hn}."
fi
optstat=$(/var/adm/bin/bdfmegs "/opt " |awk '!/File-System/{print $5}');
echo "${hn} /opt is ${optstat} full remediate if above 85% ..."
rm -f /tmp/plist.txt
rm -f /tmp/swlist.txt
echo "#### end report $0 ${sn} ####"

Script depends on Bill Hassell’s bdfmegs script. bdf can be made to work.
Typical output is:

myserv0:root > ./247_mwarun
Executing HP-UX specific environment parameters…
. Checking for mwa software installed and running on myserv0.
pass - mwa installed. scopeux/midaemon is running on myserv0.
myserv0 /opt is 68% full remediate if above 85% …
#### end report ./247_mwarun myserv0 ####
myserv0:root > mwa stop all

Shutting down Perf Agent collection software
Shutting down scopeux, pid(s) 28345
The Perf Agent collector, scopeux has been shut down successfully.
NOTE: The ARM registration daemon ttd will be left running.

OVOA is running. Not shutting down coda
myserv0:root > ./247_mwarun
Executing HP-UX specific environment parameters…
. Checking for mwa software installed and running on myserv0.
NOTICE - mwa not installed or scopeux/midaemon is not running on myserv0 .(-y will fix).
myserv0 /opt is 68% full remediate if above 85% …
#### end report ./247_mwarun myserv0 ####
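
The branching in 247_mwarun is easier to reason about when separated from the commands it triggers. Here is a minimal sketch of just the decision, assuming the same four inputs the script computes; the function name mwa_action and its argument order are hypothetical.

```shell
#!/bin/sh
# mwa_action SRUN MRUN INSTALLED FIX   (hypothetical helper)
#   SRUN, MRUN  number of scopeux / midaemon processes found
#   INSTALLED   1 when the TC097EA bundle is listed, else 0
#   FIX         1 when -y was given, else 0
# Prints the action to take: pass, notice, install or start.
mwa_action() {
    if [ "$1" -gt 0 ] && [ "$2" -gt 0 ]; then
        echo pass          # both daemons are running
    elif [ "$4" -ne 1 ]; then
        echo notice        # problem found, but -y was not given
    elif [ "$3" -ne 1 ]; then
        echo install       # -y given and the bundle is missing
    else
        echo start         # -y given, bundle present, daemons down
    fi
}
```

Keeping the decision in one place like this makes it possible to check every combination on a test box without installing or starting anything.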


03 Mar 15 scopeux and midaemon don’t want to run

midaemon and scopeux combine to collect performance data on HP-UX.

They both need to be running to properly collect data.

These are part of a depot called measureware which is part of the base OS.

To see if it is installed:
swlist -l bundle TC097EA
myserv0:root > swlist -l bundle TC097EA
# Initializing...
# Contacting target "myserv0"...
#
# Target: myserv0:/
#

TC097EA 11.20.000 HP Operations Agent

If not installed, HP Operations Agent can be downloaded from HP if you have a software contract with HP.

It is also delivered as part of openview, which is a separately licensed product.

I recently implemented performance data collection on a fleet of 100+ servers where I work.

On three of the servers, the daemons refused to run normally.

The following error was recorded in the file /var/opt/perf/status.mi:
Unable to find newly enabled CPU.
Please use -prealloc to allocate bufsets for all CPUs.

Here are the steps to implement the fix:
mwa stop all
/opt/perf/bin/ovpa stop
/opt/perf/bin/pctl stop
perfstat

Gently kill any processes that the perfstat output still shows as running.

Edit the file /etc/rc.config.d/ovpa
MIPARMS="-prealloc=2 -pids 10000 -kths 10000 -smdvss 512M"
export MIPARMS

Here 2 is the number of physical CPUs in the box.
If present, the file /var/opt/perf/datafiles/RUN should be deleted.

mwa start all
perfstat

Check back after one hour, and again after one day, that midaemon and scopeux are still running.
Check /var/opt/perf/datafiles for updated log files.
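
One way to check for updated log files is to compare them against a marker file created at restart time. This is just a sketch; the function name updated_since and the marker path in the usage note are my own assumptions.

```shell
#!/bin/sh
# updated_since REF DIR   (hypothetical helper)
# Lists regular files under DIR that were modified more recently
# than the reference file REF.
updated_since() {
    find "$2" -type f -newer "$1" -print
}
```

For example, run touch /tmp/mwa.ref (a made-up path) right after mwa start all, then later run updated_since /tmp/mwa.ref /var/opt/perf/datafiles. An empty result after an hour suggests the collectors are not writing.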

