msgbartop
Tips and Tricks site for advanced HP-UX Engineers
msgbarbottom

04 Oct 18 Glance/midaemons wont start

Troublehooting steps:

1- Remove the /var/opt/perf/ttd.pid and try to start glance again

 

#rm /var/opt/perf/ttd.pid

#glance

 

2- If the above fails to fix it then stop and restart Glance as follows

 

# mwa stop

# midaemon -smdvss 4M -kths 1000 -pids 5000 -p # ps -ef | grep midaemon Make sure the midaemon is running, # mwa start

 

Modify MWA_START_COMMAND variable in /etc/rc.config.d/ovpa as follows to keep the changes across system reboot.

 

# grep MWA_START_COMMAND /etc/rc.config.d/ovpa MWA_START_COMMAND=”/opt/perf/bin/midaemon -smdvss 4M -kths 1000 -pids 5000 -p ; /opt/perf/bin/mwa start”

 

Tags: , , , ,

11 Apr 11 swlist check the state of patches

swlist -l fileset -a state | grep -v config | sed ‘/^#/d’

 

Output looks like this:
PHCO_36551.CORE2-64SLIB               transient
PHCO_36551.CORE2-SHLIBS               transient

Look for stuff that is in state installed instead of configured.

swconfig \* or swconfig PHCO_36551 may fix the issue.

Tags: , ,

09 Jun 10 System I/O layout (EMC specific)

This is a script designed to document the layout of a system. It is EMC specific and requires the output of the inq command, which I store in /tmp/inq.txt

It can however be adapted to non EMC disk systems. It correctly shows the port/lba layout and status of network and fiber I/O the way it appears on the outside of the system. Two tables are included to permit lookups of the port/lba information. It has worked on blades, and is certified for rp8420 and superdome systems.

It has only been tested on HP-UX 11.31.

… please stand by….update in progress…

DF=”super.translate.dat”
MA=”router.macadd.dat”

typeset MYDIR=/var/tmp/syslayout
typeset MYPAGE=mypage
typeset MYDATA=mydata
typeset IDX_HTML=syslayout.html

writehtml (){
while [ $# -gt 0 ]
do
echo “<td>${1}</td>” >> ${IDX_HTML}

shift
done
echo “<tr>” >> ${IDX_HTML}
}

cat -<<!EOF > ${IDX_HTML}
<!DOCTYPE HTML PUBLIC “-//W3C//DTD HTML 4.0 Transitional//EN”>
<title>Dana IT Unix System documentation</title>
<BODY>
<TABLE style=”WIDTH: 100%; COLOR: rgb(0,0,0); TEXT-ALIGN: left” cellSpacing=2
cellPadding=2 border=0>
<TBODY>
<TR>
<td width=”200″><img
style=”border-width: 0px; margin: 0px; padding: 0px;” alt=”Dana”
src=”dana_logo.jpg”> </td>
<td style=”font-weight: bold;”><big><big>Dana
IT
Unix:
Documentation</big></big></td>

<TD style=”VERTICAL-ALIGN: top; TEXT-ALIGN: center” colSpan=7><BIG
style=”FONT-FAMILY: helvetica,arial,sans-serif”><BIG>Dana IT system I/O Layout.</BIG></BIG><BR></TD></TR>
!EOF

# colum layout # Path       slot MAC Address    lan  ipaddress      vlan   linkstatus

sysname=$(uname -n)
this_cell=$(vparstatus -p $sysname -v |awk ‘/Boot processor/ {print $4}’ |awk -F’\.’ ‘{print $1}’)

# echo $this_cell

this_par=$(parstatus -c 0 |awk ‘/cell’$this_cell’/ {print $9}’)

nparname=$(/usr/sbin/parstatus -P |awk “/^$this_par/ {if(pname == 1) {print}};/Partition Name/ {pname=1}”|awk ‘/’$this_par’ / {print $6}’)

#echo “Diag nparname: ${nparname}”

complexname=$(/usr/sbin/parstatus -X |awk “/Complex Name/”)
cellind=”cell${this_par}”

nparinfo=$(/usr/sbin/parstatus -P |awk “/^$this_par/ {if(pname == 1) {print}};/Partition Name/ {pname=1}”)
# model needs to be determineed
OS=$(uname -r)
if [ “$OS” = “B.11.31” ]
then
mod=$(model | awk ‘{ print $5 }’)
else
mod=$(model | awk -F/ ‘{ print $3}’)
fi

hn=$(hostname)

this_cell=$(vparstatus -p ${hn} -v |awk ‘/Boot processor/ {print $4}’ |awk -F’\.’ ‘{print $1}’)

echo $this_cell

this_par=$(parstatus -c 0 |awk ‘/cell’$this_cell’/ {print $9}’)

/usr/sbin/parstatus -P |awk “/cell/ {if(pname == 1) {print}};/Partition Name/ {pname=1}”|awk ‘/’$this_par’ / {print $6}’

hn=$(hostname)
lhn=”${hn}.dana.com”
echo “Host name: ${lhn}”
writehtml “Host name:” ${lhn}
echo “Model number is: $mod”
writehtml “Model number:” ${mod}
#echo “<td>$complexname</td><td>$nparname</td><tr>” >>  ${IDX_HTML}
writehtml “${complexname}” ${nparname}
echo “Model number is: $mod”
pbootpath=$(parstatus -p 0 -V |awk -F: ‘/Primary Boot Path/ {print $2}’)

echo “Primary boot path: ${pbootpath}”
writehtml “Primary boot path:” ${pbootpath}
#echo “<td>$complexname</td><td>$nparnam</td><tr>” >>  ${IDX_HTML}
# echo “<td>Path</td><td>slot</td><td>MAC Address</td><td>lan</td><td>IP Address</td><td>vlan</td><td>Link Status</td><tr>” >>  ${IDX_HTML}
writehtml Path slot MAC_Address lan IP_Address vlan Link_Status

#echo “$nparinfo”
# echo “Path        slot MAC         lan     check    ip”
# 2/0/5/1/0/6/1 4 0x002264E4948B lan1 10.8.128.162
echo  “Path       slot MAC Address    lan  ipaddress      vlan   linkstatus”

/usr/sbin/ioscan -fnk | awk ‘/^lan/ {print $3}’ |while read -r path
do
ip=””;
echo $path  | sed ‘s/\// /g’ | read p1 p2 p3 p4 p5 p6 p7
macaddy=$(lanscan | awk ‘{if($1 == “‘${path}'”) print $2}’)
lanid=$(lanscan | awk ‘{if($1 == “‘${path}'”) print $5}’)
plan=$(lanscan | awk ‘{if($1 == “‘${path}'”) print $3}’)
lchk=$(/usr/sbin/linkloop -i $plan $macaddy 2>/dev/null | grep “OK”)
# If linkloop produces postive results then see if there is an ip address
ip=”IP not set”
if [ -n “$lchk” ]
then
# echo “lchk not null. running ifconfig command”
ip=$(/usr/sbin/ifconfig $lanid | grep netmask | awk ‘{print $2}’)
fi
# roll through the router table and see if you can establish
# linkloop with the gateway
DRMAC=”No link..”
DVLAN=”Not found”
#while [[ “$value” != “val1” || “$value” != “val2” || “$value” != “val3” ]]
while read -r DL
do
rmacaddy=$(echo $DL | awk -F: ‘{print $2}’)
rvlan=$(echo $DL | awk -F: ‘{print $3}’)
rlchk=$(/usr/sbin/linkloop -i $plan $rmacaddy 2>/dev/null | grep “OK”)
if [ -n “$rlchk” ]
then
# echo “rlchk not null. setting vlan information.”
DRMAC=${rmacaddy}
DVLAN=${rvlan}
break;
fi
done < $MA

#  echo “${path} ${p7} ${macaddy} ${lanid} ${lchk} ${ip} ”
#p1=$(echo $path | awk -F/ ‘{print $1}’);
#p2=$(echo $path | awk -F/ ‘{print $2}’);
#p3=$(echo $path | awk -F/ ‘{print $3}’);
#p4=$(echo $path | awk -F/ ‘{print $4}’);
#p5=$(echo $path | awk -F/ ‘{print $5}’);
#p6=$(echo $path | awk -F/ ‘{print $6}’);
#p7=$(echo $path | awk -F/ ‘{print $7}’);

portpath=$(echo $path | awk -F/ ‘{print $3}’)
actualport=$(awk -F: ‘{if($2 == “‘${portpath}'” && $3 == “‘$mod'”) print $1}’ ${DF})

# echo “Actual path: ${p1} ${p2} ${p3} ${p4} ${p5} ${p6} ${p7}  ${actualport} ${ip} ${macaddy} ${lanid} ${ip}”
echo “${path} ${actualport} ${macaddy} ${lanid} ${ip}   ${DVLAN}   ${DRMAC}”
# echo “<td>${path}</td><td>${actualport}</td><td>${macaddy}</td><td>${lanid}</td><td>${ip}</td><td>${DVLAN}</td><td>${DRMAC}</td><tr>” >> ${IDX_HTML}
writehtml ${path} ${actualport} ${macaddy} ${lanid} ${ip} ${DVLAN} ${DRMAC}
done

echo “Fiber Channel….”
# echo “<td>Fiber Channel….</td><tr>” >> ${IDX_HTML}
writehtml  “Fiber Channel”
echo “PATH       slot Device… Status spd Hardware address”
# echo “<td>PATH</td><td>slot</td><td>Device</td><td>Status</td><td>speed</td><td>Hardware address</td><tr>” >>  ${IDX_HTML}
writehtml PATH slot Device Status speed Hardware address
#/usr/sbin/ioscan -fnCfc | grep fcd | awk ‘{print $3}’ |while read -r path
/usr/sbin/ioscan -fnk | awk ‘/^fc / {hw=$3;getline;print hw,$1}’ |while read -r hw devfile
do
#   echo “diag ${hw} dev file … ${devfile}”
port=$(echo $hw | awk -F/ ‘{print $3}’)
OSTAT=$(fcmsutil $devfile | awk ‘/ONLINE/  {print $4}’)
LSPD=$(fcmsutil $devfile | awk ‘/Link Speed/  {print $4}’)
WWN=$(fcmsutil $devfile | awk ‘/N_Port Port World Wide Name/  {print $7}’)
#  OSTAT=$(fcmsutil /dev/fcd1 | awk ‘/ONLINE/  {print $4}’)
#  LSPD=$(fcmsutil /dev/fcd1 | awk ‘/Link Speed/  {print $4}’)
#  WWN=$(fcmsutil /dev/fcd1 | awk ‘/N_Port Port World Wide Name/  {print $7}’)
actualport=$(awk -F: ‘{if($2 == “‘${port}'” && $3 == “‘$mod'”) print $1}’ ${DF})
echo “$hw ${actualport} $devfile ${OSTAT} ${LSPD} ${WWN}”
#echo “<td>$hw</td><td>${actualport}</td> <td>$devfile</td><td>${OSTAT}</td><td>${LSPD}</td> <td>${WWN}</td><tr>” >>  ${IDX_HTML}
writehtml ${hw} ${actualport} ${devfile} ${OSTAT} ${LSPD} ${WWN}
done

#awk -F: ‘{printf(“%8s %5s %4s\n”,$1,$3,$4)}’ steve
#2/0/5/1/0/6/0
#2/0/5/1/0/6/1

cat -<< !EOF >> ${IDX_HTML}
<TR></TR></TBODY></TABLE></BODY></HTML>
!EOF

chmod a+r ${IDX_HTML}
# Added to copy the data file to my home directory for diagnosis.
cp syslayout.html /home/sprotte
chmod a+r /home/sprotte/syslayout.html

super.translate.dat
—–
root@gitop2:/root # more /usr/global/etc/super.translate.dat
11:8:SD64B
10:9:SD64B
9:10:SD64B
8:12:SD64B
7:13:SD64B
6:14:SD64B
5:6:SD64B
4:5:SD64B
3:4:SD64B
2:2:SD64B
1:1:SD64B
0:0:SD64B
1:8:rp8420
2:10:rp8420
3:12:rp8420
4:14:rp8420
5:6:rp8420
6:4:rp8420
7:2:rp8420
8:1:rp8420

more /usr/global/etc/router.macadd.dat
Format:

IP Address : MAC Address : vlan# : Description
:0x00000c07ac0a:vlan9:HP-UX Production

Obviously this data varies from place the place. The Mac address of the router is used to check network connectivity with the linkloop command

Tags: , , , , ,

18 May 10 swlist command to provide install date

New trick learned from HP support backline engineer.

swlist -l fileset -a revision -a title -a state -a install_date

———Sample output ——
# vmGuestLib B.04.00 Integrity VM vmGuestLib 200903081306.51
vmGuestLib.GUEST-LIB B.04.00 Integrity VM GUEST-LIB fileset 200903081306.51 configured
# vmProvider B.04.00 WBEM Provider for Integrity VM vmProvider 200903081306.59
vmProvider.VM-PROV-CORE B.04.00 WBEM Provider for Integrity VM VM-PROV-CORE 200903081306.59 configured

Tags: , , , , , , , ,

30 Sep 09 SD-UX Locked. Diagnostic steps.

Problem: After being Ignited superman lost most sd-ux functionality.

Note: superman (not its real name) is a vpar running on a superdome complex.  Only swlist works, swreg -l depot, swinstall -i, swverify all fail with the same error.

 

 

ERROR:   “spuerman/”:  You do not have the required permissions to
         select this target.  Check permissions using the “swacl”
         command or see your system administrator for assistance.  Or,
         to manage applications designed and packaged for nonprivileged
         mode, see the “run_as_superuser” option in the “sd” man page.
       * Target connection failed for “zrtph0v0:/”.
ERROR:   More information may be found in the daemon logfile on this
         target (default location is
         superman:/var/adm/sw/swagentd.log).
       * Selection had errors.

Standard techniques say check:

/sbin/init.d/swagentd stop

/sbin/init.d/swagentd start

Check /etc/hosts networking is consistent.

Make sure /etc/nsswitch.conf is present and makes sense.

Check permissions on /var/tmp and all the swagent files.

None of this worked.

swlist -i -s $PWD in a depot generated the following error taken from ITRC because the system is already fixed.:

swacl -l host @ superman

 

 

List swacl generates this:

Util_Random internal error:  Read of /dev/urandom failed, rv=-1, size=8, No such device (19).

There were a series of other errors all pointing to /dev/urandom

lsdev showed that /urandom did not load the kernel module rng (Randome Number Generator).
Detail    root      /usr/sam/tui/kc/modulemod.sh rng
Detail    root      /usr/sbin/kcmodule -a -P ALL

This is normal output. Before the system was fixed the system did not show the module running.

lsdev | grep rng

138          -1         rng             pseudo

Fix was to unload the rng module in the kernel (using sam SEP cheats)
Then we loaded it. In spite of being listed as dynamic a reboot was required to restore sd-ux functionality.

Actual source of the problem: Ignite image of supergirl did not exclude the /dev/ “files” This cause the wrong kernel module to be loaded with the /dev/urandom “file” driver. Normally this is not a problem becuase /dev is crecreated but for some reason /dev/udandom was not loading the kernel module rng

Ignite excludes have been updated to exclude these files and the system will be re-ignited to make sure nothing else bad happens.

Tags: , , , , , , ,

09 Sep 09 Case Study: Capacity & Migration planning for a small organization

This is our first case study. The events leading up to it occur between 1998 and 2002. It is a real life case study based on my experience. For legal reasons, I can not identify the organization. It is a charity that raises now around $100 million, 92% of funds raised go to actual charitable work. 8% is overhead. IT infrastructure is overhead, even though it is critical to actually raising funds.

From 1991-2005 I worked at this charity in IT, first as a programmer analyst, then as a dba, finally becoming the backup Unix Admin in 1998 and the full time Unix Admin in 2000. The organization ran its legacy fund raising systems on a pair of D class HP-UX systems. The back end database was Software AG adabas. The user fund raising community wanted to have an sql like ability to look into the database and run queries. they wanted flexible use of strategic data. An attempt was made in early 1997 to install a sql front end, but it did not provide acceptable results.

An internal study was done and it was decided in late 1997 to migrate legacy systems to a web based front end, with Oracle as the back end database, Oracle Application Server using forms and reports to build applications. Initially no plan was made to migrate to stronger hardware, due to the assurance from Oracle that their software would run on the existing infrastructure.

By 2000 it was obvious that this was not true. Though the database server itself ran acceptably, there was not sufficient memory or disk capacity to run the application server. So I was asked to prepare a plan to migrate legacy systems. Here were the guidelines:

  • To run three environments, to be described below, each with a database server, an application server and forms and reports development tools on them.
  • Sandbox was to be used to test OS patches, Oracle patches, and tools upgrades. It was to belong to the systems administrator who was permitted to restart this system on short or no notice.
  • The development environment was to be where the developers were to develop code. It needed to be stable and available 100% during normal development hours 8 a.m. to 6 p.m. Any changes made to his system were first to be vetted on the sandbox system.
  • The production system had the same uptime requirements accept that all changes needed to be vetted first on the other two systems.
  • The hardware was to be the same model for all the systems. This was defined to avoid hardware surprises. Only the production system needed to be at full capacity. the other systems were to be the same to permit realistic load testing.
  • Databases would be hosted on SAN disk with an HBA fiber channel connection. Systems were to boot locally.

Overall, I thought this was a solid foundation. Some of the points were made by management, some were suggested by me.

The following basic technical requirements were developed:

  • Overall database needed to be approximately 5 GB for server. Actual use hit 15 GB by 2005. This growth factor was planned.
  • Oracle Server, one instance had to run on each server.
  • Oracle Application server one instance had to run on each server.
  • Legacy applications Natural/Software AG Adabas needed to run on each server.
  • Server configuration needs to be manged and tracked responsibly.
  • HP-UX bi-annual updates needed to be installed in a timely basis after quality assurance.
  • The replacement cycle on hardware would be 3-5 years to maintain cost savings provided by being under warranty (First three years)

Deployment Diagram

Server Deployment

Other Relevant facts on the decision making process.

  • HP Hardware and Software agreements were running over $30,000 per year on existing infrastructure.
  • Much of the cost was hardware support due to the age and near obsolescence of the hardware.
  • Significant savings could be obtained by using current hardware that was under warranty.
  • Systems would be configured and used to provide a disaster recovery solution.

Three vendors were picked to provide proposals. All ended up recommending HP-9000 L2000(later renamed rp5450) servers. Here are the highlights:

  • rp5450 systems with 2 GB system memory.
  • 146 GB dual disks to server as boot disks with software mirroring.
  • 2 CPU would be installed per server.
  • Memory capacity and purchase was planned to enable an upgrade to 8 GB without replacing exiting memory.
  • Two HBA Fiber channel cards provided per machine to provide redundancy and fail over.
  • A capital budget request was made showing that support cost savings would over the course of 4 years, completely recover the cost of the systems.
  • Systems would each have a Ultirum tape drive, for locally provided backups and Ignite-UX make_tape_recovery backups as part of the DR plan.
  • Systems had two Gigabit Network Interface cards.
  • Systems would have a private network for use in Ignite backup, recovery and system replication.
  • Systems were to be delivered with HP-UX 11.11
  • HP provided RAC and UPS and PDU were specified.

How it went:

  • Systems were delivered in May of 2002.
  • Initial OS install began immediately. Systems were initially delivered with HP-UX 11.00. We delayed start of installation until correct media was provided.
  • All three systems were installed with a base OS to insure that hardware was working.
  • OS patch requirements for Oracle, security and bi-annual updates were installed on the sand box. It was decided that Ignite Golden Image would be used to replicate the sand box configuration, once a stable configuration was found.
  • Significant problems were encountered with the Oracle and Oracle Application Server installations. The version was changed twice. Several major Oracle patch sets had to be installed to deal with “show stopper” bugs that were encountered.
  • After the September 11 attacks in New York City in 2001, a security review was conducted and the deployment plan was modified to include improved security. Several rounds of patching and tools testing occurred on the OS level.
  • In December of 2002, the application development team notified us that they were satisfied with the sandbox and asked that an Ignite image be made and transferred to the development system.
  • In January-February of 2003 Imaging was done and the system was replicated. There were OS problems with the Ignite replication that took several weeks to work out.
  • Several changes were requested by the development staff. They were tested on the sand box and then deployed on the development system.
  • An Ignite central server was built on the sand box to handle images which were shared on NFS and available for use after booting of the sandbox Ignite configuration.
  • In June of 2003 after several change cycles the configuration was approved for deployment.
  • Ignite replication was completed on the production environment using the sandbox, which had been frozen for this purpose as the image template.
  • In August of 2003 all legacy systems were cut over to the rp5450 systems. HR would be migrated 18 months later due to Integration issues.
  • In the early of 2004 due to performance and memory use issues all systems were upgraded to 8 GB of system RAM.
  • For the year 2004 there was no downtime in production systems during normal business hours.
  • Weekly Ignite tape backups were taken on all systems and network based backup to shared NFS was used as a secondary DR method.
  • In February of 2004 a DR test was run at the HP Performance center and we successfully migrated a sandbox image to an rp5470 server in the HP infrastructure. Legacy systems were tested and approved as functional.

Note: This document was designed entirely using the wordpress interface and a Linux system. The diagram was created with a free Linux alternative to visio called dia. The tool is in evaluation, and might be replaced. Still a pretty good start. Cost to produce this environment in licensing fees?: Zero dollars.

Tags: , , , , , , , ,

04 Sep 09 Combining patch sets and deploying as a single file (tape format)

I’ve had to deliver some absolutely enormous patch sets in my day and it can be a real pain to have to tar them all up and untar them.

Another real problem in high availability environments is the urgent need to limit the number of boots you schedule.

The way around it is to create a larger single file tape format patch. The key command is swpackage.

I generally like to set up a major system update bi-annually.

In this example, I keep a file system called /patch which has plenty of room.

cd /patch

mkdir June-2009

cd June-2009

I then proceed to gather new software.

http://software.hp.com

Patch database on http://itrc.hp.com

I download the software depots directly to /patch/June-2009/

If security restrictions permit, I use a HP-UX based browser to get the software.hp.com stuff. It just avoids errors in the file transfer process. filezilla works well, if you must use a PC with windows to gather patches.

In the patch database, I use the scripted ftp option. I normally combine the bi-annual QPK/gold pack with any patches required by security_patch_check or SWA.

The scripted ftp option is great, because it restarts, and comes with a build in script to build a depot at the end of the download.

First I use the create_depot script, usually using the -d option

create_depot_<unique name> -d June-2009-bi-annual.depot

Now I have everything in the same directory with the extension depot.

First i build a depot. I have a script named depmake2

for i in *.depot
do

swcopy -x enforce_dependencies=FALSE \
-x mount_all_filesystems=FALSE \
-x reinstall=TRUE \
-x write_remote_files=TRUE \
-x layout_version=1.0 \
-s ${PWD}/$i \* @ ${1}
done

The $1 parameter is once gain the name of the new, combined depot I wish to create

depmake2 June-2009-bi-annual.depot

This will pick up anything I’ve downloaded from software.hp.com

swpackage -s /patch/June-2009/June-2009-bi-annual.depot -d June-2009.bid.tape.depot -x target_type=tape -x media_capacity=8000

The maximum media capacity is 8 GB.

What I am left with is a single tape depot that I can install on any system or even copy into an Ignite patch server with a single command.

swinstall -x autoreboot=true -x reinstall=false -s /path/June-2009.bid.tape.depot \*

One deployment, one reboot.

Before you install, don’t forget to Ignite your system. Always have a backup plan.

#!/usr/bin/sh
/opt/ignite/bin/make_tape_recovery -Av -x inc_entire=vg00 -x exclude=/home -d “description”
# /usr/contrib/bin/eject.tape

Or backup to a central Ignite NFS share.

/opt/ignite/bin/make_net_recovery -s tzfat -x inc_entire=vg00 -a jufdev:/scratch/ignite/archives -C

Ignite is a separate article

Tags: , ,

03 Sep 09 HP-UX Patch Designations

Ever wonder what those letters in HP-UX patches stand for.

PHCO – General Command and libraries patches
PHKL – Kernel patches.
PHNE – Networking Patches
PHSS – Sub System patches (Anything else)

Any of these patches can force you to do a reboot. Kernel patches almost always involve a reboot

Tags: , ,

WhatsApp chat