Tips and Tricks site for advanced HP-UX Engineers

11 Feb 10 EMC based system I/O layout tool

This tool is called syslayout.sh.

It works on Superdome and rp8420 systems. It requires the EMC utility inq to be installed on the system; without it the tool will not work.

You can provide new translation tables for the I/O layout to get it working on other platforms. It might work on the host for a blade system. It will not work on an HPVM guest under any circumstances.

Translation tables:

super.translate.dat

--data--

sprotte@mngp01:/home/sprotte $ cat super.translate.dat
11:8:SD64B
10:9:SD64B
9:10:SD64B
8:12:SD64B
7:13:SD64B
6:14:SD64B
5:6:SD64B
4:5:SD64B
3:4:SD64B
2:2:SD64B
1:1:SD64B
0:0:SD64B
1:8:rp8420
2:10:rp8420
3:12:rp8420
4:14:rp8420
5:6:rp8420
6:4:rp8420
7:2:rp8420
8:1:rp8420

--end data--
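The script reads each row of this table as slot:path component:model, where the path component is the third element of the ioscan hardware path. A minimal sketch of that lookup, assuming a few sample rows copied from the table; the temporary file name and the function name lookup_slot are made up for illustration and are not part of syslayout.sh:

```shell
# Sketch: look up the physical slot for one component of a hardware path.
# Row format as the script uses it: slot:path_component:model

# Sample rows copied from the table above; the temp file name is an assumption.
table="${TMPDIR:-/tmp}/super.translate.$$"
trap 'rm -f "$table"' EXIT
cat > "$table" <<'EOF'
11:8:SD64B
5:6:SD64B
1:8:rp8420
5:6:rp8420
EOF

# lookup_slot COMPONENT MODEL -> prints the slot number, or nothing if no row matches
lookup_slot() {
    awk -F: -v comp="$1" -v model="$2" \
        '$2 == comp && $3 == model { print $1 }' "$table"
}

# Third component of an ioscan-style hardware path such as 2/0/8/1/0/6/1
path="2/0/8/1/0/6/1"
comp=$(echo "$path" | awk -F/ '{ print $3 }')

echo "SD64B slot:  $(lookup_slot "$comp" SD64B)"
echo "rp8420 slot: $(lookup_slot "$comp" rp8420)"
```

Passing the values with awk -v avoids the nested quoting the original lookup needs, at the cost of one more option on the command line.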

router.macadd.dat

This file is specific to your VLAN and router configuration. The script runs linkloop against each router MAC address to confirm network connectivity. This portion of the final script can be commented out. The data below has been altered due to corporate security concerns.

--data--

sprotte@mngp01:/home/sprotte $ cat router.macadd.dat
192.168.128.1:0x00000c07ac0a:vlan4:HP-UX Production
192.165.138.1:0x00000c07ac8a:vlan118: Peoplesoft
192.170.12.1:0X001ae24a1d00:vlan30: Replica Network

--end data--
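Each row appears to be gateway IP:router MAC:vlan:description. The script itself only uses the MAC and vlan fields, feeding the MAC to linkloop. A small sketch of reading those fields, assuming two of the sample rows above; the function vlan_for_mac and the temporary file name are hypothetical and only show the field split, not the linkloop test:

```shell
# Sketch: split router.macadd.dat rows into fields.
# Row format (inferred from the sample): gateway_ip:router_mac:vlan:description
routers="${TMPDIR:-/tmp}/router.macadd.$$"
trap 'rm -f "$routers"' EXIT
cat > "$routers" <<'EOF'
192.168.128.1:0x00000c07ac0a:vlan4:HP-UX Production
192.170.12.1:0X001ae24a1d00:vlan30: Replica Network
EOF

# vlan_for_mac MAC -> prints the vlan for a router MAC (case-insensitive match)
vlan_for_mac() {
    want=$(echo "$1" | tr '[:upper:]' '[:lower:]')
    while IFS=: read -r ip mac vlan desc
    do
        mac=$(echo "$mac" | tr '[:upper:]' '[:lower:]')
        if [ "$mac" = "$want" ]
        then
            echo "$vlan"
            return 0
        fi
    done < "$routers"
    return 1
}

vlan_for_mac 0x001AE24A1D00
```

Lowercasing both sides keeps the 0X and 0x spellings in the data file from causing a missed match.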

The script:

--begin script--

DF="super.translate.dat"
MA="router.macadd.dat"

typeset MYDIR=/var/tmp/syslayout
typeset MYPAGE=mypage
typeset MYDATA=mydata
typeset IDX_HTML=syslayout.html

# Write each argument as one table cell, then start a new row.
writehtml (){
    while [ $# -gt 0 ]
    do
        echo "<td>${1}</td>" >> ${IDX_HTML}
        shift
    done
    echo "<tr>" >> ${IDX_HTML}
}

cat -<<!EOF > ${IDX_HTML}
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<title>Dana IT Unix System documentation</title>
<BODY>
<TABLE style="WIDTH: 100%; COLOR: rgb(0,0,0); TEXT-ALIGN: left" cellSpacing=2
cellPadding=2 border=0>
<TBODY>
<TR>
<td width="200"><img
style="border-width: 0px; margin: 0px; padding: 0px;" alt="Dana"
src="dana_logo.jpg"> </td>
<td style="font-weight: bold;"><big><big>Dana
IT
Unix:
Documentation</big></big></td>

<TD style="VERTICAL-ALIGN: top; TEXT-ALIGN: center" colSpan=7><BIG
style="FONT-FAMILY: helvetica,arial,sans-serif"><BIG>Dana IT system I/O Layout.</BIG></BIG><BR></TD></TR>
!EOF

# column layout # Path       slot MAC Address    lan  ipaddress      vlan   linkstatus

sysname=$(uname -n)
this_cell=$(vparstatus -p $sysname -v |awk '/Boot processor/ {print $4}' |awk -F'\.' '{print $1}')

# echo $this_cell

this_par=$(parstatus -c 0 |awk '/cell'$this_cell'/ {print $9}')

nparname=$(/usr/sbin/parstatus -P |awk "/^$this_par/ {if(pname == 1) {print}};/Partition Name/ {pname=1}"|awk '/'$this_par' / {print $6}')

#echo "Diag nparname: ${nparname}"

complexname=$(/usr/sbin/parstatus -X |awk "/Complex Name/")
cellind="cell${this_par}"

nparinfo=$(/usr/sbin/parstatus -P |awk "/^$this_par/ {if(pname == 1) {print}};/Partition Name/ {pname=1}")
# model needs to be determined
OS=$(uname -r)
if [ "$OS" = "B.11.31" ]
then
    mod=$(model | awk '{ print $5 }')
else
    mod=$(model | awk -F/ '{ print $3}')
fi

hn=$(hostname)

this_cell=$(vparstatus -p ${hn} -v |awk '/Boot processor/ {print $4}' |awk -F'\.' '{print $1}')

echo $this_cell

this_par=$(parstatus -c 0 |awk '/cell'$this_cell'/ {print $9}')

/usr/sbin/parstatus -P |awk "/cell/ {if(pname == 1) {print}};/Partition Name/ {pname=1}"|awk '/'$this_par' / {print $6}'

hn=$(hostname)
lhn="${hn}.dana.com"
echo "Host name: ${lhn}"
writehtml "Host name:" ${lhn}
echo "Model number is: $mod"
writehtml "Model number:" ${mod}
#echo "<td>$complexname</td><td>$nparname</td><tr>" >>  ${IDX_HTML}
writehtml "${complexname}" ${nparname}
echo "Model number is: $mod"
pbootpath=$(parstatus -p 0 -V |awk -F: '/Primary Boot Path/ {print $2}')

echo "Primary boot path: ${pbootpath}"
writehtml "Primary boot path:" ${pbootpath}
#echo "<td>$complexname</td><td>$nparnam</td><tr>" >>  ${IDX_HTML}
# echo "<td>Path</td><td>slot</td><td>MAC Address</td><td>lan</td><td>IP Address</td><td>vlan</td><td>Link Status</td><tr>" >>  ${IDX_HTML}
writehtml Path slot MAC_Address lan IP_Address vlan Link_Status

#echo "$nparinfo"
# echo "Path        slot MAC         lan     check    ip"
# 2/0/5/1/0/6/1 4 0x002264E4948B lan1 10.8.128.162
echo  "Path       slot MAC Address    lan  ipaddress      vlan   linkstatus"

/usr/sbin/ioscan -fnk | awk '/^lan/ {print $3}' |while read -r path
do
    ip="";
    echo $path  | sed 's/\// /g' | read p1 p2 p3 p4 p5 p6 p7
    macaddy=$(lanscan | awk '{if($1 == "'${path}'") print $2}')
    lanid=$(lanscan | awk '{if($1 == "'${path}'") print $5}')
    plan=$(lanscan | awk '{if($1 == "'${path}'") print $3}')
    lchk=$(/usr/sbin/linkloop -i $plan $macaddy 2>/dev/null | grep "OK")
    # If linkloop produces positive results then see if there is an ip address
    ip="IP not set"
    if [ -n "$lchk" ]
    then
        # echo "lchk not null. running ifconfig command"
        ip=$(/usr/sbin/ifconfig $lanid | grep netmask | awk '{print $2}')
    fi
    # roll through the router table and see if you can establish
    # linkloop with the gateway
    DRMAC="No link.."
    DVLAN="Not found"
    #while [[ "$value" != "val1" || "$value" != "val2" || "$value" != "val3" ]]
    while read -r DL
    do
        rmacaddy=$(echo $DL | awk -F: '{print $2}')
        rvlan=$(echo $DL | awk -F: '{print $3}')
        rlchk=$(/usr/sbin/linkloop -i $plan $rmacaddy 2>/dev/null | grep "OK")
        if [ -n "$rlchk" ]
        then
            # echo "rlchk not null. setting vlan information."
            DRMAC=${rmacaddy}
            DVLAN=${rvlan}
            break;
        fi
    done < $MA

    #  echo "${path} ${p7} ${macaddy} ${lanid} ${lchk} ${ip} "
    #p1=$(echo $path | awk -F/ '{print $1}');
    #p2=$(echo $path | awk -F/ '{print $2}');
    #p3=$(echo $path | awk -F/ '{print $3}');
    #p4=$(echo $path | awk -F/ '{print $4}');
    #p5=$(echo $path | awk -F/ '{print $5}');
    #p6=$(echo $path | awk -F/ '{print $6}');
    #p7=$(echo $path | awk -F/ '{print $7}');

    # The third path component selects the row in the translation table.
    portpath=$(echo $path | awk -F/ '{print $3}')
    actualport=$(awk -F: '{if($2 == "'${portpath}'" && $3 == "'$mod'") print $1}' ${DF})

    # echo "Actual path: ${p1} ${p2} ${p3} ${p4} ${p5} ${p6} ${p7}  ${actualport} ${ip} ${macaddy} ${lanid} ${ip}"
    echo "${path} ${actualport} ${macaddy} ${lanid} ${ip}   ${DVLAN}   ${DRMAC}"
    # echo "<td>${path}</td><td>${actualport}</td><td>${macaddy}</td><td>${lanid}</td><td>${ip}</td><td>${DVLAN}</td><td>${DRMAC}</td><tr>" >> ${IDX_HTML}
    # Quoted so that values with spaces ("IP not set") or empty values stay in one cell.
    writehtml "${path}" "${actualport}" "${macaddy}" "${lanid}" "${ip}" "${DVLAN}" "${DRMAC}"
done

echo "Fiber Channel...."
# echo "<td>Fiber Channel....</td><tr>" >> ${IDX_HTML}
writehtml  "Fiber Channel"
echo "PATH       slot Device... Status spd Hardware address"
# echo "<td>PATH</td><td>slot</td><td>Device</td><td>Status</td><td>speed</td><td>Hardware address</td><tr>" >>  ${IDX_HTML}
writehtml PATH slot Device Status speed "Hardware address"
#/usr/sbin/ioscan -fnCfc | grep fcd | awk '{print $3}' |while read -r path
/usr/sbin/ioscan -fnk | awk '/^fc / {hw=$3;getline;print hw,$1}' |while read -r hw devfile
do
    #   echo "diag ${hw} dev file ... ${devfile}"
    port=$(echo $hw | awk -F/ '{print $3}')
    OSTAT=$(fcmsutil $devfile | awk '/ONLINE/  {print $4}')
    LSPD=$(fcmsutil $devfile | awk '/Link Speed/  {print $4}')
    WWN=$(fcmsutil $devfile | awk '/N_Port Port World Wide Name/  {print $7}')
    #  OSTAT=$(fcmsutil /dev/fcd1 | awk '/ONLINE/  {print $4}')
    #  LSPD=$(fcmsutil /dev/fcd1 | awk '/Link Speed/  {print $4}')
    #  WWN=$(fcmsutil /dev/fcd1 | awk '/N_Port Port World Wide Name/  {print $7}')
    actualport=$(awk -F: '{if($2 == "'${port}'" && $3 == "'$mod'") print $1}' ${DF})
    echo "$hw ${actualport} $devfile ${OSTAT} ${LSPD} ${WWN}"
    #echo "<td>$hw</td><td>${actualport}</td> <td>$devfile</td><td>${OSTAT}</td><td>${LSPD}</td> <td>${WWN}</td><tr>" >>  ${IDX_HTML}
    # Quoted so that an empty value still produces its own cell.
    writehtml "${hw}" "${actualport}" "${devfile}" "${OSTAT}" "${LSPD}" "${WWN}"
done

#awk -F: '{printf("%8s %5s %4s\n",$1,$3,$4)}' steve
#2/0/5/1/0/6/0
#2/0/5/1/0/6/1

cat -<< !EOF >> ${IDX_HTML}
<TR></TR></TBODY></TABLE></BODY></HTML>
!EOF

chmod a+r ${IDX_HTML}
# Added to copy the data file to my home directory for diagnosis.
cp syslayout.html /home/sprotte
chmod a+r /home/sprotte/syslayout.html

--end script--

Your mileage may vary. You will have to customize this script.

The script writes its HTML output to syslayout.html.
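If you customize the HTML output, one option is a row writer that closes each row and escapes the cell text. This is only a sketch, not what syslayout.sh does today; the function names html_escape and html_row are made up for illustration:

```shell
# html_escape TEXT -> prints TEXT with &, < and > replaced by HTML entities
html_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# html_row CELL... -> prints one <tr> with each argument as an escaped <td>
html_row() {
    printf '<tr>'
    for cell in "$@"
    do
        printf '<td>%s</td>' "$(html_escape "$cell")"
    done
    printf '</tr>\n'
}

html_row "Primary boot path:" "2/0/5/1/0.6.0" "IP not set"
```

Because the function iterates over "$@", a quoted argument such as "IP not set" stays in a single cell.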


30 Sep 09 SD-UX Locked. Diagnostic steps.

Problem: After being Ignited, superman lost most SD-UX functionality.

Note: superman (not its real name) is a vPar running on a Superdome complex. Only swlist works; swreg -l depot, swinstall -i and swverify all fail with the same error.

ERROR:   "superman/":  You do not have the required permissions to
         select this target.  Check permissions using the "swacl"
         command or see your system administrator for assistance.  Or,
         to manage applications designed and packaged for nonprivileged
         mode, see the "run_as_superuser" option in the "sd" man page.
       * Target connection failed for "zrtph0v0:/".
ERROR:   More information may be found in the daemon logfile on this
         target (default location is
         superman:/var/adm/sw/swagentd.log).
       * Selection had errors.

Standard techniques say to restart the agent daemon and check the basics:

/sbin/init.d/swagentd stop

/sbin/init.d/swagentd start

Check /etc/hosts networking is consistent.

Make sure /etc/nsswitch.conf is present and makes sense.

Check permissions on /var/tmp and all the swagent files.

None of this worked.

Running swlist -i -s $PWD in a depot also generated an error. The error text below is taken from ITRC because the system is already fixed.

swacl -l host @ superman

Listing the ACL with swacl generates this:

Util_Random internal error:  Read of /dev/urandom failed, rv=-1, size=8, No such device (19).

There were a series of other errors, all pointing to /dev/urandom.
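A quick way to confirm this kind of failure is to try reading a few bytes from the device yourself. A minimal sketch; the function name can_read_random is made up, and the 8-byte read size simply mirrors the size=8 in the error above:

```shell
# can_read_random [DEVICE] -> returns 0 if 8 bytes can be read from the device
can_read_random() {
    dev="${1:-/dev/urandom}"
    # Count the bytes dd actually delivered; a failed open or read yields 0.
    n=$(dd if="$dev" bs=8 count=1 2>/dev/null | wc -c | tr -d ' ')
    [ "$n" -eq 8 ]
}

if can_read_random /dev/urandom
then
    echo "/dev/urandom: readable"
else
    echo "/dev/urandom: read failed"
fi
```

On a system in the broken state described here, the read would be expected to fail even though the device file exists.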

lsdev showed that the kernel module rng (Random Number Generator) behind /dev/urandom was not loaded.
Detail    root      /usr/sam/tui/kc/modulemod.sh rng
Detail    root      /usr/sbin/kcmodule -a -P ALL

The output below is normal. Before the system was fixed, lsdev did not show the module running.

lsdev | grep rng

138          -1         rng             pseudo
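The same check can be scripted so it returns a status instead of a line to eyeball. A small sketch that reads lsdev-style output on standard input and tests the third (driver) column; the function name driver_listed is hypothetical, and the sample line is the one shown above:

```shell
# driver_listed DRIVER -> reads lsdev-style output on stdin,
# returns 0 if DRIVER appears in the third (driver) column
driver_listed() {
    awk -v drv="$1" '$3 == drv { found = 1 } END { exit (found ? 0 : 1) }'
}

# Sample line copied from the output above; on HP-UX: lsdev | driver_listed rng
sample='138          -1         rng             pseudo'

if printf '%s\n' "$sample" | driver_listed rng
then
    echo "rng driver is configured"
else
    echo "rng driver is missing"
fi
```

Matching on the column rather than with grep avoids false hits on drivers whose names merely contain the string rng.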

The fix was to unload the rng module from the kernel (using sam SEP cheats) and then load it again. In spite of the module being listed as dynamic, a reboot was required to restore SD-UX functionality.

Actual source of the problem: The Ignite image of supergirl did not exclude the /dev/ "files". This caused the wrong kernel module to be loaded with the /dev/urandom "file" driver. Normally this is not a problem because /dev is recreated, but for some reason /dev/urandom was not loading the kernel module rng.

Ignite excludes have been updated to exclude these files and the system will be re-ignited to make sure nothing else bad happens.


09 Sep 09 Case Study: Capacity & Migration planning for a small organization

This is our first case study. The events leading up to it occurred between 1998 and 2002. It is a real-life case study based on my experience. For legal reasons, I cannot identify the organization. It is a charity that now raises around $100 million; 92% of funds raised go to actual charitable work and 8% is overhead. IT infrastructure is overhead, even though it is critical to actually raising funds.

From 1991-2005 I worked at this charity in IT, first as a programmer analyst, then as a DBA, finally becoming the backup Unix Admin in 1998 and the full-time Unix Admin in 2000. The organization ran its legacy fund raising systems on a pair of D class HP-UX systems. The back end database was Software AG Adabas. The fund raising user community wanted an SQL-like ability to look into the database and run queries. They wanted flexible use of strategic data. An attempt was made in early 1997 to install an SQL front end, but it did not provide acceptable results.

An internal study was done and it was decided in late 1997 to migrate legacy systems to a web based front end, with Oracle as the back end database, Oracle Application Server using forms and reports to build applications. Initially no plan was made to migrate to stronger hardware, due to the assurance from Oracle that their software would run on the existing infrastructure.

By 2000 it was obvious that this was not true. Though the database server itself ran acceptably, there was not sufficient memory or disk capacity to run the application server. So I was asked to prepare a plan to migrate legacy systems. Here were the guidelines:

  • To run three environments, to be described below, each with a database server, an application server and forms and reports development tools on them.
  • Sandbox was to be used to test OS patches, Oracle patches, and tools upgrades. It was to belong to the systems administrator who was permitted to restart this system on short or no notice.
  • The development environment was to be where the developers were to develop code. It needed to be stable and available 100% during normal development hours, 8 a.m. to 6 p.m. Any changes made to this system were first to be vetted on the sandbox system.
  • The production system had the same uptime requirements, except that all changes needed to be vetted first on the other two systems.
  • The hardware was to be the same model for all the systems. This was defined to avoid hardware surprises. Only the production system needed to be at full capacity. The other systems were to be the same to permit realistic load testing.
  • Databases would be hosted on SAN disk with an HBA fiber channel connection. Systems were to boot locally.

Overall, I thought this was a solid foundation. Some of the points were made by management, some were suggested by me.

The following basic technical requirements were developed:

  • The overall database needed to be approximately 5 GB per server. Actual use hit 15 GB by 2005. This growth factor was planned.
  • Oracle Server: one instance had to run on each server.
  • Oracle Application Server: one instance had to run on each server.
  • Legacy applications (Natural/Software AG Adabas) needed to run on each server.
  • Server configuration needed to be managed and tracked responsibly.
  • HP-UX bi-annual updates needed to be installed on a timely basis after quality assurance.
  • The replacement cycle on hardware would be 3-5 years to maintain the cost savings provided by being under warranty (first three years).
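The planned growth from 5 GB to 15 GB can be turned into an annual rate for capacity planning. A rough sketch, assuming the growth happened over three years (system delivery in 2002 to 2005); the text does not say when the 5 GB baseline was measured, so the span is an assumption, and the function name annual_growth_pct is made up:

```shell
# annual_growth_pct START END YEARS -> compound annual growth rate in percent
annual_growth_pct() {
    awk -v s="$1" -v e="$2" -v y="$3" \
        'BEGIN { printf "%.1f\n", (exp(log(e / s) / y) - 1) * 100 }'
}

# 5 GB growing to 15 GB over an assumed 3 years
annual_growth_pct 5 15 3    # prints 44.2
```

A longer assumed span gives a lower rate, so the result is only as good as the guess about the baseline year.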

Deployment Diagram

Server Deployment

Other Relevant facts on the decision making process.

  • HP Hardware and Software agreements were running over $30,000 per year on existing infrastructure.
  • Much of the cost was hardware support due to the age and near obsolescence of the hardware.
  • Significant savings could be obtained by using current hardware that was under warranty.
  • Systems would be configured and used to provide a disaster recovery solution.

Three vendors were picked to provide proposals. All ended up recommending HP-9000 L2000 (later renamed rp5450) servers. Here are the highlights:

  • rp5450 systems with 2 GB system memory.
  • Dual 146 GB disks to serve as boot disks with software mirroring.
  • 2 CPUs would be installed per server.
  • Memory capacity and purchase were planned to enable an upgrade to 8 GB without replacing existing memory.
  • Two HBA Fiber channel cards provided per machine to provide redundancy and failover.
  • A capital budget request was made showing that support cost savings would, over the course of 4 years, completely recover the cost of the systems.
  • Systems would each have an Ultrium tape drive, for locally provided backups and Ignite-UX make_tape_recovery backups as part of the DR plan.
  • Systems had two Gigabit Network Interface cards.
  • Systems would have a private network for use in Ignite backup, recovery and system replication.
  • Systems were to be delivered with HP-UX 11.11
  • An HP-provided rack, UPS and PDU were specified.

How it went:

  • Systems were delivered in May of 2002.
  • Initial OS install began immediately. Systems were initially delivered with HP-UX 11.00. We delayed start of installation until correct media was provided.
  • All three systems were installed with a base OS to ensure that hardware was working.
  • OS patch requirements for Oracle, security and bi-annual updates were installed on the sandbox. It was decided that an Ignite Golden Image would be used to replicate the sandbox configuration, once a stable configuration was found.
  • Significant problems were encountered with the Oracle and Oracle Application Server installations. The version was changed twice. Several major Oracle patch sets had to be installed to deal with “show stopper” bugs that were encountered.
  • After the September 11 attacks in New York City in 2001, a security review was conducted and the deployment plan was modified to include improved security. Several rounds of patching and tools testing occurred on the OS level.
  • In December of 2002, the application development team notified us that they were satisfied with the sandbox and asked that an Ignite image be made and transferred to the development system.
  • In January-February of 2003 Imaging was done and the system was replicated. There were OS problems with the Ignite replication that took several weeks to work out.
  • Several changes were requested by the development staff. They were tested on the sandbox and then deployed on the development system.
  • An Ignite central server was built on the sandbox to handle images, which were shared on NFS and available for use after booting from the sandbox Ignite configuration.
  • In June of 2003 after several change cycles the configuration was approved for deployment.
  • Ignite replication was completed on the production environment using the sandbox, which had been frozen for this purpose, as the image template.
  • In August of 2003 all legacy systems were cut over to the rp5450 systems. HR would be migrated 18 months later due to integration issues.
  • In early 2004, due to performance and memory use issues, all systems were upgraded to 8 GB of system RAM.
  • For the year 2004 there was no downtime in production systems during normal business hours.
  • Weekly Ignite tape backups were taken on all systems and network based backup to shared NFS was used as a secondary DR method.
  • In February of 2004 a DR test was run at the HP Performance center and we successfully migrated a sandbox image to an rp5470 server in the HP infrastructure. Legacy systems were tested and approved as functional.

Note: This document was designed entirely using the WordPress interface and a Linux system. The diagram was created with a free Linux alternative to Visio called Dia. The tool is in evaluation and might be replaced. Still a pretty good start. Cost to produce this environment in licensing fees? Zero dollars.

