Tips and Tricks site for advanced HP-UX Engineers

18 May 10 swlist command to provide install date

New trick learned from HP support backline engineer.

swlist -l fileset -a revision -a title -a state -a install_date

———Sample output ——
# vmGuestLib B.04.00 Integrity VM vmGuestLib 200903081306.51
vmGuestLib.GUEST-LIB B.04.00 Integrity VM GUEST-LIB fileset 200903081306.51 configured
# vmProvider B.04.00 WBEM Provider for Integrity VM vmProvider 200903081306.59
vmProvider.VM-PROV-CORE B.04.00 WBEM Provider for Integrity VM VM-PROV-CORE 200903081306.59 configured
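
The long number near the end of each line is the install date. As a minimal sketch, assuming the stamp is laid out as YYYYMMDDhhmm.ss (which is what the sample output suggests), a small shell function can make it easier to read. The name fmt_install_date is made up for this example and is not part of SD-UX.

```shell
# Hypothetical helper, not part of SD-UX.
# Assumes the install_date stamp is laid out as YYYYMMDDhhmm.ss.
fmt_install_date() {
    echo "$1" | sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\.\(..\)$/\1-\2-\3 \4:\5:\6/'
}

# Stamp taken from the sample output above.
fmt_install_date 200903081306.51
```

For the stamp above this prints 2009-03-08 13:06:51. A stamp that does not match the assumed layout is printed unchanged.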


26 Oct 09 HP-UX system update terminal session title bar

Nothing huge today.

I just wanted my HP-UX systems to update the terminal title bar when sessions come in from Linux systems or Putty.

——–begin code——

NAME="$(uname -n):${PWD}"
LEN=`echo "$NAME\c" | wc -c`
if [ "$TERM" = "xterm" ]
then
    PROMPT_COMMAND="\033]0;${NAME}\007"
    echo $PROMPT_COMMAND
fi
——–end code———–

Tested with Putty. Will test later today with Linux.
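
Here is a variant of the same idea as a function, offered as a sketch rather than a tested replacement. It uses printf so the escape sequence does not depend on how echo treats backslashes, and it matches any TERM value starting with xterm. The function name title_seq is made up for this example.

```shell
# Hypothetical helper: emit the xterm "set window title" sequence
# (ESC ] 0 ; title BEL) only when the terminal claims to be an xterm.
title_seq() {
    case "$TERM" in
        xterm*) printf '\033]0;%s\007' "$1" ;;
    esac
}

# Same title text as the snippet above: hostname and current directory.
title_seq "$(uname -n):${PWD}"
```

On any other terminal type the function prints nothing, so it should be safe to call from a shared profile.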


30 Sep 09 SD-UX Locked. Diagnostic steps.

Problem: After being Ignited, superman lost most SD-UX functionality.

Note: superman (not its real name) is a vPar running on a Superdome complex. Only swlist works; swreg -l depot, swinstall -i and swverify all fail with the same error:

 

 

ERROR:   "superman:/":  You do not have the required permissions to
         select this target.  Check permissions using the "swacl"
         command or see your system administrator for assistance.  Or,
         to manage applications designed and packaged for nonprivileged
         mode, see the "run_as_superuser" option in the "sd" man page.
       * Target connection failed for "superman:/".
ERROR:   More information may be found in the daemon logfile on this
         target (default location is
         superman:/var/adm/sw/swagentd.log).
       * Selection had errors.

Standard techniques say check:

/sbin/init.d/swagentd stop

/sbin/init.d/swagentd start

Check /etc/hosts networking is consistent.

Make sure /etc/nsswitch.conf is present and makes sense.

Check permissions on /var/tmp and all the swagent files.

None of this worked.

Running swlist -i -s $PWD in a depot also failed. The output below is taken from ITRC, because the system is already fixed.

Listing the host ACL:

swacl -l host @ superman

generates this:

Util_Random internal error:  Read of /dev/urandom failed, rv=-1, size=8, No such device (19).

There were a series of other errors all pointing to /dev/urandom

lsdev showed that the kernel module rng (Random Number Generator), which drives /dev/urandom, was not loaded.
Detail    root      /usr/sam/tui/kc/modulemod.sh rng
Detail    root      /usr/sbin/kcmodule -a -P ALL

The lsdev output below is normal. Before the system was fixed, lsdev did not show the module.

lsdev | grep rng

138          -1         rng             pseudo

The fix was to unload the rng module from the kernel (using sam SEP cheats).
Then we loaded it again. In spite of the module being listed as dynamic, a reboot was required to restore SD-UX functionality.

Actual source of the problem: the Ignite image of supergirl did not exclude the /dev/ “files”. This caused the wrong kernel module to be loaded with the /dev/urandom “file” driver. Normally this is not a problem because /dev is recreated, but for some reason /dev/urandom was not loading the kernel module rng.

Ignite excludes have been updated to exclude these files and the system will be re-ignited to make sure nothing else bad happens.
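
A quick check along these lines might have caught the problem before SD-UX started failing. This is only a sketch: the function name check_random_dev is made up, and it simply confirms that the device is a character device and that an 8 byte read (the same size as the failing read in the error above) actually returns data.

```shell
# Hypothetical check, not an SD-UX tool: confirm that a random device
# exists as a character device and really returns data when read.
check_random_dev() {
    dev="${1:-/dev/urandom}"
    if [ ! -c "$dev" ]; then
        echo "$dev: missing or not a character device"
        return 1
    fi
    # Read 8 bytes, the same size as the failing read in the error above.
    got=$(dd if="$dev" bs=8 count=1 2>/dev/null | wc -c | tr -d ' ')
    if [ "$got" -ne 8 ]; then
        echo "$dev: read returned $got bytes, expected 8"
        return 1
    fi
    echo "$dev: OK"
}

check_random_dev /dev/urandom || echo "check the rng kernel module"
```

Running it right after an Ignite recovery, before touching swinstall, is the kind of use I have in mind.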


09 Sep 09 Case Study: Capacity & Migration planning for a small organization

This is our first case study. The events leading up to it occurred between 1998 and 2002. It is a real-life case study based on my experience. For legal reasons, I cannot identify the organization. It is a charity that now raises around $100 million; 92% of funds raised go to actual charitable work and 8% is overhead. IT infrastructure is overhead, even though it is critical to actually raising funds.

From 1991-2005 I worked at this charity in IT, first as a programmer analyst, then as a DBA, finally becoming the backup Unix Admin in 1998 and the full time Unix Admin in 2000. The organization ran its legacy fund raising systems on a pair of D class HP-UX systems. The back end database was Software AG Adabas. The fund raising user community wanted an SQL-like ability to look into the database and run queries. They wanted flexible use of strategic data. An attempt was made in early 1997 to install an SQL front end, but it did not provide acceptable results.

An internal study was done and it was decided in late 1997 to migrate legacy systems to a web based front end, with Oracle as the back end database, Oracle Application Server using forms and reports to build applications. Initially no plan was made to migrate to stronger hardware, due to the assurance from Oracle that their software would run on the existing infrastructure.

By 2000 it was obvious that this was not true. Though the database server itself ran acceptably, there was not sufficient memory or disk capacity to run the application server. So I was asked to prepare a plan to migrate legacy systems. Here were the guidelines:

  • To run three environments, to be described below, each with a database server, an application server and forms and reports development tools on them.
  • Sandbox was to be used to test OS patches, Oracle patches, and tools upgrades. It was to belong to the systems administrator who was permitted to restart this system on short or no notice.
  • The development environment was to be where the developers were to develop code. It needed to be stable and available 100% during normal development hours, 8 a.m. to 6 p.m. Any changes made to this system were first to be vetted on the sandbox system.
  • The production system had the same uptime requirements, except that all changes needed to be vetted first on the other two systems.
  • The hardware was to be the same model for all the systems. This was defined to avoid hardware surprises. Only the production system needed to be at full capacity. The other systems were to be the same to permit realistic load testing.
  • Databases would be hosted on SAN disk with an HBA fiber channel connection. Systems were to boot locally.

Overall, I thought this was a solid foundation. Some of the points were made by management, some were suggested by me.

The following basic technical requirements were developed:

  • Overall database needed to be approximately 5 GB for server. Actual use hit 15 GB by 2005. This growth factor was planned.
  • Oracle Server, one instance had to run on each server.
  • Oracle Application server one instance had to run on each server.
  • Legacy applications Natural/Software AG Adabas needed to run on each server.
  • Server configuration needed to be managed and tracked responsibly.
  • HP-UX bi-annual updates needed to be installed on a timely basis after quality assurance.
  • The replacement cycle on hardware would be 3-5 years to maintain cost savings provided by being under warranty (first three years).

Deployment Diagram

Server Deployment

Other Relevant facts on the decision making process.

  • HP Hardware and Software agreements were running over $30,000 per year on existing infrastructure.
  • Much of the cost was hardware support due to the age and near obsolescence of the hardware.
  • Significant savings could be obtained by using current hardware that was under warranty.
  • Systems would be configured and used to provide a disaster recovery solution.

Three vendors were picked to provide proposals. All ended up recommending HP-9000 L2000 (later renamed rp5450) servers. Here are the highlights:

  • rp5450 systems with 2 GB system memory.
  • Dual 146 GB disks to serve as boot disks with software mirroring.
  • 2 CPUs would be installed per server.
  • Memory capacity and purchase was planned to enable an upgrade to 8 GB without replacing existing memory.
  • Two HBA Fiber channel cards provided per machine to provide redundancy and fail over.
  • A capital budget request was made showing that support cost savings would, over the course of 4 years, completely recover the cost of the systems.
  • Systems would each have an Ultrium tape drive for locally provided backups and Ignite-UX make_tape_recovery backups as part of the DR plan.
  • Systems had two Gigabit Network Interface cards.
  • Systems would have a private network for use in Ignite backup, recovery and system replication.
  • Systems were to be delivered with HP-UX 11.11
  • An HP-provided rack, UPS and PDU were specified.

How it went:

  • Systems were delivered in May of 2002.
  • Initial OS installation was meant to begin immediately, but the systems were delivered with HP-UX 11.00 rather than the 11.11 specified. We delayed the start of installation until the correct media was provided.
  • All three systems were installed with a base OS to ensure that hardware was working.
  • OS patch requirements for Oracle, security and bi-annual updates were installed on the sand box. It was decided that Ignite Golden Image would be used to replicate the sand box configuration, once a stable configuration was found.
  • Significant problems were encountered with the Oracle and Oracle Application Server installations. The version was changed twice. Several major Oracle patch sets had to be installed to deal with “show stopper” bugs that were encountered.
  • After the September 11 attacks in New York City in 2001, a security review was conducted and the deployment plan was modified to include improved security. Several rounds of patching and tools testing occurred on the OS level.
  • In December of 2002, the application development team notified us that they were satisfied with the sandbox and asked that an Ignite image be made and transferred to the development system.
  • In January-February of 2003 Imaging was done and the system was replicated. There were OS problems with the Ignite replication that took several weeks to work out.
  • Several changes were requested by the development staff. They were tested on the sand box and then deployed on the development system.
  • An Ignite central server was built on the sand box to handle images which were shared on NFS and available for use after booting of the sandbox Ignite configuration.
  • In June of 2003 after several change cycles the configuration was approved for deployment.
  • Ignite replication was completed on the production environment using the sandbox, which had been frozen for this purpose as the image template.
  • In August of 2003 all legacy systems were cut over to the rp5450 systems. HR would be migrated 18 months later due to integration issues.
  • In early 2004, due to performance and memory use issues, all systems were upgraded to 8 GB of system RAM.
  • For the year 2004 there was no downtime in production systems during normal business hours.
  • Weekly Ignite tape backups were taken on all systems and network based backup to shared NFS was used as a secondary DR method.
  • In February of 2004 a DR test was run at the HP Performance center and we successfully migrated a sandbox image to an rp5470 server in the HP infrastructure. Legacy systems were tested and approved as functional.

Note: This document was designed entirely using the WordPress interface and a Linux system. The diagram was created with a free Linux alternative to Visio called Dia. The tool is in evaluation, and might be replaced. Still a pretty good start. Cost to produce this environment in licensing fees? Zero dollars.


04 Sep 09 Creating Logical Volumes and Filesystems

Quick and Dirty Example here.

In our last example, we created a volume group, vg03. It had three disks; we expanded it to four because we planned proper capacity.

Our volume group now consists of 4 disks.

We are asked to create an approximately 10 GB file system in this SAN-based volume group.

vgdisplay /dev/vg03

vgdisplay -v /dev/vg03

< Insert vgdisplay example here>

HP vgdisplay documentation link (Note this tends to change. I can’t help it if HP breaks the links)

This will show an empty volume group as we have not created any logical volumes

pvdisplay /dev/dsk/c10t0d1

… repeat for other disks …

<Insert pvdisplay examples here>

HP pvdisplay document link

Make sure nothing is on them.
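
If you have many disks to check, the test can be scripted. The sketch below assumes that pvdisplay reports the number of used extents on a line labelled "Allocated PE"; the function name pv_is_empty and the numbers in the sample text are made up for illustration.

```shell
# Hypothetical helper: reads pvdisplay output on stdin and succeeds
# only if an "Allocated PE" line is present and its value is 0.
pv_is_empty() {
    awk '/Allocated PE/ { found = 1; if ($3 != 0) bad = 1 }
         END { if (found && !bad) exit 0; exit 1 }'
}

# Made-up excerpt for illustration. On a real system you would run:
#   pvdisplay /dev/dsk/c10t0d1 | pv_is_empty
printf 'Total PE                    2559\nFree PE                     2559\nAllocated PE                0\n' |
    pv_is_empty && echo "disk is empty"
```

The function fails when the line is missing altogether, so a pvdisplay error is not mistaken for an empty disk.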

Turns out 10 GB will fit quite nicely on a single disk. Since this is a SAN-based disk, we need not worry here about RAID configuration. If you are hosting an Oracle RDBMS, you should make sure the SAN admin sets up data, index and rollback as RAID 1 or RAID 10 to ensure good performance.

lvcreate /dev/vg03

# Creates an empty logical volume on vg03. Uses default naming.

You can also do it this way if you like names.

lvcreate -n mydata /dev/vg03

lvextend -L 10240 /dev/vg03/mydata /dev/dsk/c10t0d1

# This command extends the logical volume to approximately 10240 MB (10 GB) and defines the disk it goes on. Always define the disk. Don’t let LVM or SAM decide where your data is going to go. Plan in advance. Note that LVM for Linux, which is a feature port and not a binary recompile, does let you give the size as 10 GB instead of 10240 MB. Still waiting for that feature on LVM for HP-UX.
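
Since HP-UX wants the size in MB, the arithmetic is worth spelling out. This is a sketch with made-up function names; it assumes the volume group uses the default 4 MB extent size and that a size which is not a multiple of the extent size gets rounded up to the next whole extent.

```shell
# Hypothetical helpers for sizing arithmetic.
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

# Number of extents needed for a size in MB, rounded up.
# The second argument is the extent size in MB (assumed default: 4).
mb_to_extents() {
    pe=${2:-4}
    echo $(( ($1 + pe - 1) / pe ))
}

gb_to_mb 10           # size in MB to pass to lvextend -L
mb_to_extents 10240   # extents consumed at a 4 MB extent size
```

For 10 GB this gives 10240 MB and 2560 extents. Compare the extent count against the free extents shown by vgdisplay before you extend.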

newfs -F vxfs -o largefiles /dev/vg03/rmydata

# Why largefiles? Databases are big and the default limit on a file size in a file system is 2 GB. That is too small. I almost always set up my file systems these days for largefiles unless the file system itself is less than 2 GB.

# Create a mount point.

mkdir /mydata

# mount it.

mount /dev/vg03/mydata /mydata

# This does not set optimal JFS logging and recovery options, but that is a different article.

bdf

# See if it's there and the right capacity.

Next article: Edit /etc/fstab and set permanent mount options.

NOTE: This article needs to be checked and have vgdisplay and pvdisplay and other examples inserted into it.


04 Sep 09 Q4 Crash Dump Analysis to Analyze System Dump Files

What follows is a document I found on the forums. It can also be found on the docs.hp.com site, but this is a paraphrase, with some extra commentary.

1) You need to have foresight. Before you have a crash you must enable your system to save crash dumps.

2) vi /etc/rc.config.d/savecrash and set the first parameter to 1. Now when your system crashes, and some day it probably will, you can perform q4 analysis and send the results to HP. I think this document originated within HP. I have one written somewhere on the forums, but this one is better.
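
To confirm the setting without opening the file, something like the sketch below could be used. It assumes the first parameter in that file is named SAVECRASH; check your own copy of the file, since the name is my assumption here, and the function name savecrash_enabled is made up.

```shell
# Hypothetical check: succeeds if the given rc config file has an
# uncommented SAVECRASH=1 line. SAVECRASH is assumed to be the name
# of the first parameter in the file.
savecrash_enabled() {
    grep '^[[:space:]]*SAVECRASH=1' "${1:-/etc/rc.config.d/savecrash}" >/dev/null 2>&1
}

if savecrash_enabled; then
    echo "crash dumps will be saved"
else
    echo "savecrash is not enabled (or the file is missing)"
fi
```

A commented-out line such as "# SAVECRASH=1" does not count as enabled.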

USING Q4 TO ANALYZE SYSTEM DUMP FILES

————————————-

When a 11.X HP-UX system crashes, it saves a snapshot of RAM in swap and during the reboot, copies it into /var/adm/crash. Because these files are binary, a utility called “q4” is used to analyze them and create readable text from which the response center can determine the failure cause.

============================ STEP 1 ===========================

Dumps are normally saved to /var/adm/crash.

Verify you have a dump to analyze by doing:

# ll /var/adm/crash/cr*

You may see:

/var/adm/crash/crash.0/INDEX

/var/adm/crash/crash.0/vmunix.gz

/var/adm/crash/crash.0/image.0.1.gz

/var/adm/crash/crash.0/image.0.2.gz

/var/adm/crash/crash.0/image.0.3.gz

/var/adm/crash/crash.0/image.0.4.gz

^ your suffix may vary

The INDEX file and /etc/shutdownlog both contain the “panic” statement.
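
When several dumps have piled up, you usually want the newest one. A minimal sketch, assuming the dump directories are numbered crash.0, crash.1 and so on as in the listing above, and that the highest number is the most recent; the function name latest_crash_dir is made up.

```shell
# Hypothetical helper: print the highest-numbered crash.N directory
# under the given base directory (default /var/adm/crash).
latest_crash_dir() {
    base="${1:-/var/adm/crash}"
    n=$(ls -d "$base"/crash.* 2>/dev/null | sed 's/.*crash\.//' | sort -n | tail -n 1)
    [ -n "$n" ] || return 1
    echo "$base/crash.$n"
}

dir=$(latest_crash_dir) && ls -l "$dir" || echo "no crash dumps found"
```

The numeric sort matters once you get past crash.9, because a plain ls would list crash.10 before crash.2.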

============================ STEP 2 ===========================

The following commands must all be run from the dump directory:

  • cd to the dump directory, i.e.: cd /var/adm/crash/crash.0 (substitute your own dump directory)

  • # /usr/contrib/bin/gunzip vmunix.gz

(uncompresses the kernel file – may already be done)

  • # q4prep -p

(ignore the error if this was previously done)

  • Now type:

# q4 -p .

^ Notice this ‘dot’

This will put you at the q4 utility prompt: q4>

  • The next command will get you a “fingerprint” of what was going on on the system at the time of the failure.

  • If you are working with an HP RCE at this time, type the following line and read the results to him:

trace event 0

Otherwise, simply type this next line and continue.

trace event 0 > trace

  • At the prompt type: include analyze.pl (the last character is the letter “el”)

  • At the next prompt type: run Analyze AU >> ana.out

  • At the next prompt type: exit

============================ STEP 3 ===========================

Generate a patch list:

# swlist -l product PH\* > patch_list

Using the CALL ID as the subject, email patch_list, ana.out and possibly the trace file and what.out to : hpcu@atl.hp.com

NOTE: Max 3MB email size

To speed future calls of this nature, open a call with the Response Center and inform them that you will send email with the call ID as the subject. Then send the ana.out and patch_list file to the email address listed above.
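
Given the size limit on the email, it is worth adding up the attachments before sending. This is a sketch only: the function name fits_in_mail is made up, and it counts 3 MB as 3 * 1024 * 1024 bytes, which is my assumption about how the limit is measured.

```shell
# Hypothetical helper: succeed if the named files together fit under
# the mail size limit, taken here as 3 * 1024 * 1024 bytes.
fits_in_mail() {
    total=$(cat "$@" 2>/dev/null | wc -c | tr -d ' ')
    echo "total size: $total bytes"
    [ "$total" -le $(( 3 * 1024 * 1024 )) ]
}

# File names as used in the steps above; missing files count as empty.
fits_in_mail patch_list ana.out trace || echo "too big, compress before sending"
```

Mail encoding adds overhead on top of the raw file sizes, so leave yourself some margin.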


03 Sep 09 HP-UX Patch Designations

Ever wonder what those letters in HP-UX patches stand for?

PHCO – General Command and libraries patches
PHKL – Kernel patches.
PHNE – Networking Patches
PHSS – Sub System patches (Anything else)

Any of these patches can force you to do a reboot. Kernel patches almost always involve a reboot.
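
The list above translates directly into a lookup you can drop into a script. The function name patch_type and the patch number in the example are made up for illustration.

```shell
# Hypothetical helper: map a patch name to its category using the
# four-letter prefix from the list above.
patch_type() {
    case "$1" in
        PHCO_*) echo "commands and libraries" ;;
        PHKL_*) echo "kernel" ;;
        PHNE_*) echo "networking" ;;
        PHSS_*) echo "subsystem" ;;
        *)      echo "unknown" ;;
    esac
}

# Made-up patch number, used only to show the call.
patch_type PHKL_12345
```

Anything reported as kernel is a good hint to schedule a reboot window before installing.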
