AIX UNIX system administration

IBM AIX/UNIX system storage administration ksh/perl scripting

Wednesday, June 03, 2009

AIX 5.3 Commit Applied Software Updates failed

Problem:

An attempt to commit applied software updates failed with the following messages:

MISSING REQUISITES: The following filesets are requisites of one or more
of the selected filesets listed above. They are not currently installed
on the system. You should install these requisites to ensure that the
selected filesets function correctly. You MUST install these requisites
before committing the selected filesets.

bos.rte 6.1.0.0 # Base Level Fileset
devices.common.IBM.mpio.rte 5.2.0.50 # MPIO Disk Path Control Module
xlC.aix61.rte 9.0.0.1 # Fileset Update

<<>>

Solution:

Force-install the fileset devices.common.IBM.mpio.rte (note the -F option below):

# installp -a -d . -F devices.common.IBM.mpio.rte

+-----------------------------------------------------------------------------+
Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...

SUCCESSES
---------
Filesets listed in this section passed pre-installation verification
and will be installed.

Selected Filesets
-----------------
devices.common.IBM.mpio.rte 5.3.9.1 # MPIO Disk Path Control Module

<<>>

+-----------------------------------------------------------------------------+
BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...done
FILESET STATISTICS
------------------
1 Selected to be installed, of which:
1 Passed pre-installation verification
----
1 Total to be installed

0503-409 installp: bosboot verification starting...
installp: bosboot verification completed.
+-----------------------------------------------------------------------------+
Installing Software...
+-----------------------------------------------------------------------------+

installp: APPLYING software for:
devices.common.IBM.mpio.rte 5.3.9.1


. . . . . <<>> . . . . . . .
Licensed Materials - Property of IBM

5765G0300
Copyright International Business Machines Corp. 1995, 2008.

All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
. . . . . <<>>. . . .

Finished processing all filesets. (Total time: 20 secs).

0503-409 installp: bosboot verification starting...
installp: bosboot verification completed.
0503-408 installp: bosboot process starting...

bosboot: Boot image is 38319 512 byte blocks.
0503-292 This update will not fully take effect until after a
system reboot.

* * * A T T E N T I O N * * *
System boot image has been updated. You should reboot the
system as soon as possible to properly integrate the changes
and to avoid disruption of current functionality.

installp: bosboot process completed.
+-----------------------------------------------------------------------------+
Summaries:
+-----------------------------------------------------------------------------+

Installation Summary
--------------------
Name Level Part Event Result
-------------------------------------------------------------------------------
devices.common.IBM.mpio.rte 5.3.9.1 USR APPLY SUCCESS
devices.common.IBM.mpio.rte 5.3.9.1 ROOT APPLY SUCCESS


Now smit commit works fine.

Monday, May 25, 2009

Which Process Is Using the Most CPU Resources


Question


How can you determine which process is using up the most CPU time?


Cause



Answer

The following commands and tools can be used to find which process is using the most CPU resources.

1. topas -P



In the topas -P output (not reproduced here), the process called "cpu-eater" is the top consumer of CPU resources.

2. tprof -x sleep 10; vi sleep.prof
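
Where neither tool is convenient, plain ps output can be ranked by CPU percentage. The following is a minimal sketch, not part of the original answer: the function name top_cpu is illustrative, and it assumes the local ps accepts -o pcpu,pid,comm.

```shell
# top_cpu N: read "pcpu pid command" lines on stdin (first line is a header)
# and print the N busiest processes, highest %CPU first. N defaults to 5.
top_cpu() {
    n=${1:-5}
    awk 'NR > 1' | sort -k1,1nr | head -n "$n"
}

# Example (assumption: ps supports these -o field names on your system):
#   ps -eo pcpu,pid,comm | top_cpu 3
```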



bosboot fails with malloc error 0301-106

Problem(Abstract)

During or after an OS upgrade, bosboot fails with the following error:

0301-106 /usr/lib/boot/bin/mkboot_chrp the malloc call failed for size

0301-158 bosboot: mkboot failed to create bootimage.

0301-165 bosboot: WARNING! bosboot failed - do not attempt to boot device.


Cause


Environment

Recently upgraded AIX OS

Diagnosing the problem

Check the size of the PdDv.vc ODM class file, for example:

# ls -al /usr/lib/objrepos/PdDv*
-rw-r--r-- 1 root system 110592 Apr 14 11:42 PdDv
-rw-r--r-- 1 root system 200937472 Apr 14 11:42 PdDv.vc

Resolving the problem

bosboot uses the PdDv ODM class files to build device information into the boot image and pre-allocate memory for these devices. If the file is too large, malloc cannot satisfy the request, causing bosboot to fail.

The following instructions can be used to reduce the size of the PdDv.vc file:

# mkdir /tmp/objrepos
# cd /tmp/objrepos
# export ODMDIR=/usr/lib/objrepos
# odmget PdDv > PdDv.out
# cp /usr/lib/objrepos/PdDv /usr/lib/objrepos/PdDv.bak
# cp /usr/lib/objrepos/PdDv.vc /usr/lib/objrepos/PdDv.vc.bak
# export ODMDIR=/tmp/objrepos
# echo $ODMDIR
# odmcreate -c /usr/lib/cfgodm.ipl
# ls -l PdDv*
# odmadd /tmp/objrepos/PdDv.out
# ls -l PdDv*
# cp /tmp/objrepos/PdDv /usr/lib/objrepos/PdDv
# cp /tmp/objrepos/PdDv.vc /usr/lib/objrepos/PdDv.vc
# export ODMDIR=/etc/objrepos
# rm -rf /tmp/objrepos
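
The size check from the diagnosis step can be scripted before attempting bosboot. This is a rough sketch only: the function name file_exceeds and the 100 MB threshold are assumptions chosen for illustration, not values from the technote.

```shell
# file_exceeds FILE LIMIT_BYTES: succeed (exit 0) if FILE is larger
# than LIMIT_BYTES, fail otherwise.
file_exceeds() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -gt "$2" ]
}

# Example; 104857600 (100 MB) is an arbitrary illustrative threshold:
#   if file_exceeds /usr/lib/objrepos/PdDv.vc 104857600; then
#       echo "PdDv.vc is unusually large; bosboot may fail"
#   fi
```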

Using DBX and KDB to build stack traces

Question

I have a hung process, how can I get a stack trace of it?

Cause



Answer

NOTE: Not every process shown by ps -ef can have a stack trace built for it. Old processes tend to be paged out of memory eventually, and then neither dbx nor kdb can be used to examine the stack trace of that process.

DBX Stack Trace Instructions for building a stack trace on a hung process:

In order to use dbx, the customer must first have the fileset
bos.adt.debug installed.

Attach to hung process
1. Capture console output, enter:
script

2. Enter:
ps -ef | grep <process name>

3. Enter:
dbx -a <PID>

4. Format trace, enter:
where

5. Leave dbx, enter:
detach (Typing quit will kill the process)

6. To leave script, type exit.
The script will be named typescript and will be located in the current
working directory.


Steps to obtain thread stack trace using kdb
Using the alog process as an example.

1) Start script session to capture data:
# script /tmp/kdb.out

2) Find the process id and convert it into hexadecimal:

# ps -ef | grep alog
UID PID PPID C STIME TTY TIME CMD
root 1231 1 1 Jun 30 - 1:12 alog

Convert 1231 to Hexadecimal number
1231 converts to 4CF

3) Start kdb

# kdb

4) Locate the process while in kdb

(0)> p * | grep 4CF

Ex.
pvproc+013800 78 alog ACTIVE 004E036 004A01E
0000000002525400 0 0001

5) find initial thread
(0)> p (PSLOT) | grep pvthread [The pslot is the second column. In the above example, it is 78]

6) locate initial thread in 'p' output
example:

...
THREAD..... threadlist :EA005E00
...

7) list function stack for initial thread
(0)> f pvthread+005E00

8) Exit out of the script session
# exit
Data will be saved in /tmp/kdb.out.
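
For step 2, the decimal-to-hexadecimal conversion can be done with the shell's printf instead of by hand. A minimal sketch; the function name to_hex is only illustrative:

```shell
# to_hex PID: print a decimal PID as uppercase hexadecimal,
# the form used when searching the kdb process table.
to_hex() {
    printf '%X\n' "$1"
}

to_hex 1231   # the alog PID from step 2; prints 4CF
```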


The procstack command can also be used to print the stack of a process.

# ps -ef | grep alog
root 491752 450572 0 15:45:52 pts/4 0:00 alog

# procstack 491752
491752: alog
0xd0375da4 read(??, ??, ??) + 0x1a8
0x10001500 main(??, ??) + 0x11b0
0x10000198 __start() + 0x98

JFS2 Snapshot Quick Reference

Question
This document is a quick guide to using snapshots of JFS2 filesystems


Answer
The JFS2 snapshot command will create an image of a filesystem at a point in time, allowing the user to back up data from the snapshot rather than from the original filesystem. This allows backing up data without having to stop using it first.
The snapshot works on a "copy on write" basis. While the snapshot filesystem is being created, the source filesystem is quiesced to ensure a consistent copy, and only the filesystem structure is created at that point. Afterwards, whenever the source filesystem is modified, such as by a write or a delete, the original data is first copied into the snapshot filesystem.

Usually a snapshot filesystem will only need to be 2-6% of the size of the original filesystem, due to this copy-on-write feature.

* Creating a snapshot:
Find out the size of the filesystem:

# lsfs -q /origfs
Name          Nodename  Mount Pt  VFS   Size     Options  Auto  Accounting
/dev/fslv02   --        /origfs   jfs2  4194304  rw,cio   no    no
(lv size: 4194304, fs size: 4194304, block size: 4096, sparse files: yes, inline log: no, inline log size: 0, reserved: 0, reserved: 0, DMAPI: no, VIX: yes)

In the lsfs -q output the size is reported in 512-byte blocks, so in the above example the filesystem and logical volume are 2 GB in size (4194304 x 512 bytes). We'll make the snapshot filesystem about 204 MB (10% of the original).

# snapshot -o snapfrom=/origfs -o size=419430
Snapshot for file system /origfs created on /dev/fslv05
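
The block arithmetic behind these numbers can be sketched in the shell. The helper names blocks_to_mb and snap_blocks are made up for this illustration; both round down to whole units.

```shell
# blocks_to_mb BLOCKS: convert a count of 512-byte blocks to whole megabytes.
blocks_to_mb() {
    echo $(( $1 * 512 / 1048576 ))
}

# snap_blocks BLOCKS PERCENT: PERCENT of a size given in 512-byte blocks.
snap_blocks() {
    echo $(( $1 * $2 / 100 ))
}

blocks_to_mb 4194304      # size of /origfs: prints 2048 (2 GB)
snap_blocks 4194304 10    # 10% of /origfs: prints 419430
blocks_to_mb 419430       # snapshot size: prints 204
```

The second result is the value passed to -o size= in the snapshot command above.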

* Mounting a snapshot:
# mount -v jfs2 -o snapshot /dev/fslv05 /mysnap

* Finding out if a fs has a snapshot already:
# snapshot -q /origfs

Snapshots for /origfs
Current Location 512-blocks Free Time
* /dev/fslv05 419430 418662 Fri Apr 21 08:30:36 PDT 2006

* Deleting a snapshot:

# snapshot -d /dev/fslv05
rmlv: Logical volume fslv05 is removed

For further information see the man page for the snapshot command.

Friday, May 22, 2009

Replacing a disk in an SSA RAID5 Array


Environment

OS level: 4.3.x - 5.x
SSA Raid 5 Array

Problem

How do I replace a disk in an SSA RAID5 array?

Solution

1) If the disk has not been rejected from the array, enter smitty ssaraid and select the following:
--> Change Member Disks in an SSA RAID Array
--> Remove a disk from an SSA RAID Array
--> Select the array in question and remove pdisk#...

The following steps apply to both rejected and non-rejected disks:
2) Have the CE physically replace the disk (the CE should set it in service mode).
3) rmdev -dl pdisk# ; cfgmgr -vl ssar
4) Enter smitty ssaraid and select the following:
--> Change/Show use of an ssa physical disk
--> Change the new pdisk's "current use" to Array Candidate.
5) smitty ssaraid
--> Change Member Disks in an SSA RAID Array
--> Add a disk to an SSA RAID Array
--> Add the new pdisk definition to the array

At this point the array should be in a "Degraded" state. After the new disk is added, the array goes into a "Rebuilding" state and, once the rebuild operation is complete, a "Good" state. Progress can be monitored via:

smitty ssaraid
--> List Status of all Defined SSA RAID Arrays
--> These numbers will get smaller as the array rebuilds; once they all reach zero, the array should be in a "Good" state.

This menu does not update dynamically; you have to exit and re-enter it to see the progress.

Wednesday, May 06, 2009

Determine which process is using a specific network port on AIX with or without lsof

Method 1. Using lsof.

This is the easiest method if you have lsof installed.

# lsof -i:32876
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
oracle 135744 oracle7 13u IPv4 0x70bbe200 0t11 UDP loopback:32876

# ps -ef|grep 135744
oracle7 135744 1 0 Apr 21 - 2:45 ora_pmon_ftc_p01

Method 2. Using netstat and rmsock

# netstat -Aan|grep 9991
f100020000626b98 tcp4 0 0 *.9991 *.* LISTEN

# rmsock f100020000626b98 tcpcb
The socket 0x626808 is being held by proccess 200928 (sysscand).

# ps -ef|grep 200928
root 200928 1 0 Apr 21 - 0:01 /opt/sysscan/bin/sysscand
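
A plain grep 9991 also matches ports such as 19991 and any address containing those digits. As a hedged sketch (the function name pcb_for_port is hypothetical, and the column layout is assumed to match the netstat -Aan sample above), the PCB address can be picked out by matching only the port at the end of the local-address column:

```shell
# pcb_for_port PORT: read `netstat -Aan` output on stdin and print the PCB
# address (column 1) of every TCP socket whose local address ends in .PORT.
pcb_for_port() {
    awk -v port="$1" '
        $2 ~ /^tcp/ {
            n = split($5, part, "[.]")
            if (part[n] == port) print $1
        }'
}

# Example; the printed address is what gets passed to rmsock with tcpcb:
#   netstat -Aan | pcb_for_port 9991
```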


Method 3: Using netstat and kdb.

# netstat -Aan|grep 9991
f100020000626b98 tcp4 0 0 *.9991 *.* LISTEN

# kdb
The specified kernel file is a 64-bit kernel
Preserving 1418431 bytes of symbol table
First symbol __mulh
START END
0000000000001000 0000000003E0D050 start+000FD8
F00000002FF47600 F00000002FFDC940 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F100070F00000000 F100070F10000000 pvproc+000000
F100070F10000000 F100070F18000000 pvthread+000000
PFT:
PVT:
id....................0002
raddr.....0000000000724000 eaddr.....F200800030000000
size..............00040000 align.............00001000
valid..1 ros....0 fixlmb.1 seg....0 wimg...2
ERROR: Unable to acess nfs_syms

(0)> sockinfo f100020000626b98 tcpcb
.................
..............

The last few lines of the output show:

proc/fd: 49/0
proc/fd: fd: 0
SLOT NAME STATE PID PPID ADSPACE CL #THS

pvproc+00C400 49*sysscand ACTIVE 00310E0 0000001 00000000285C7400 0 0001



(0)> hcal 00310E0
Value hexa: 000310E0 Value decimal: 200928

(0)> quit

# ps -ef|grep 200928
root 200928 1 0 Apr 21 - 0:01 /opt/sysscan/bin/sysscand
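
The hexadecimal-to-decimal step that hcal performs inside kdb can also be done from the shell after leaving kdb. A small sketch; the function name hex_to_dec is illustrative:

```shell
# hex_to_dec HEX: print a hexadecimal PID (as shown in the kdb process
# table, without a 0x prefix) as a decimal number.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

hex_to_dec 00310E0   # the sysscand PID from the sockinfo output; prints 200928
```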

Wednesday, April 29, 2009

Backupios Fails with 0512-008 savevg

Problem(Abstract)
Backupios command fails with 0512-008 savevg: The mkvgdata command failed. Backup canceled.
Symptom
$ backupios -file /home/padmin/mksysb/ibm74vioa_mksysb -mksysb


/home/padmin/mksysb/ibm74vioa_mksysb doesn't exist.

Creating /home/padmin/mksysb/ibm74vioa_mksysb
Backup in progress. This command can take a considerable amount of time
to complete, please be patient...


Creating information file (/image.data) for rootvg.
0512-008 savevg: The mkvgdata command failed. Backup canceled.

/usr/bin/mkvgdata[1068]: -: more tokens expected


Cause
This error is caused by a user-created filesystem being mounted within rootvg.

Resolving the problem
Unmount all user-created filesystems in rootvg and re-run the backupios command.

Monday, April 27, 2009

Mail stuck in /var/spool/mqueue

Problem:
Thousands of messages are stuck in /var/spool/mqueue

Solution:

Manually test sendmail:

sendmail -v -q
Warning: .cf file is out of date: sendmail AIX5.3/8.13.4 supports version 10, .cf file is version 9

Running /var/spool/mqueue/n3RLGNaZ195422 (sequence 1 of 30)
dtuser... Connecting to local...
dtuser... Deferred: local mailer (/bin/bellmail) exited with EX_TEMPFAIL

Running /var/spool/mqueue/n3RLHQDC198652 (sequence 2 of 30)
dtuser... Connecting to local...
dtuser... Deferred: local mailer (/bin/bellmail) exited with EX_TEMPFAIL


Check the permissions of /var/spool/mail and /var/spool/mqueue:

ls -ld /var/spool/mail
drwxr-xr-x 2 bin mail 512 Jul 30 2007 /var/spool/mail
ls -ld /var/spool/mqueue
drwxrwx--- 2 root system 6376448 Apr 27 14:47 /var/spool/mqueue

The permissions of /var/spool/mail should be 775:

chmod 775 /var/spool/mail
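
The difference between the 755 mode shown earlier and 775 is the group write bit. As a rough sketch (the function name group_writable is made up here), that bit can be checked from the mode string that ls -ld prints:

```shell
# group_writable DIR: succeed if DIR's group has write permission, judged
# from the ls -ld mode string (e.g. drwxrwxr-x, where character 6 is the
# group write bit).
group_writable() {
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$(printf '%s' "$mode" | cut -c6)" = "w" ]
}

# Example:
#   group_writable /var/spool/mail || echo "/var/spool/mail lacks group write"
```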

The problem is now solved:

sendmail -v -q
Warning: .cf file is out of date: sendmail AIX5.3/8.13.4 supports version 10, .cf file is version 9

Running /var/spool/mqueue/n3RLYND7164390 (sequence 1 of 32)
dtuser... Connecting to local...
dtuser... Sent

Running /var/spool/mqueue/n3RLZQLo185340 (sequence 2 of 32)
dtuser... Connecting to local...
dtuser... Sent

Running /var/spool/mqueue/n3RLaUAI198578 (sequence 3 of 32)
dtuser... Connecting to local...
dtuser... Sent

Running /var/spool/mqueue/n3RLbWZV198618 (sequence 4 of 32)
dtuser... Connecting to local...
dtuser... Sent

Friday, April 24, 2009

Reconfigure the console on AIX

Problem:

When using vtmenu or mkvterm on the HMC to establish a console session, the console is blank.

Solution:


The steps below completely remove vsa0 and vty0 from the ODM so that, on
reboot, the system prompts you to set this terminal as your console.

If you have network access, you can do this from a telnet or ssh session.
If you have no network access, you will need to boot into Maintenance
Mode.

- List all vty's and tty's on system
# lsdev -Cc tty

- Delete all vty's and tty's from ODM

# odmdelete -q name=tty0 -o CuDv <---- run this command for all vty's and tty's
0518-307 odmdelete: 1 objects deleted.

- List all vsa's on system
# lsdev -Cc adapter | grep vsa

- Delete all vsa's from ODM

# odmdelete -q name=vsa0 -o CuDv <---- run this command for all vsa's and sa's
0518-307 odmdelete: 1 objects deleted.

# odmdelete -q attribute=syscons -o CuAt
0518-307 odmdelete: 1 objects deleted.

# bosboot -ad /dev/ipldevice
bosboot: Boot image is 23794 512 byte blocks.

# sync

# savebase

# shutdown -Fr

Define your console.


******* Please define the System Console. *******

Type a 2 and press Enter to use this terminal as the
system console.
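
When several ttys, vtys and vsas are listed, the repeated odmdelete commands can be generated rather than typed one by one. This is only a dry-run sketch: the function name odm_cleanup_cmds is hypothetical, and it prints the commands for review instead of running them. The device names below are examples; use the names reported by lsdev on your system.

```shell
# odm_cleanup_cmds NAME...: print (do not run) one odmdelete command
# per device name, in the form used in the steps above.
odm_cleanup_cmds() {
    for dev in "$@"; do
        printf 'odmdelete -q name=%s -o CuDv\n' "$dev"
    done
}

odm_cleanup_cmds tty0 vty0 vsa0
```

Review the printed commands before running any of them, since each one removes a device definition from the ODM.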
