Corruption of the utmp file shows up in two ways:
- The uptime and w commands show a time greater than 8000 days since the system was last booted.
- Users are shown as still logged in when in fact they are not.
Both types of corruption have many possible causes, because both AIX commands and third-party applications write to the utmp file.
Problem: uptime greater than 8000 days
Record 0 of the utmp file holds the system boot entry. If it is overwritten (normally by a third-party program), uptime reports a figure greater than 8000 days.
To correct the invalid boot time you must reboot the system. The utmp file is recreated with each boot.
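To watch for this symptom from a script, the day count can be pulled out of the uptime output and compared against the 8000-day figure. The sketch below is illustrative only: the function names uptime_days and uptime_suspicious are hypothetical, and it assumes the uptime output contains the word "up" followed by a number and "day" or "days", which should be confirmed on the system in question.

```shell
# uptime_days: read one line of uptime output on stdin and print the
# number of days, or nothing if the line carries no "N day(s)" part.
# Assumption: the output looks like "... up 9475 days, 3:12, ...".
uptime_days() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "up" && $(i + 2) ~ /^day/) { print $(i + 1); exit }
    }'
}

# uptime_suspicious: succeed (exit 0) when the reported uptime exceeds
# the 8000-day figure that signals an overwritten boot record.
uptime_suspicious() {
    days=$(uptime_days)
    [ -n "$days" ] && [ "$days" -gt 8000 ]
}
```

A possible use is `uptime | uptime_suspicious && echo "check record 0 of /etc/utmp"`.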
NOTE: The fwtmp command must first be installed. For AIX Version 4 and above, install bos.acct.
To attempt to discover who or what overwrote the first entry in the file, use the following command to create a readable version of the utmp file, then look at record 0:
/usr/sbin/acct/fwtmp < /etc/utmp > /tmp/out
A valid entry looks something like this:
system boot 0 2 0000 0000 818538505 Sat Dec 9 13:48:25 CST 1995
Instead of the system boot entry, you will probably find an entry like:
jones pts/2 19193 7 0000 0000 818683926 Mon Dec 11 06:12:06 CST 1995
This output means that the boot time stamp was overwritten by whatever program jones used to log in on pts/2. A program should never overwrite the first two entries in the utmp file. Ask jones which program was used for that login. The cause is almost always either a third-party program that writes to the utmp file incorrectly or a corrupted file system that returns invalid data.
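The check of record 0 can be scripted. The following is a minimal sketch, assuming the readable fwtmp output is supplied on standard input and that a healthy first record contains the text "system boot" as in the valid entry shown earlier. The function name check_boot_record is hypothetical.

```shell
# check_boot_record: inspect the first line (record 0) of fwtmp text
# output read from stdin.
# Exit status: 0 = still the boot entry, 1 = overwritten, 2 = no input.
check_boot_record() {
    awk '
        NR == 1 {
            seen = 1
            if ($0 ~ /system boot/) {
                print "ok: " $0
                status = 0
            } else {
                print "overwritten: " $0
                status = 1
            }
            exit
        }
        END {
            if (!seen) { print "empty input"; status = 2 }
            exit status
        }'
}
```

For example, `check_boot_record < /tmp/out` prints the first record with an "ok:" or "overwritten:" prefix.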
Problem: who or w show users logged in when they are not
When a user logs into the system, the /usr/sbin/getty program writes an entry in /etc/utmp like:
AIX Version 4:
sandy pts/23 pts/23 7 42300 0000 0000 818973357
[more data...]
Field #1 = user's name
Field #2 = /etc/inittab id
Field #3 = tty used to login on
Field #4 = type of entry
Field #5 = PID (process id)
The types of entries can be seen by examining the /usr/include/utmp.h file under ut_type. Type 7 is a USER_PROCESS.
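When reading fwtmp output it can help to translate the numeric type into its symbolic name. The sketch below uses the ut_type values as they are commonly defined in utmp.h on System V derived systems. Only types 7 and 8 are confirmed by this document, so the remaining values are an assumption to verify against /usr/include/utmp.h on the system. The function name ut_type_name is hypothetical.

```shell
# ut_type_name: print the symbolic name for a numeric ut_type value.
# Assumption: values follow the usual utmp.h definitions; check the
# local /usr/include/utmp.h before relying on them.
ut_type_name() {
    case $1 in
        0) echo EMPTY ;;
        1) echo RUN_LVL ;;
        2) echo BOOT_TIME ;;
        3) echo OLD_TIME ;;
        4) echo NEW_TIME ;;
        5) echo INIT_PROCESS ;;
        6) echo LOGIN_PROCESS ;;
        7) echo USER_PROCESS ;;
        8) echo DEAD_PROCESS ;;
        9) echo ACCOUNTING ;;
        *) echo "UNKNOWN($1)" ;;
    esac
}
```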
When a user logs out, it is the responsibility of the last process running to update the entry in the utmp file. After a logout, the entry should look like:
AIX Version 4:
pts/23 pts/23 8 42300 0000 0000 818973357
[more data...]
The user name is erased and the state is changed from 7 to 8 (DEAD_PROCESS).
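The who command shows only entries of type 7. If the last process exits without updating its entry, the entry stays at type 7 and the user still appears to be logged in.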
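Stale entries of this kind can be listed by checking whether the PID recorded in each type 7 entry still exists. This is a minimal sketch, assuming fwtmp output on standard input with whitespace-separated fields in the AIX Version 4 order shown above (name, inittab id, tty, type, PID). The function name find_stale_logins is hypothetical. It should be run as root, because kill -0 also fails for live processes that the caller has no permission to signal.

```shell
# find_stale_logins: read fwtmp text output on stdin and report
# USER_PROCESS (type 7) entries whose PID no longer exists.
# Assumption: fields are name, id, line, type, pid, in that order.
find_stale_logins() {
    while read -r name id line type pid rest; do
        # Only type 7 entries are shown by who.
        [ "$type" = "7" ] || continue
        # Skip records whose PID field is not a plain number.
        case $pid in
            ''|*[!0-9]*) continue ;;
        esac
        # kill -0 sends no signal; it only tests that the PID exists.
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "stale: user=$name line=$line pid=$pid"
        fi
    done
}
```

For example: `/usr/sbin/acct/fwtmp < /etc/utmp | find_stale_logins`.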
How to determine what program caused the corruption
- Set up auditing on writes to the utmp file.
- Have cron run the who command every minute and write the results to a file.
- When you notice corruption with the who or w command, check the cron output files to determine when the corruption occurred.
- Look in the audit log to determine what process was writing to the utmp file at the time the corruption occurred.
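The cron step above can be sketched as a small function that appends a timestamped copy of the who output to a log file, so that the minute in which the corruption first appears can be read off later. The function name who_snapshot, the WHO_CMD override, and the script path in the crontab comment are all hypothetical.

```shell
# who_snapshot LOGFILE: append a timestamp header and the current who
# output to LOGFILE.
# WHO_CMD is an optional, hypothetical override for the command to run
# (for example a full path); it defaults to "who".
#
# Possible crontab entry, assuming the function is wrapped in a script
# at this hypothetical path:
#   * * * * * /usr/local/bin/who_snapshot.sh
who_snapshot() {
    logfile=$1
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S')"
        ${WHO_CMD:-who}
    } >> "$logfile"
}
```

Each snapshot begins with a line starting with "===", so the log can be searched for the first snapshot in which a user who has already logged out still appears.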
This is an example of an audit log output:
event       login  status  time                      command
----------  -----  ------  ------------------------  -------
UTMP_WRITE  root   OK      Tue Dec 19 17:00:29 1995  telnetd
The example above shows that telnetd wrote to the file at 17:00:29.
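Once the cron output has narrowed the corruption down to a particular minute, the audit log can be filtered for writes in that minute. This is a minimal sketch, assuming the audit output has the column layout of the example above (event, login, status, a five-word time stamp, command) and is supplied on standard input. The function name utmp_writers is hypothetical.

```shell
# utmp_writers HH:MM: read audit log text on stdin and print the time,
# login and command of every UTMP_WRITE event whose time of day starts
# with HH:MM.
# Assumption: field 7 is the HH:MM:SS part of the time stamp and the
# last field is the command, as in the sample audit output.
utmp_writers() {
    minute=$1
    awk -v minute="$minute" '
        $1 == "UTMP_WRITE" && index($7, minute) == 1 {
            print $7, $2, $NF
        }'
}
```

With the sample line above as input, `utmp_writers 17:00` prints the time, login and command of the telnetd write.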
Known problems:
For fixes related to utmp corruption, install the latest level of the following filesets:
- bos.rte.misc_cmds
- bos.rte.tty
- devices.tty.rte
- bos.net.tcp.server
- bos.net.tcp.client