KDump - Ken's Dump Utility, A User's Guide

  1. Intended Audience
  2. Why it is
  3. What it is
  4. Where it is
  5. Karc, Ken's Archiver
  6. Exceptions
  7. Encryption
  8. The Bdump Package
    1. Bdump1: Schedule New Dumps
    2. Bdump2: Perform Dumps
    3. Bdump3: Prune Old Dumps
    4. Bdump4: Offsite Copies
    5. Bdump5: Catalogue Dumps
  9. Restoring an Entire Filesystem
  10. Recover User History

Intended Audience

This guide describes how an administrator of a unix-based system would set things up so that filesystems are backed up by kdump. Users of the system restore their own files with the recover utility and have no need for this document. A technical guide for kdump is also available.

Why it is

Ken's Dump Utility is designed to overcome some of the shortcomings associated with the standard Berkeley dump utility as it was used on the Uniform Access computers. One caveat up front: Ken's Dump Utility is not designed to do a system-level dump of the root partition. That should be done with mksysb or the equivalent. Kdump will work best with user filesystems, particularly ones that require periodic restores of selected files or directories. You can certainly use kdump to dump the root partition and restore lost files, but in case of a disaster, the root partition should be recreated through other means and then individual files extracted from the kdump dump images.

What it is

Ken's Dump Utility consists of a suite of utilities:

kdump
Kdump dumps individual filesystems across the network to a centrally maintained archiving system. The kdump utility takes the place of dump in the Berkeley system. Kdump does not read the raw device, it uses the normal filesystem interface. Advantage: It doesn't need to know about the filesystem structure. Disadvantage: It updates the access time of any file that it dumps and likely takes a performance hit.

krest
Krest retrieves files from kdump dump images. This corresponds to the restore utility in the Berkeley system.

karc
Karc controls the files in the centrally maintained archive. This can be used to generate lists of dump images, delete old dump images, etc. Karc can also be used to transfer individual files (bitstreams) to the archive system and retrieve them again.

recover
Recover is a cover for the krest utility. It presents a list of available dump images for a particular filesystem and then invokes krest to perform the restore.

bdump
Bdump is a suite of utilities that use kdump and karc to schedule filesystem dumps and thin out older ones that are no longer necessary.

Where it is

You can pick up a binary distribution for the following:

File                   Date        Architecture  Operating System
kdump_aixv43.tar.gz    2004-04-15  RS/6000       AIX version 4.3
kdump_aixv51.tar.gz    2004-04-15  RS/6000       AIX version 5.1
kdump_aixv52.tar.gz    2004-04-15  RS/6000       AIX version 5.2
kdump_du40.tar.gz      2001-12-12  Alpha         Digital Unix 4.0/Compaq Tru64
kdump_linux24.tar.gz   2004-04-15  Intel         Linux 2.4
kdump_redhat72.tar.gz  2004-04-15  Intel         Redhat 7.2
kdump_rhe3.tar.gz      2004-04-15  Intel         Redhat Enterprise 3

These tar files contain the following files:

-rw-r--r--   0 0     6492 Feb 26 09:10:30 2000 Readme
drwxr-sr-x   0 0        0 Feb 25 10:35:19 2000 local
drwxr-sr-x   0 0        0 Feb 26 11:09:07 2000 local/bdump
-rwxr-xr-x   0 0     1875 Feb 26 11:09:07 2000 local/bdump/bdump.sh
-rwxr-xr-x   0 0    40960 Feb 05 10:07:30 2000 local/bdump/bdump1
-rwxr-xr-x   0 0    24576 Oct 01 10:38:52 1999 local/bdump/bdump2
-rwxr-xr-x   0 0    32768 Oct 01 10:38:59 1999 local/bdump/bdump3
-rwxr-xr-x   0 0    32768 Oct 01 10:39:17 1999 local/bdump/bdump5
-rw-r--r--   0 0      810 Feb 25 10:44:22 2000 local/bdump/prune.sample
-rw-r--r--   0 0     1260 Feb 25 10:42:48 2000 local/bdump/sched.sample
drwxr-xr-x   0 0        0 Feb 25 10:13:55 2000 local/lib
drwx------   0 0        0 Feb 25 11:00:45 2000 local/lib/karc
-rw-r--r--   0 0     2976 Feb 05 08:48:52 2000 local/lib/karc/serv_keys
-rw-r--r--   0 0      323 Feb 25 10:26:37 2000 local/lib/karc/conf.sample
drwxr-sr-x   0 0        0 Feb 26 11:10:14 2000 local/bin
-rwxr-xr-x   0 0    57344 Feb 10 10:10:40 2000 local/bin/recover
-rwxr-xr-x   0 0   376832 Oct 05 14:09:43 1999 local/bin/lsc_keygen
drwxr-sr-x   0 0        0 Feb 25 10:35:19 2000 local/etc
-rwxr-xr-x   0 0   417792 Feb 10 10:17:01 2000 local/etc/karc
-rwxr-xr-x   0 0   425984 Feb 10 10:22:58 2000 local/etc/kdump
-rwsr-xr-x   0 0   450560 Feb 10 10:23:09 2000 local/etc/krest

Sources are also available upon request, although they may not be buildable out of the box without fudging portions of our maintenance and development system.

Karc, Ken's Archiver

Kdump uses the karc facility to transfer dump images to a central repository for later retrieval. The first step in getting kdump working is to get karc configured properly.

Karc's configuration files are stored in the /usr/local/lib/karc directory. The important files in this directory are:

/usr/local/lib/karc/conf (sample included)
The configuration file specifies the names of the servers and the port to connect to. It also specifies the default encryption mode that kdump uses when dumping files.

/usr/local/lib/karc/serv_keys (included)
The public keys for the server dæmons.

/usr/local/lib/karc/priv_keys (to be generated)
The private keys for this computer.

/usr/local/lib/karc/dump_keys (to be generated)
The keys necessary to decrypt private files on dump images.

Authorization and authentication to karc is accomplished through a public key mechanism. Each key has a name of the form:

karX_group_member

where X is the security level of your key as a hexadecimal digit. The higher the level, the more privileged the key. The group is an arbitrary character string representing the group that your computer is in, and member is your computer's host name. Individual hosts will have a level 2 key, only allowing them to access their own dump images. Members of a group of hosts will be at level 4, allowing access to all the dumps for hosts in that group. The administrator of a group will have a level 7 key, allowing additional keys to be added for the group.
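
For example, these key names illustrate the three levels (all but the first are hypothetical):

	kar2_USERS_frodo     level 2: host frodo, its own dump images only
	kar4_USERS_backup    level 4: any dump image in the USERS group
	kar7_USERS_gandalf   level 7: group administrator, may install new keys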

Use the lsc_keygen utility to generate your key and then send the public half of the key to the group administrator. For example:

	frodo% cd /usr/local/lib/karc
	frodo% lsc_keygen kar2_USERS_frodo priv_keys pub_keys
	frodo% /usr/ucb/mail -s "Here's my key" ken@u <pub_keys

At this point you can remove the pub_keys file. The priv_keys file contains your private key and should be kept secret. You should not do anything with that file that would cause it to be displayed on your screen or sent across the network in plain text. To remind yourself of the key names that you've generated, you can use:

	frodo% awk '{print $1}' /usr/local/lib/karc/priv_keys

This will merely print out the first token of each line, which is the key's name.

Once the public half of your key is installed on the server, you should be able to use the karc utility:

	frodo% karc -list
	List frodo/: No matches

If you're going to be the administrator for this group, the key you just created should have been a level seven key. Additional keys should be created on the host that will be using them and the public half of the key copied or emailed to the computer with the level 7 key. The administration computer can then add the keys to the server with:

	frodo% rcp newhost:/usr/local/lib/karc/pub_keys newfile
	frodo% karc -key 7 -newkey newfile
	frodo% rm newfile

Exceptions

The normal use of the kdump utility is to dump whole filesystems starting at the root level and going down. If a filesystem that you want to dump has files or subtrees that you don't want dumped, an exception list can be generated to exclude those files or subtrees from getting dumped. For example, if your computer has /tmp set up as a simple directory on your root filesystem rather than being a filesystem of its own, you would want to list it as an exception so it doesn't get dumped.

There are two ways that directory trees can be excluded from the dump image. Having a file named .nodump in a directory will automatically cause all files in that directory (except for the .nodump file itself) to be excluded from the dump image. If you just want to exclude certain files based on their names, an exception list is the way to go.
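
For example, to keep everything in a (hypothetical) scratch directory out of the dump image:

	frodo% touch /home/scratch/.nodump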

Exception lists are normally placed in a file called e_list that appears at the top level of the filesystem to be dumped. Each line in this file lists a file or subtree relative to the top level of the filesystem. Any file or directory matching an entry in the exception list won't be dumped by kdump. To reduce the overhead in processing these exception lists, only simple pattern matching is allowed. A single asterisk can be used to specify a wild card that will match any token at that level in the path name. Wild cards must be specified first in the exception list, ahead of any non-wild cards that they would otherwise match.

For example, putting the following e_list file in the top level of the root filesystem:

    */*/core
    */core
    core
    tmp
    var/tmp

will prevent kdump from dumping any file named core found at the top three levels of the filesystem, as well as the /tmp and /var/tmp directory trees, assuming they were all part of the root filesystem. If /var is a mount point for a separate filesystem and other directories within /var are to be dumped, a separate /var/e_list file containing:

    */core
    core
    tmp

would be required to prevent kdump from dumping /var/tmp and any core file in /var itself or in any of its immediate subdirectories.

Encryption

If you don't want your privates exposed to the world, you'll want to encrypt your files as they're sent across the network by kdump to the archive server. You should now create a dump_keys file containing your key:

	frodo% cd /usr/local/lib/karc
	frodo% lsc_keygen kar1_USERS_frodo_991201 /dev/null dump_keys

If you're going to have multiple computers in the same group that will be decoding the files on dump images made by other computers, each of those computers will need access to the public half of the keys used to encrypt those files. You can either copy the entries from the dump_keys file yourself or you can have the group administrator install that key onto the server. The down side of storing the key on the server is that the server administrator can then decrypt your private files. The up side, presuming you trust the server administrator, is that you don't lose your key when you lose your disk. In either case, it's a good idea to copy that key somewhere safe such as a floppy disk as well.

Note that the name of this key is arbitrary unless you decide to store it on the server for automatic retrieval. In that case it must adhere to the karX_group_member naming convention. A process can only retrieve a key from the server that has a level less than or equal to its own with a matching group (if its level is less than 10) and member (if its level is less than 3). Encryption keys should never be changed as you won't be able to decrypt dumps encrypted with the earlier version of the key. Rather, new keys should be created with a different name. Appending the date to the name of the key would be a good idea.
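
For example, under these rules (key names other than kar1_USERS_frodo_991201 are hypothetical):

	kar2_USERS_frodo    may fetch kar1_USERS_frodo_991201 (level, group and member all match)
	kar4_USERS_backup   may fetch kar1_USERS_bilbo_040415 (at level 4 the member need not match)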

The private half of this key is never used. Actually, all we need is a random text string and the lsc_keygen utility happens to generate a suitable string.

If you're going to have multiple hosts with multiple encryption keys shared between them, the files can be concatenated together with the unix cat command. It would probably be best to generate the keys on one computer with lsc_keygen, having it append to the existing file, and then distribute that file to the other hosts with scp or floppies or other secure mechanism. The content of the dump_keys file should not be copied across the net in clear text.
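
As a sketch, generating a second key on frodo for a hypothetical host bilbo and then distributing the combined file (this assumes lsc_keygen appends when dump_keys already exists, as described above):

	frodo% cd /usr/local/lib/karc
	frodo% lsc_keygen kar1_USERS_bilbo_040415 /dev/null dump_keys
	frodo% scp dump_keys bilbo:/usr/local/lib/karc/dump_keys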

After creating the key, change the Encrypt line in the /usr/local/lib/karc/conf file to reference it:

	Encrypt Private  kar1_USERS_frodo_991201

The Bdump Package

Once you have installed karc, kdump and krest, you can use the kdump command to dump filesystems and the krest command to restore files from dump images, but the recover utility requires some additional processing as provided by the bdump package.

The bdump package (so named because it is the second generation of the adump utility which got its name from Automated Dumps) consists of a script and several utilities that use karc and kdump to maintain a reasonable collection of dump images for each filesystem on a particular computer. The utilities and configuration files for the bdump package are stored in the /usr/local/bdump directory.
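
Putting it together, the whole cycle might be driven nightly from cron; a sketch, assuming the included bdump.sh script runs the phases in sequence:

	0 2 * * * cd /usr/local/bdump && ./bdump.sh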

Bdump1: Schedule New Dumps

The first step of the bdump suite is to generate a list of filesystems to dump. The bdump1 utility scans the mount table to find all the native filesystems. It then invokes a karc -list command to find the current list of dump images in the karc warehouse, and consults the /usr/local/bdump/sched configuration file to determine which filesystems are due to be dumped.

A typical sched file might appear as:

  #
  #  Bdump schedule file.
  #
  #  Syntax is <delim><pat><delim>{<white><type>"="<hours>"/"<pct>}
  #
  #  Where delim  Is some delimiter that does not appear in the pattern.
  #        pat    Is a regular expression to match the filesystems for this
  #               dump schedule (see ed(1)).
  #        white  Is space or tab characters.
  #        type   Is a single letter, usually F for full and I for incremental.
  #        hours  Is the minimum time to elapse before doing another dump
  #               at this level.  Specifying hours of -1 requests no dump.
  #        pct    Maximum fraction of the size of the previous dump at this
  #               level that the level n+1 dump can be before doing another
  #               dump at this level.
  #
  #  The schedule on the first pattern matching the filesystem will be used.
  #  The type=hours/pct fields are repeated for each dump level desired.  The
  #  last entry's fraction must be 1.0.

  '^/tmp'         F=-1/1.0
  '^/crash'       F=-1/1.0
  '^/mnt'         F=-1/1.0
  '^/inst.images' F=2880/0.01     I=12/1.0
  '^/$'           F=1440/0.3      I=12/1.0
  '^/usr$'        F=1440/0.3      I=12/1.0
  '^/var$'        F=1440/0.3      I=12/1.0
  '^/usr/local$'  F=1440/0.3      I=12/1.0
  '^/home$'       F=1440/0.3      I=12/1.0
  '^/wgt'         F=-1/1.0
  '^/wg'          F=1440/0.3      I=168/.9        I=12/1.0
  '^/sy'          F=1440/0.3      I=12/1.0
  '^/tulsa$'      F=1440/0.3      I=12/1.0
  '^/tulsa/src$'  F=1440/0.3      I=12/1.0

In the above, any filesystem starting with "/wgt" will not be dumped. The "/wg" filesystems will get a full dump at least every 1440 hours (60 days) or when the previous incremental level 1 dump is over 30% of the size of the last full. A level 1 dump will be done at least every 168 hours (1 week) or when the previous incremental level 2 dump is over 90% of the size of the last level 1 dump. A level 2 dump will be done every day (as long as the previous level 2 dump was done at least 12 hours earlier).

The process of scheduling a dump consists of simply putting a file into the /usr/local/bdump/todo subdirectory. The name of the file indicates the mount point of the filesystem to be dumped (with slashes replaced by underscores). The contents of the file consist of three fields: an integer specifying the level of the dump, the block device of the filesystem, and a comment (the rest of the line) to be assigned to the dump image.
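
For example, a hypothetical entry scheduling a level 1 dump of /home might be a file named _home containing:

	1 /dev/hd1 Scheduled level 1 dump of /home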

Bdump2: Perform Dumps

The second phase of the bdump suite is to perform the actual dumps. The bdump2 utility is invoked from within the /usr/local/bdump/todo directory. It invokes appropriate kdump commands to dump each filesystem listed.

Bdump3: Prune Old Dumps

The third phase of the dump process is to prune out old dump images that are no longer required. The bdump3 utility gets a list of all the dumps in the warehouse and then deletes the ones that are no longer necessary based on the schedule defined in its configuration file, /usr/local/bdump/prune. The idea behind the pruning is that you want to be able to restore files from the past, but the farther into the past you must reach, the less particular you are about getting a dump from a specific date. Thus if you want to restore a file that was deleted today, you want to be able to find it on last night's dump, but if you want to restore a file that was deleted last week, chances are that any day in the previous week would suffice. If you have to go back a month, any dump made that quarter might be okay. A typical pruning file might be:

  #
  #  Bdump pruning file.
  #
  #  Syntax is <delim><pat><delim>{<white><days>":"<keep>}
  #
  #  Where delim  Is some delimiter that does not appear in the pattern.
  #        pat    Is a regular expression to match the filesystems for this
  #               dump schedule (see ed(1)).
  #        white  Is space or tab characters.
  #        days   Is an integer specifying a count of days prior to now.
  #        keep   Is an integer specifying the maximum number of days one
  #               should go back before finding a dump of the specified
  #               file system.
  #
  #  The schedule on the first pattern matching the filesystem will be used.
  #  The days:keep fields are repeated.
  #

  '^/$'                   7:1     30:7
  '^/usr$'                7:1     30:7
  '^/var$'                7:1     30:7
  '^/usr/local$'          7:1     30:7
  '^/inst.images'         2:1     30:7
  '^/home$'               7:1     30:7
  '^/wg'                  7:1     14:3    30:7  60:30   370:100
  '^/sy'                  7:1     14:3    30:7  60:30
  '^/tulsa$'              7:1     14:3    30:7  60:30
  '^/tulsa/src$'          7:1     14:3    30:7  60:30

For the /wg disks, if you want to restore a file from 0-7 days ago you should only have to go back one day. If you're looking for a file that was deleted 8-14 days ago, you should be able to find it within 3 days. One that was deleted within the last two months (60 days) should be found on a dump no more than one month old (30 days). Anything up to 370 days ago will be found on the quarterly dumps, which are made every 90 days or so. It is assumed that nothing older than 370 days will need to be restored.

The pruning process will retain appropriate dump images to do the above restores plus any parent dumps of those dumps. Thus if a full dump of /usr was made in August and daily incrementals were done after that, at the end of December we would have the last seven daily dumps, one dump for each of the preceding three weeks and the full dump that was made in August.

Bdump3 simply creates a script of karc -erase commands which is then invoked by the bdump.sh script. If you've got a filesystem with special pruning needs, this phase can be replaced with whatever manual or automated process you desire.
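
The generated script is simply a series of erase commands, along the lines of:

	karc -erase seuss1.usr_local.1999.11.26.17.1
	karc -erase seuss1.usr_local.1999.12.03.14.1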

Bdump4: Offsite Copies

The bdump4 utility is no longer required. It made a second copy of selected dump images on tapes that were moved to an offsite location in case of a disaster. The karc process now automatically writes the dump images to two different sets of tapes at two different locations, making this phase unnecessary.

Bdump5: Catalogue Dumps

The next phase is to maintain a simple list of dump images. These lists are stored in files called b_list at the top level of each filesystem that was dumped, and are created by the bdump5 utility. The b_list file is used by the recover utility to determine where the filesystem divisions are and what dump images are available for that filesystem. When a user or the administrator of a system requests that a file be restored, recover starts at the indicated directory on the running system and works its way back up through the filesystem hierarchy (looking in the ".." directory at each level) until it finds a b_list file or reaches the mount point.
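
A rough sketch of that walk in shell, starting from a hypothetical directory:

	# approximate recover's search: walk upward from the requested
	# directory until a b_list file is found (the real utility also
	# stops at the filesystem mount point; this sketch stops at /)
	dir=/home/frodo/docs
	while [ ! -f "$dir/b_list" ] && [ "$dir" != "/" ]; do
	    dir=`dirname "$dir"`
	done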

Restoring an Entire Filesystem

If you've lost an entire filesystem due to a hardware error or some other disaster, you won't have the b_list file out there. In order to perform a restore of a wiped-out filesystem, you'll need to do a karc -list command and then invoke the krest utility manually. For example:

   #seuss1> karc -list seuss1.usr_local
   C-di      79.72  seuss1.usr_local.1999.11.19.08.0     F-Size
   C-di      22.41  seuss1.usr_local.1999.11.26.17.1     I-Aged From 99.11.19.08
   C-di      22.27  seuss1.usr_local.1999.12.03.14.1     I-Aged From 99.11.19.08
   C-di      83.56  seuss1.usr_local.1999.12.09.11.0     F-Size
   C-di      81.32  seuss1.usr_local.1999.12.15.20.0     F-Size
   C-di      76.68  seuss1.usr_local.1999.12.22.21.0     F-Size
   C-di      21.94  seuss1.usr_local.1999.12.24.09.1     I-Need From 99.12.22.21
   C-di      22.28  seuss1.usr_local.1999.12.25.15.1     I-Aged From 99.12.22.21
   C-di      22.31  seuss1.usr_local.1999.12.26.23.1     I-Aged From 99.12.22.21
   C-di      22.32  seuss1.usr_local.1999.12.27.17.1     I-Sync From 99.12.22.21
   C-di      22.34  seuss1.usr_local.1999.12.28.22.1     I-Aged From 99.12.22.21
   C-di      23.64  seuss1.usr_local.1999.12.31.17.1     I-Aged From 99.12.22.21
   #seuss1> mount /dev/lv04 /mnt
   #seuss1> cd /mnt
   #seuss1> krest seuss1.usr_local.1999.12.31.17.1

If all the files on the incremental dump image made on December 31st were selected to be restored, the krest utility would automatically go back to the full dump that was made on December 22nd. The end result will be that the filesystem will look pretty much as it did on December 31st, without the files that were deleted between December 22nd and the 31st reappearing.

Recover User History

To make life easier for users and system administrators, the recover utility will make use of a history database that keeps track of users' home directories. If the database doesn't exist, recover will do its darnedest with the /etc/passwd file to figure out where the files might be, but quite often the reason you're restoring files for some slob is because his account has been deleted. Six months after it's removed from /etc/passwd, the user comes wandering back looking for his files. How are you going to find them when they could be on any of several dozen filesystems, and he doesn't even remember when they were deleted?

The updhist utility accepts an input file that consists of text lines of blank delimited fields: a timestamp, the numeric uid, the login name, a single letter cluster code, and the path of the user's home directory (NONE if there isn't one). For example:

   1041620045   95784  rls2       a        /jr02/d40/rls2
   1041620407   95784  rls2       @        NONE
   1041621123  159972  bunny21    @        /bp12/d21/bunny21
   1041621128  159972  bunny21    D        /da39/d81/bunny21
   1041621611  159973  pq         @        /bp02/d55/pq
   1041621616  159973  pq         D        /da23/d15/pq
   1041621699  159974  fgarner    @        /mailer03/d90/fgarner
   1041622337  159975  rhg20      @        /bp09/d22/rhg20
   1041622342  159975  rhg20      D        /da35/d82/rhg20
   1041622825  159976  mackfiny   @        /ep07/d56/mackfiny

The cluster codes are single letter codes that are translated into something recognizable to the user with a simple "clusters" table. For example:
   H       Homer
   D       Dante
   A       Aagaard
   M       Mead
   J       Goodall
   S       Saul
   B       Becker
   V       Melville
   N       Servers
   @       UW Email
   W       Web Publishing
   X       Student Web
   T       Streaming Media
   C       Student Streaming Media
   b       MyUW.net
   a       MyUW.net Email
   c       MyUW.net Web

If whatever you've got that creates new accounts, deletes old accounts, or moves users' home directories from one filesystem to another can be modified to create log entries of the form above, those entries can be fed into the updhist utility regularly to generate and maintain the recover history database. This database should then be placed on an NFS mounted filesystem accessible to anyone who might need to restore their files.
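
For example, assuming updhist simply takes the log as a file argument (the log file name here is hypothetical):

	frodo% updhist /var/adm/homedir.log
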
Ken Lowe
Email -- ken@u.washington.edu
Web -- http://staff.washington.edu/krl/