FSTORE

FSTORE, File storing/archiving utility

What is FSTORE?

The FSTORE utility was designed to let the various hosts and workstations send their disk backups across the network for storage, with the ability to retrieve them at some later date. Statistics on how much each system has dumped are available as a table showing the number of gigabytes that were in use two weeks ago, followed by the number of gigabytes added over the following week and then the amount added on each of the last four days.

The FSTORE/BACKUP service consists of three elements:

    fstore      the client.
    backupd     the daemon.
    backupmgr   the database tweaker.

Going hand in hand with the BACKUP service is the bdump utility which is not, per se, part of the BACKUP service.

In normal operation, bdump, the automatic dump scheduling utility, will determine which filesystems need to be dumped and invoke the fstore client to transfer the contents of the dump images across the network to be stored in the BACKUP service's database. At some later time an operator or system administrator will invoke the fstore client to list out what dumps are available and to retrieve an actual dump image, piping it into the restore utility. The bdump utility will also take care of determining which dumps are obsolete and invoke the fstore client to remove them from the database.

1) Fstore, the client.

The fstore utility resides on each system that wants to take advantage of the BACKUP service. To make life easier, the utility is kept on the root filesystem, in the /etc directory. The idea is that a system is not going to be very useful without its root filesystem, so if that gets destroyed some extra work is necessary to get it back regardless of where fstore is. If you have a valid root filesystem with the fstore client on it, you should be able to restore the rest of the filesystems without too much hassle. If the fstore client lived on a filesystem that itself had to be restored using fstore, it would be too much hassle.

The syntax of the fstore command is:

      fstore -command filename [other parameters]

The commands are:

DUMP

Indicates that fstore should take its standard input and send it across the network to the daemon giving it the specified filename.

BDUMP

This version of DUMP buffers the data on disk before it is flushed to tape. BDUMP is the default.

TDUMP

This version of DUMP writes directly to tapes without going through the intermediate disk buffer.

RESTORE

Indicates that fstore should ask the daemon to send the contents of the specified file across the network which fstore will send to its standard output.

ERASE

Indicates that fstore should ask the daemon to forget everything about the specified file.

COMMIT

Indicates that fstore should ask the daemon to declare the specified file a completed thing. When dumping, the fstore client detects the end of the file by an EOF condition on its standard input. This may be due to the dump completing successfully or failing -- fstore has no way of knowing. The bdump utility can check the exit status of the dump process it invoked and then issue an "fstore -commit file" if it worked properly. If a system crashes while it is performing a dump, the file won't end up being committed. If a file remains in the uncommitted state for too long, it will be removed from the database automatically.
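
The dump-then-commit sequence described above can be sketched roughly as follows. This is only an illustration and not bdump's actual code: the function names, the run_pipeline hook and whatever dump command gets passed in are assumptions made for the example. Only the /etc/fstore location and the "-dump" / "-commit" invocations come from the text.

```python
import subprocess

FSTORE = "/etc/fstore"  # location given in the text


def shell_pipeline(dump_cmd, fstore_cmd):
    """Run `dump_cmd | fstore_cmd` and return the exit status of dump_cmd."""
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    store = subprocess.Popen(fstore_cmd, stdin=dump.stdout)
    dump.stdout.close()  # so dump gets SIGPIPE if fstore exits early
    store.wait()
    return dump.wait()


def dump_and_commit(dump_cmd, name,
                    run_pipeline=shell_pipeline, run=subprocess.call):
    """Store a dump image under `name`; commit only if the dump succeeded."""
    status = run_pipeline(dump_cmd, [FSTORE, "-dump", name])
    if status != 0:
        # Leave the file uncommitted; the daemon removes stale ones itself.
        return False
    return run([FSTORE, "-commit", name]) == 0
```

The two hooks exist only so the logic can be exercised without a real dump program or daemon; a caller would normally leave them at their defaults.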

COPY

Indicates that fstore should ask the daemon to make a copy of the specified file. DCOPY should be used rather than COPY if possible.

DCOPY

Indicates that fstore should ask the daemon to schedule a copy of the specified file. DCOPY will return immediately whereas COPY will wait for the copy to complete. The copy requested by DCOPY will be done when the backup daemon has idle time on its hands (fewer than five current processes running).

EJECT

Indicates that fstore should ask the daemon to eject the specified file for offsite storage.

RETURN

Indicates that fstore should tell the daemon that the specified file can be brought back onsite again.

SET

Indicates that fstore should ask the daemon to set a new password or new comment on the specified file.

LIST

Fstore asks the daemon to generate a list of files that match the specified filename pattern. Patterns are regular expressions as in ed(1). The default pattern is "host.*" where host is the current system.

The convention used to name files is: host.fs.lev.yy.mm.dd.hh where host is the system doing the dump, fs is the filesystem, lev is the level of the dump (0 for full, 1 for incremental), yy.mm.dd is the day the dump was made and hh is the hour the dump was started.
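
As an illustration of the naming convention, here is a small sketch that builds and takes apart such names. The helper names are made up for the example, and two things are assumed rather than taken from the text: that the date and hour fields are zero-padded to two digits, and that the filesystem field itself contains no dots.

```python
from collections import namedtuple

DumpName = namedtuple("DumpName", "host fs level yy mm dd hh")


def make_name(host, fs, level, yy, mm, dd, hh):
    """Build host.fs.lev.yy.mm.dd.hh (zero padding is an assumption)."""
    return "%s.%s.%d.%02d.%02d.%02d.%02d" % (host, fs, level, yy, mm, dd, hh)


def parse_name(name):
    """Split a dump file name back into its seven fields."""
    # Split from the right so that a dotted host name stays in one piece.
    host, fs, level, yy, mm, dd, hh = name.rsplit(".", 6)
    return DumpName(host, fs, int(level), int(yy), int(mm), int(dd), int(hh))
```

For example, make_name("stein2", "usr", 0, 93, 4, 1, 2) gives "stein2.usr.0.93.04.01.02", which the default LIST pattern "stein2.*" would match.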

The other, optional, parameters are:

-PASSWORD pwd

The default password is retrieved from the /etc/fstore.pw file. This file contains a 9 character word that is sent to the daemon on all requests. If the first 5 characters of the supplied password match the password associated with the specified file in the BACKUP database, the list command and restore commands will work. The rest of the commands require a match in all nine characters. The "make install" procedure for the client will generate an /etc/fstore.pw file by piping the output of the finger command to the sum command to generate an unpredictable (effectively random) 5 digit number. On the UA hosts I have prefixed this number with a common 5 letter nonsense word by hand to allow operations to restore files regardless of the system they were dumped on or are being restored to.
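
The two-level password rule can be sketched like this. The daemon's real comparison code is not shown in this document, so the function below is just a guess at the logic as described: a 5 character prefix match is enough for list and restore, and everything else needs all nine characters. The names are invented for the example.

```python
READ_ONLY = {"list", "restore"}  # commands needing only a 5 character match


def password_ok(command, supplied, stored):
    """Return True if `supplied` is good enough for `command`."""
    if command.lower() in READ_ONLY:
        return supplied[:5] == stored[:5]
    return supplied[:9] == stored[:9]
```

This is what makes the shared prefix useful: a password that agrees only in its first five characters can list and restore a file but cannot erase it.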

-SERVER host

The server parameter can be used to override the default sequence of searching for the daemon's host. By default fstore will try the hosts

               archive3.u.washington.edu
               archive2.u.washington.edu
               archive1.u.washington.edu

in that order. The archive3 name is in the nameserver database as a CNAME pointing to nineveh's SOCC interface. Archive2 points to nineveh's second ethernet interface and archive1 points to nineveh's primary ethernet interface. For efficiency, that's the order that the interfaces should be used. If a system can reach nineveh through the back-door network all is well. An Ultrix system such as stein2 which doesn't have a back-door interface will get a "Destination unreachable" error status back immediately and will proceed to the next name in its list of servers. An AIX system such as daffy, however, will ignore the "Destination unreachable" error status and wait for two minutes for the connection to time out. For AIX, fstore will send a "ding" datagram and wait a few seconds for a response to determine if the connection is valid. Not fool-proof, but a whole lot faster than waiting on the TCP timeout.
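
The search through the server list can be sketched as below. Note that this is not how fstore does it on AIX: it swaps in a short TCP connect timeout in place of the "ding" datagram probe, which is simpler to show. The port is left as a parameter because no port number is given here, and the function name and the connect hook are assumptions for the example.

```python
import socket

SERVERS = [
    "archive3.u.washington.edu",
    "archive2.u.washington.edu",
    "archive1.u.washington.edu",
]


def connect_first(port, servers=SERVERS, timeout=5.0,
                  connect=socket.create_connection):
    """Try each server in order; return (host, socket) for the first success."""
    for host in servers:
        try:
            return host, connect((host, port), timeout)
        except OSError:
            # Unreachable, refused, timed out or unknown name: try the next.
            continue
    raise OSError("no backup server reachable")
```

A -SERVER override would amount to passing a one-element list as servers.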

-PORT num

The port parameter can be used to tell the client to attempt to connect to a daemon on an alternate port. This could be used for debugging a new daemon someday.

-COMMENT "string"

A comment can be associated with a file in the database if specified on the fstore dump command when the file is created.

-CLASS string

The tapes in the BACKUP database have a class associated with them. When new files are written to tapes, fstore supplies a class of tapes to use. If tapes of that class already exist in the database, the daemon will pick them to write the file to. The default class that the server selects if fstore doesn't specify one is "normal". Classes that start with a capital X will be sent offsite after the tapes are written.

-TO newlfn

The -TO option specifies the new file name for a COPY or DCOPY request. If not specified, the new file name will be the old filename with a capital A appended.

-FORMAT "string"

The fstore LIST command will send an optional format string to the daemon to tell it how file entries should be formatted. The format string uses a bastardized printf syntax with different letters for the various components of the data from the BACKUP service's database. The default format string is "%-5.5S %9.2s %-30L %C\n". The special characters that can be used are:

        %C      String          Comment associated with file
        %L      String          Filename ("LFN") associated with file
        %P      String          Password of file (oops)
        %S      String          Status of file
        %T      String          Tape list
        %p      Integer         PID of process locking file
        %s      Float           Size of file in megabytes (1024*1024 bytes)
        %t      String          Time of lock
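
To show how such a format string might be expanded, here is a sketch that maps each special letter onto an ordinary printf conversion and lets Python's % operator do the padding. The daemon's own formatter is not shown in this document, so the mapping (strings to %s, the PID to %d, the size to %f) and the function name are assumptions.

```python
import re

# Optional "-", width and ".precision", followed by one of the letters above.
SPEC = re.compile(r"%(-?\d*(?:\.\d+)?)([CLPSTpst])")

# Assumed printf conversion for each letter, following the type column.
KINDS = {"C": "s", "L": "s", "P": "s", "S": "s", "T": "s",
         "p": "d", "s": "f", "t": "s"}


def render(fmt, entry):
    """Expand `fmt` using the values in the dict `entry`, keyed by letter."""
    def expand(match):
        flags, letter = match.groups()
        return ("%" + flags + KINDS[letter]) % entry[letter]
    return SPEC.sub(expand, fmt)
```

With the default format, an entry with status "C", size 12.5 and a 24 character file name comes out as the status padded to 5 columns, the size right-justified as "12.50" in 9 columns, the name padded to 30 columns and then the comment.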

The status of a file is listed out as up to 5 characters meaning:

        C       File has been committed.
        d       File is a scheduled delayed copy.
        D       File is a committed delayed copy (copied but not flushed).
        p       File is written, but not committed.
        i       File is created but contains no data.

        e       File is erased (shouldn't show up on a list).

        O       File is on tapes which are out of the silo.
        G       File is on tapes which are offsite.
        g       File is scheduled to be sent offsite.

        X       File eject has been requested.

        *       File is currently locked.

The fstore client program connects to the daemon process via TCP/IP and sends it a newline terminated string that looks like:

	command filename password [class]

followed by a newline terminated string that looks like:

	comment actual_comment
or
	nocomment

followed by an optional:

	format actual_format
or
	noformat

in the case of a LIST command or:

        newlfn new_filename

in the case of the COPY or DCOPY commands. In the case of a DUMP request, the daemon responds with:

	use port integer

The client connects to this port and starts sending its data to it. In the case of a RESTORE request, the daemon responds with:

        waiting for data port

The client then creates a socket and binds it to a port and sends the daemon:

	use port integer

and then waits for the data to appear on the specified port.
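
The opening lines that the client sends can be put together as in the sketch below. It only covers the request text, not the data port handshake. Several details are assumptions: that command words go over the wire in lower case, that the client (rather than the daemon) fills in the default new name for a copy, and all of the function and parameter names.

```python
def build_request(command, filename, password, tape_class=None,
                  comment=None, fmt=None, newlfn=None):
    """Return the newline terminated request text for one fstore command."""
    first = [command, filename, password]
    if tape_class:
        first.append(tape_class)  # the optional [class] field
    lines = [" ".join(first)]
    lines.append("comment " + comment if comment else "nocomment")
    if command == "list":
        # A format string must not contain a real newline on the wire.
        lines.append("format " + fmt if fmt else "noformat")
    elif command in ("copy", "dcopy"):
        # Default new name: old name plus a capital A (applied here by
        # assumption; the daemon may well do this itself).
        lines.append("newlfn " + (newlfn or filename + "A"))
    return "".join(line + "\n" for line in lines)
```

For instance, a LIST with no comment and no format yields three lines: the command line, "nocomment" and "noformat".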

2) Backupd, the daemon

The backupd daemon resides as /usr/local/etc/backupd on nineveh and runs out of the /usr/local/backup directory. Contained in this directory are a number of interesting files:

    lfn.dat	The database holding file information.
    tape.dat	The database holding tape information.
    syslog	Any interesting things the daemon may have to say.

The databases are binary files with each record being the structure found in the lfn.h or tape.h files. DBM databases, lfn_index and tape_index, are used to index into these databases. Each morning at 10am a cron process copies the databases and cuts a new syslog file. The file xxxx.0 is yesterday's copy, xxxx.1 the day before's, and so on to xxxx.7, which is last week's.
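
The daily rotation of the database copies could look something like the sketch below. The actual cron job is not shown in this document, so this is just one plausible way to keep xxxx.0 through xxxx.7; the function name is made up, and it covers only the "copy the databases" half (cutting a new syslog would move the file rather than copy it).

```python
import os
import shutil


def rotate(path, generations=8):
    """Shift path.0..path.6 up by one and copy path to path.0."""
    for n in range(generations - 1, 0, -1):
        newer = "%s.%d" % (path, n - 1)
        older = "%s.%d" % (path, n)
        if os.path.exists(newer):
            os.replace(newer, older)  # the oldest copy is overwritten
    shutil.copy2(path, path + ".0")
```

Run once a day on lfn.dat and tape.dat, this keeps eight generations and silently drops the ninth.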

The daemon sits in its main loop doing a select, waiting for a connection from a client. The requests are processed by a monolithic main program. The simple requests (commit, erase, eject and return), which don't require any intercommunication between the daemon and the client, are processed directly by the main daemon. For the requests that require handshaking between the client and the daemon (dump, restore and list), the daemon forks off a child process to do the work while the main daemon waits for additional requests. The main daemon keeps a pipe open to each of its child processes. The children continuously give the main daemon progress reports via this pipe, and these reports are fed to Argus when it queries the daemon.

The tapes are structured as a number of 20480 byte blocks. The lfn.dat information lists what tapes in what sequence are used to hold a file. The tape.dat information lists up to 10 files on each tape as a number of blocks with a count of the number of used bytes in the last block. An EOF is written after each file to allow skipping forward on a tape to find the Nth file on a tape.
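
The block bookkeeping works out as in this small sketch, which turns a file size into a block count plus the number of bytes used in the last block. The function name is invented, and whether the daemon records a completely full last block as 20480 or as 0 is not stated here; the sketch uses 20480.

```python
BLOCK = 20480  # tape block size in bytes, from the text


def tape_extent(nbytes):
    """Return (blocks, bytes used in the last block) for a file of nbytes."""
    if nbytes == 0:
        return 0, 0
    blocks = -(-nbytes // BLOCK)  # ceiling division
    return blocks, nbytes - (blocks - 1) * BLOCK
```

A 50000 byte file, for example, takes three blocks with 9040 bytes used in the third.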

When a dump request starts, the daemon first creates an entry in the LFN database and then allows the transfer to start. When the daemon is ready to transfer data to tape, it selects the next tape to write to by hunting through its database for a tape with a class matching the one specified on the dump request and a reasonable amount of free space. If none is found, a second pass is made looking for an unused tape. Once it selects a tape it sets the lock flag for that tape in the tape database. It then skips forward on the tape if required and starts writing until it encounters an error such as END OF TAPE or it runs out of data from the client. When it is all done it refetches the tape entry from the database (which may have been updated by an intervening erase or commit request), adds information for the current file and then rewrites the entry into the database, unlocking it. A locked tape will prevent other writes to the physical tape, but not other operations. A restore request will hang waiting for the tape, as the silo will respond with a "volume in use" error if the tape is mounted on another drive. Each update to the databases is made atomic by flocking the .dat file on the open.
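
The two-pass tape selection can be sketched as follows. This is a simplified in-memory model, not the daemon's code: the Tape record, the MIN_FREE threshold standing in for "a reasonable amount of free space", and the idea that an unused tape takes on the requested class are all assumptions. The 10 file limit and the rule that bad tapes are never selected come from elsewhere in this document.

```python
from dataclasses import dataclass

MIN_FREE = 100 * 20480  # hypothetical "reasonable amount of free space"
MAX_FILES = 10          # tape.dat lists up to 10 files per tape


@dataclass
class Tape:
    vsn: str
    tape_class: str
    free: int            # bytes still free on the tape
    nfiles: int = 0
    locked: bool = False
    bad: bool = False


def pick_tape(tapes, tape_class="normal"):
    """Pick and lock a tape for writing, or return None if there is none."""
    usable = [t for t in tapes if not t.locked and not t.bad]
    # First pass: a tape of the requested class with room left on it.
    for t in usable:
        if (t.tape_class == tape_class and t.nfiles < MAX_FILES
                and t.free >= MIN_FREE):
            t.locked = True
            return t
    # Second pass: any unused tape (assumed to take on the requested class).
    for t in usable:
        if t.nfiles == 0:
            t.tape_class, t.locked = tape_class, True
            return t
    return None
```

In the real daemon the lock flag lives in tape.dat and is set under an flock on that file; here it is just a field on the record.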

The daemon communicates with the silo via the (mumble,mumble) interface. Interfaces to a library of routines can be found in util.c if you're really interested. The library performs some magic to communicate with the Sun, cajamarca.adp, who then tells the silo what to do via its own hardware interface. If there are problems with this communication, there is a program called "stkjig" that can be built from the sources on nineveh to attempt the various calls that the daemon will attempt:

     nineveh> cd /tulsa/obj/aix/fstore
     nineveh> ./stkjig
     Usage: ./stkjig vsn drive# command
     Commands are: mount, dismount, eject, query
     Drive # is an integer from 0 to 4:
       0: /dev/rmt3 (0,2,10,3)
       1: /dev/rmt4 (0,2,3,0)
       2: /dev/rmt5 (0,2,3,1)
       3: /dev/rmt6 (0,2,3,2)
       4: /dev/rmt7 (0,2,3,3)
     nineveh> ./stkjig 380005 0 query 
     Status of tape is 90: volume home.
     nineveh> ./stkjig 380173 0 query
     Status of tape is 92: volume in transit.

You should do an flock on /dev/rmtX.lock before playing with a drive if you don't want to confuse the daemon who may be playing with the same drive. The drive number is ignored on a query and the vsn is ignored on a dismount. You can also log into cajamarca directly (details left out intentionally) to determine where the problems may be.
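
Taking the drive lock before poking at a drive might look like the sketch below. It assumes the daemon uses ordinary flock(2) advisory locks on the /dev/rmtX.lock file, as the paragraph above implies; the helper name is made up. Note that opening in append mode would create the lock file if it did not already exist.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def drive_lock(lock_path):
    """Hold an exclusive flock on lock_path for the duration of the block."""
    with open(lock_path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until any other holder lets go
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The stkjig calls for a given drive would then go inside a "with drive_lock(...)" block naming that drive's lock file.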

The tapes that the BACKUP service uses are VSNs in the 38xxxx range.

3) Backupmgr, the database tweaker.

The backup manager utility can be used to perform various operations on the database that are not necessary to do from a remote system via the client/server approach. The backupmgr utility exists on nineveh as /usr/local/etc/backupmgr. The easiest way to invoke the utility, though, is via the Argus display by clicking in the main title bar. Some of the functions of the backup manager are intended for operations use in their daily function such as ejecting and replacing tapes. Others are there to fix problems that are encountered and, if used improperly, can destroy the databases.

There is an internal help command for the backup manager that will give you content-free information about some of the commands therein. This situation is liable to change before operations are given the go-ahead to start pounding away on it. The program needs a lot of work, which is why I haven't done much to document how it looks now. Some of the more interesting commands are:

    eject         To eject all tapes pending ejection.
    find          To hunt for tapes that have been returned.
    return        To list all tapes that should be returned.

    bad           To list all "bad" tapes.
    eject vsn     To eject a specific tape.
    files         To list information out about a file.
    good          To mark all free bad tapes as "good".
    locks         To list locked files and/or tapes.
    patch         To willy-nilly update fields in an lfn entry.
    rekey         To rebuild the lfn_index and tape_index DBM databases.
    report        To generate a report of usage by system.
    stats         To list status information on the tape situation.
    tape          To list information out about a tape.
    unlock lfn    To unlock a file.
    unlock tape   To unlock a tape.
    verify        To verify consistency between the databases.

A bad tape is one that the daemon had problems writing to. In all the cases so far, this has been due to confusion on the part of the daemon or hardware problems with the drive and not a problem with a tape. The "Media surface error" when the controller was unplugged was a cute one. The daemon now automatically recovers from "Drive busy" errors that happen constantly as long as multiple drives on the one controller are being used. A tape that the daemon had problems writing to cannot be reused until all files have been erased though. If the daemon fails to write an EOF on the tape after the last file, the next skip will go off into la-la land. Tapes marked as bad will not be selected to be written on by the daemon.