MANAGING THE UNMANAGEABLE: The University of Washington Approach

Terry Gray and Brad Greer

TABLE OF CONTENTS

 EXECUTIVE SUMMARY
 1. BACKGROUND
    1.1 The Problem
    1.2 The University of Washington
    1.3 Computing & Communications Philosophy
    1.4 The Changing Personal Computing Environment
 2. ARCHITECTURE
    2.1 System Goals; User View
    2.2 System Goals; Manager View
    2.3 Design Principles
    2.4 Distributed System Elements
    2.5 A Typical Computing Cluster
    2.6 Email
    2.7 Reference System Concepts
 3. IMPLEMENTATION
    The Reference System in detail
    Unix/X clusters
    PC clusters
    PC Hardware Assumptions
    System Limitations
 4. CONCLUSIONS

-----------------------------------

EXECUTIVE SUMMARY

The University of Washington in Seattle has a large and diverse population of computers. As the UW community grew increasingly dependent on both local and worldwide information resources, integration of personal computers with the Internet and Unix-based campus servers became a critical goal. Experience with the management of computing clusters composed of multiple Unix machines provided insights into how a huge population of PCs could be integrated and managed. A key component of the architecture is the "Reference System", which maintains the primary copy of software and configuration files, and from which individual PCs may be updated. Discussion of goals and UW's overall network computing environment is included, as well as details of the Reference System implementation and its application to PCs.

1. BACKGROUND

This section will attempt to lay the groundwork for UW's approach to PC-Unix integration by outlining the problem, describing the University and the central organization responsible for computing, and, finally, the changes in personal computing that drive both the problem and the solution space.

1.1 The Problem

Like most large institutions, the University of Washington has a heterogeneous computing environment, including all four basic food groups of personal/desktop computing devices: Macs, PCs, Unix workstations, and X terminals. For campus-wide computing systems, Unix is the predominant platform --for both interactive and non-interactive services. The problem of integrating computing and information services across dissimilar platforms is the general issue; in this case study we describe specifically the approach UW has used to integrate desktop PCs into the world-wide Internet information infrastructure as well as a campus computing environment laden with Unix-based servers.

1.2 The University of Washington

The University of Washington (UW) is located in Seattle, Washington, and was founded in 1861. It is the oldest institution of higher education on the West Coast of the U.S. and the preeminent research university north of Berkeley (CA) and west of the Mississippi River. UW serves not only Seattle, but the entire Pacific Northwest region of the United States through distance learning programs, inter-library loans, and network-based information resources.

The principal campus of UW is in the city of Seattle; two relatively new branch campuses have recently been established in the neighboring cities of Bothell and Tacoma. There are also two hospitals affiliated with UW: one on the Seattle campus (University Hospital), and one in downtown Seattle (Harborview Medical Center). On the main campus, approximately 50,000 people work and study in several hundred buildings enclosing more than 13 million square feet of space. Underground, there are 7.5 miles of utility tunnels which greatly ease the problem of providing connectivity to the buildings.
Unfortunately, many of the buildings are very old, with totally inadequate communications infrastructure or, worse, asbestos insulation --which is now considered to be a serious health risk if disturbed by installation crews.

Currently, there are well over 15,000 machines on the campus network. More than 6,000 are PCs; another 4,000 are Macs. There are around 2,000 X terminals and about 2,000 Unix machines. For central services, we make heavy use of Unix machines because of the open development environment they offer, and their superior network connectivity tools and applications. Hence the challenge of integrating many thousands of PCs (and Macs) into a predominantly Unix-based server landscape.

1.3 Computing & Communications Philosophy

1.3.1 Organization

UW's Office of Computing & Communications (C&C) is the organization responsible for central computing, networking, telecommunications, and instructional media services, including television production and cablecasting. There is a governance structure in the form of the University Advisory Committee on Computing and Technology (UACAT). Together, UACAT and C&C have developed the policies which shape UW's technology landscape. Some of these guiding principles are discussed below.

1.3.2 Uniform Access entitlements

It is the policy of C&C that all faculty, staff, and students at UW are entitled to computer accounts. C&C operates a set of computers that are known, collectively, as the "Uniform Access" machines, since they provide timesharing services available to the entire campus community. Currently there are approximately 33,000 people with accounts on the campus-wide computers. Each has a basic entitlement of disk and CPU resources, and there is a way to obtain additional resources for special needs.

1.3.3 Evolution from number-crunching to information services

In recent years, there has been a pronounced shift in the nature of our computer use. Previously, academic computers were used primarily for scientific data acquisition and reduction, and later, program development. Now, the principal use is communication and information retrieval. Electronic mail has become such an integral part of the University workplace and educational process that failure of email systems nearly brings the institution to a halt. Similarly, online access to information resources --both local and world-wide-- is no longer a luxury; it is a necessity. Increasingly, the information needed to function effectively is found outside one's department, so high-availability, high-performance access to world-wide networked information resources --as well as local ones-- is absolutely critical.

1.3.4 Internet orientation

The computing and communication infrastructure at UW is based on Internet standards. To achieve our goal of world-wide information access, it is imperative that virtually all of the systems at UW --personal and most shared hosts-- have full connectivity to the Internet. This policy has wide-ranging implications, about which more will be said in due course. Our intent is to continue to track and deploy Internet standards, including the next generation of IP, which is currently under discussion within the Internet Engineering Task Force.

1.3.5 Pure IP network backbone

The UW network consists of several hundred Ethernet segments linked by IP routers. The only network protocol supported on the backbone is IP. Contrary to popular belief, this is a feature, not a bug. Heterogeneity always costs you more than you think it will.
Not only does each protocol family have its own overhead and support costs, but problems with one protocol can sometimes affect the others, reducing overall system availability.

The arguments for running a multi-protocol network are diminishing. It is clear that support of the Internet (TCP/IP) protocol suite is *necessary* for interoperability with the world's largest and fastest growing information infrastructure... the only question is whether or not TCP/IP support is *sufficient*. Our answer is "Yes", firstly because there are alternative ways of supporting the systems requiring proprietary protocols (e.g. tunneling the proprietary protocol within IP packets), and secondly because even the most recalcitrant computer system manufacturers (names withheld to protect the guilty) have finally figured out that it is important for them to make their network services operate over TCP/IP connections.

By holding the line on an IP-only backbone, and using tunneling as an interim strategy to accommodate proprietary systems, it will be possible to converge on a predominantly homogeneous TCP/IP environment. Support for multiple protocols on the campus backbone, in contrast, would guarantee that there would *always* be multiple protocols on the backbone. Tunneling shifts some support costs to departments, but it helps contain central support costs, and more importantly, reduces the probability of multiple protocols interfering in the communications equipment, potentially causing widespread outages. An example, which is not hypothetical, concerns a commercial router whose code for one particular proprietary protocol had a memory leak that would eventually cause the entire router to crash.

Finally, the advent of high-bandwidth graphical applications provides another very strong reason to operate a single-protocol backbone. The technology to do resource reservation on a single-protocol IP network is just now being deployed; the prospects for managing extreme bandwidth demands across a set of different protocols sharing a single communication channel are slim indeed. For example, there would be nothing to prevent a video conferencing application using IPX from consuming the entire channel capacity, thus bringing IP applications to their knees. Technology such as Asynchronous Transfer Mode (ATM) switching can provide distinct channels for different classes of service, but multiple protocols sharing a single channel are destined to be a significant resource management headache, as more demanding applications begin to compete for available bandwidth.

The good news is that most vendors really have gotten the message about the importance of converging on TCP/IP protocols. Microsoft and Apple now essentially bundle TCP/IP support with their operating systems, and Novell now offers TCP/IP as an alternative (albeit at extra cost) to their own IPX protocol suite. Apple has even promised to have their file and printer sharing protocols running over TCP/IP by the time this book is published. So perhaps protocol convergence is finally at hand, but in any case we are convinced that UW's IP-only policy will be completely vindicated in the fullness of time.
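To make the tunneling strategy mentioned above concrete, here is a minimal sketch (Python, purely illustrative): a frame belonging to a proprietary protocol is carried as the payload of an ordinary UDP/IP datagram between two tunnel endpoints, so the routers in between see nothing but IP. The header format and tag value are our own invention; real deployments would typically use an encapsulation standard such as GRE and would re-inject the unwrapped frame onto the far LAN segment.

    # Minimal sketch of protocol tunneling: a proprietary-protocol frame
    # (e.g. IPX) rides inside a UDP/IP datagram so the backbone sees only IP.
    # The 4-byte header and MAGIC value are hypothetical.
    import socket
    import struct

    MAGIC = 0x1234   # hypothetical tag identifying tunneled frames

    def wrap(frame: bytes) -> bytes:
        """Prepend a small header so the far end can recognize the payload."""
        return struct.pack("!HH", MAGIC, len(frame)) + frame

    def unwrap(datagram: bytes) -> bytes:
        """Validate the header and return the original proprietary frame."""
        magic, length = struct.unpack("!HH", datagram[:4])
        if magic != MAGIC or length != len(datagram) - 4:
            raise ValueError("not a tunneled frame")
        return datagram[4:]

    if __name__ == "__main__":
        # Round trip over the loopback interface: what the backbone would
        # actually carry is an ordinary UDP/IP datagram.
        receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        receiver.bind(("127.0.0.1", 0))
        sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        fake_ipx_frame = b"\xff\xff" + b"payload of a non-IP protocol"
        sender.sendto(wrap(fake_ipx_frame), receiver.getsockname())
        datagram, _ = receiver.recvfrom(2048)
        assert unwrap(datagram) == fake_ipx_frame
        print("recovered %d-byte proprietary frame intact" % len(fake_ipx_frame))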
1.3.6 Role of timesharing

Is timesharing dead? Yes and no. Using personal computers as dumb terminals to connect to timesharing systems can hardly be considered the best use of either class of resource. Who would argue that interactive processing should not be done as close to the user as possible? On the other hand, providing advanced information services to users of disparate computers, ranging from high-end workstations directly connected to 100Mbps LANs to now-archaic 68000 or 80286-based personal computers connected via low-speed links, is a non-trivial problem.

UW's strategy has been to first deploy information services that can be accessed from anything and anywhere, then later deploy tools for specific platforms. As a result, the majority of our constituency use central computing resources --principally for email. However, the tools and technologies needed to support a well-integrated client-server network computing infrastructure are finally close at hand, and we expect the trend to shift from interactive timesharing accounts to central accounts used primarily for mail servers or perhaps (in the future) institutional file servers. The platform-specific tools we expect to displace the "lowest common denominator" tools on interactive timesharing machines are typically "clients" for information and communication servers both within UW and the rest of the Internet. By basing our own standards on those used throughout the Internet, a single solution can be used in both contexts.

1.3.7 Role of client-server computing

Even if timesharing *is* dead, or at least mortally wounded, resource sharing is not. In fact, not only is remote resource sharing alive and well, it is the cornerstone of inter-personal computing, the follow-on to the personal computing revolution. Given that a contemporary application will have at least the display code running on the computer in front of the user, remote resource sharing implies client-server computing. Said differently: if we assume that it is best to run interactive applications on the personal computer, then what is the proper role of a remote "server" computer? Possible answers include:

 -sharing information
 -sharing (expensive) hardware
 -sharing (expensive) software
 -sharing operational support

The above list has to do with sharing resources among multiple users. There are also situations where using a remote computer as a server machine may make sense independent of whether the remote system is shared or not. For example, personal computers are notoriously bad candidates for email destination machines because they aren't always turned on, and they are often not backed up regularly. Delivering mail to an "always up" host cared for by an operations staff makes more sense. Because of the incremental cost, such a mail server will almost always be shared across a group of users. However, even once mail has been delivered, it may make sense to keep the mail stored on a server machine rather than on the personal computer. Again this has to do with the capability of the desktop computer, the nature of its network connectivity, and whether it can adequately take on the role of an always-up data server --essential for when the user needs to access the stored messages from a different computer.

Interactive applications can be modeled as having three functional elements:

 -user interface
 -application algorithms
 -data access

In a client-server situation, one or more of these functions occurs on the personal computer, and one or more occurs on a different computer.
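This decomposition can be made concrete with a small sketch (Python, illustrative only; the function names and the "message folder" scenario are hypothetical, not part of the UW architecture). The same three elements appear in any interactive application; where the line is drawn between the two machines is what defines a particular client-server design.

    # Three functional elements of an interactive application, written as
    # separate functions so the client/server split point is explicit.
    import os
    import tempfile

    def data_access(folder_path):
        """Data access: read raw records (could run on a file or mail server)."""
        with open(folder_path, encoding="utf-8") as f:
            return f.read().split("\n\n")        # one record per blank-line block

    def application_logic(records, keyword):
        """Application algorithms: filter/summarize (could run on either side)."""
        return [r.splitlines()[0] for r in records if keyword.lower() in r.lower()]

    def user_interface(summaries):
        """User interface: always runs on the machine in front of the user."""
        for line in summaries:
            print(" *", line)

    if __name__ == "__main__":
        # Split 1: only data_access is remote (generic file service, e.g. NFS).
        # Split 2: data_access + application_logic are remote (application-
        #          specific protocol, e.g. IMAP); only summaries cross the net.
        demo = "From: alice\nbudget meeting\n\nFrom: bob\nlunch?"
        path = os.path.join(tempfile.mkdtemp(), "folder.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(demo)
        user_interface(application_logic(data_access(path), "budget"))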
There is a spectrum of client-server architectural choices: at one end, the personal (client) machine does everything except hold the data, which is stored on a remote file server and is accessed by a generic file access protocol; at the other end of the spectrum, the personal CPU handles nothing but display chores, while everything else occurs on a different machine. Most client-server implementations fall in between, with a "protocol" specifying the set of operations and responses between the client and server. In such cases, some application processing is done locally, some remotely. Having the remote server handle data access in an application-specific way may reduce the amount of information that must be transferred across the net (when compared to a generic file access protocol); an application-specific protocol may also allow certain functions to be done on the server which are inconvenient or inefficient to do on the personal/desktop computer.

1.3.8 Interoperability via Standards

We define interoperability as the ability to exchange information among dissimilar types of systems just as easily as if all the systems were identical. Interoperability is a fundamental objective of our network computing environment. There are two ways to achieve interoperability: by corresponding elements of a system operating in accordance with a single specification --a standard-- or by those elements being able to understand multiple specifications. In practice, a standard is a specification that is widely used, whether it be formal or informal, prescribed or de facto. The standards-based approach is generally preferable to supporting multiple specifications because it keeps complexity down, and therefore reduces initial and recurring costs. However, there are multiple levels of technology, and the importance of standards, as a method for achieving interoperability, depends on the level of technology. To explain that statement, here are three examples representing different levels of technology:

a. Consider a word processor. If everyone involved in document preparation and sharing uses the same word processor, there is a de facto standard, and interoperability is achieved. Similarly, if all word-processor vendors agree to support a common document format, interoperability is again achieved via a standard. However, many word processors have the ability to read file formats of competing word processors, so it may be possible to achieve interoperability without a single standard; that is, without homogeneity.

b. Consider LAN technology. It is possible for some computers in an organization using Ethernet network technology to be completely interoperable, in the information sharing sense, with other computers that may be connected to a token ring network, provided that there are suitable communication devices linking the two LANs. Thus, at the link layer of the technology strata, commonality is not a prerequisite for interoperability, though there are economic reasons to use a single technology.

c. Consider network transport protocols. Unlike the link layer, diversity at the network layer can be fatal to interoperability. This is because transport layer semantics, e.g. addressing, are visible and used at the application layer, so general purpose and transparent transport gateways do not exist (whereas a router will provide a more-or-less transparent link between dissimilar LAN link-layer technologies).
While it is possible to build *application-specific* gateways that span multiple transport protocols, these tend to be a source of operational headaches and are by no means a general solution. Thus, at the transport level of technology it is most crucial to have a single standard. As argued previously, a single network protocol provides a common foundation upon which to build an information infrastructure, and is enormously important in facilitating interoperability in our network computing architecture. Although we can tolerate some diversity at the lower levels of technology, e.g. LAN protocols, and also at higher levels, e.g. applications, while still achieving interoperability, our operational costs are related to the amount of heterogeneity in the system, so our policy is to use standards wherever there is a clear idea of which standard to pick, not just at the network transport layer. The family of Internet standards has served us well, and we intend to continue down this road.

1.3.9 High-availability design

After interoperability, it would be difficult to think of a design objective more important than high availability. Given a rich information infrastructure to work in, people come to depend on it in a big way, and become downright cranky when it isn't working properly. Consequently, system availability has been a fundamental consideration in our design. A corollary objective is worth mentioning: while we certainly seek to reduce the number of user-visible outages in the system, it is also a goal to reduce the scope of any outages that do occur. In other words, if we have to have a plane crash, we'd rather it be a Cessna than a 747.

Two key methodologies are used in the pursuit of high availability: redundancy and functional separation. We cannot afford a totally redundant communication infrastructure, i.e. two paths to everywhere, but all of the networking elements that *everyone* depends upon, e.g. Domain Name Servers and certain key routers, are replicated in geographically diverse locations. Likewise, critical information resources are replicated.

Functional separation is a less common design principle. This has to do with dedicating hardware to specific functions, rather than multiplexing many functions on the same general-purpose computer. The goal is to minimize the likelihood that a malfunction in one service will inadvertently take down an unrelated service that might be sharing the same platform. An example of a scenario we seek to avoid is having incoming mail cause a root disk partition to fill up with the side effect that Domain Name Service fails. "Good fences make good neighbors." This strategy does not mean using special-purpose hardware if there is a reasonable alternative. Hardware platforms are not immune to the law of all species: adapt or die. The idea is to use general purpose hardware for maximum management flexibility, but to configure systems to do a single function in order to maximize availability.

1.3.10 Access from anywhere

In a perfect world, one could access one's data from anywhere, using any type of computer. That is, access to both personal data and the world's information resources should not be limited to the personal computer in the office. Increasingly, access from home, or from a laptop while in a hotel, is essential. Also in a perfect world, one would be able to use the same applications to access and manipulate that information, regardless of one's location or what type of computer was currently being used.
In order to achieve these goals, it is at least necessary to have pervasive deployment of TCP/IP communication protocols, even via dialup links. Only recently has the software needed to do this become readily available.

1.3.11 Character-based and GUI apps

We must support a diversity of computing platforms and access paths. While Graphical User Interface (GUI) applications are generally preferred, a particular application may not be available for all platforms, and we still have many users of DOS. It is also necessary to support access to key information resources (e.g. mail) via async dialup and character-oriented network connections (e.g. Telnet). While the trend toward GUI client-server applications running over TCP/IP connections is strong, the lowest common denominator is still a VT100 character-based application.

1.3.12 Security

Security of information resources has always been an important goal, but one downside of the Internet's incredible growth is that the information highway has more jerks driving on it now than it used to, so security has become even more critical. Our view is that security is primarily a host problem rather than a network problem. That is, we do not attempt to operate security firewalls at the network boundaries of the campus. The reason is that these firewalls tend to be application specific, and often reduce convenience. Instead, we encourage good passwords, and for critical systems we insist on the use of one-time passwords. In addition, we are in the planning stages of deploying a distributed authentication system and privacy-enhanced mail, both based on cryptographic technology.

1.3.13 Division of labor

The question of which part of an organization controls which elements of the distributed system has both technical and non-technical aspects. The non-technical ones have to do with organizational responsiveness to client needs and the amount of sharing permitted for a particular resource. For example, who owns/controls/supports the data on a server? And who gets to use that data? The technical aspect has to do with performance: who, besides you, can influence how quickly the system responds to your requests? The answer is certainly a function of resource sharing, since the key to high performance (and world peace, for that matter) is reducing contention for shared resources. Obviously a resource dedicated to you will perform better than the same resource shared by many people.

There are several different places a particular service could be offered in a large organization, ranging from the desktop to the central services supporting the entire organization. At UW we believe that each level of the organizational hierarchy may have a legitimate claim on providing certain classes of computational services. Our view of central services is that they fall into three categories:

 1. "Natural monopoly" services such as the network backbone, where it would be both uneconomic and dysfunctional for individual units to build their own network backbones.
 2. Services that can be offered at lower cost if centralized.
 3. Services for those who cannot afford --or do not wish to be bothered with-- providing their own computing services.

Only in the case of network infrastructure does the central organization claim a monopoly, and even then there are a few exceptions. For computing services, a few departments are completely self-sufficient but most rely on central services at least partially.
There are currently 45,000 accounts on central computers, representing about 35,000 distinct individuals. The central cluster intended primarily for email support has over 23,000 accounts.

The general question of central vs. departmental computing becomes more complex in a client-server network computing environment. Given the desire to run most interactive applications on the desktop, the division of labor question becomes primarily one of data servers: who operates what classes of data server? Typical kinds of servers include file, print, email, news, and general information. Places these might reside include:

 o campus-wide servers
 o departmental servers
 o workgroup servers
 o personal/desktop computers

Note that there is also a need for large-scale computational servers, but with the audience for information services growing much more rapidly than the audience for number-crunching, we will focus more on the former in this discussion.

One of the key virtues of personal computers is that they offer the user a degree of autonomy... control over their own computing environment. Likewise, a principal motivation for departments to provide their own computing services is so that they have control over the resources, in terms of features, operations, responsiveness to problems and changing needs, etc. The same arguments apply to departmental vs. central computing. Departments may opt to provide their own computing resources whenever the central systems are "inadequate", and local autonomy vs. the central organization's ability to respond to changing departmental needs is often a key ingredient in that decision.

1.4 The Changing Personal Computing Environment

Ultimately, the "View from the Desktop" is the only one that matters. That is, the services available to the end-user on their preferred computing platform are what this business is all about. In this section we'll review the kinds of desktop computers we support, the key applications our information-oriented user community wants, how those applications relate to the network computing environment, and how the desktop environment has evolved.

1.4.1 Types of Desktop/Personal Computers

First, a clarification on terminology. We define "personal" computers as those devoted to a single user. A "desktop" computer is a personal computing device that fits on one's desk, as opposed to, for example, a Cray supercomputer dedicated to the exclusive use of a single individual. In a lab situation, personal/desktop machines are serially reused, but at any instant they are designed to serve one and only one individual. A desktop computer is not always "personal" and a "personal" computer is not always "desktop" sized. But most of the time, the terms refer to the same class of device and are used somewhat interchangeably. It's understood that laptops and home PCs have made the term "desktop" too limiting.

As noted previously, UW has all four basic food groups of desktop computing in abundance:

 -PCs (using both DOS and MS Windows)
 -Macs
 -Unix workstations
 -X terminals

Clearly the network computing architecture must accommodate all of the above. How easy or difficult that is depends primarily on the vendor of the operating system software that comes with the computer. In the past some vendors made it very difficult indeed to integrate their products into a multi-vendor network computing environment, but the picture is improving.
1.4.2 Key Applications

In thinking about integrating desktop machines into a global information infrastructure, it is useful to identify both the key applications and the network services that must be supported. Representative examples of applications needed by the new generation of computer user (as opposed to the number-crunchers) include:

 o Messaging (email and bulletin boards)
 o Information retrieval (ftp, gopher, world-wide-web)
 o Word processing
 o Spreadsheets
 o Presentation graphics
 o Scheduling
 o Project management
 o Software development and authoring tools

Some of these are inherently network-based applications (e.g. email) while others may rely on the network and remote servers without the user even knowing it, as a function of the specific distributed system architecture used.

1.4.3 Distributed services

In addition to the inherently network-based applications, the system must support a variety of "behind the scenes" distributed services such as:

 o file/print sharing
 o file backup and archiving
 o management/configuration

The more independent the desktop computer is of network services, the greater the personal autonomy for the end-user, but this may be at the expense of functionality such as file sharing and backup.

1.4.4 Desktop Technology Evolution

o X vs. native applications

Some years ago, when it was already clear that the next-generation applications would have graphical user interfaces (GUIs), a decision had to be made concerning which types of GUI should be supported. It was difficult to justify developing applications for all three (X Windows, MS Windows, and Macintosh), so we settled on X as the target for advanced applications, even while recognizing that character-based applications must be supported indefinitely. The attraction of X was that it was the only GUI that could be supported on all four classes of desktop machines. That is, in addition to the native support for X on Unix workstations and X terminals, it was possible to buy "X server" software for PCs and Macs.

In retrospect, the decision was correct for the time it was made, but it proved not to be a panacea. Even though some of our X applications are used quite successfully from PCs and Macs, we discovered that there is still a certain amount of cognitive dissonance when a PC or Mac user reaches for the third mouse button, commonly used in X applications, or when the window manager functions differ from those of the native GUI. At this point it is clear that a suite of native MS Windows applications will be needed, as this is the largest and fastest growing segment of our desktop population. No decision has been made on whether to also develop native Mac applications, or to rely on Apple's commitment to support Windows applications via emulation.

o Changing the game

One Achilles heel of PCs has been the operating system software. Lack of a true multitasking kernel with reasonable memory management has caused endless grief for developers and end-users alike. In addition, high-quality, high-resolution (over 1000 by 1000 picture element) graphical displays were the exclusive province of Unix workstations and X terminals until recently. But PC hardware is improving, and a version of MS Windows that promises to address many of the traditional MS frustrations is on the way. With the price-performance of Intel-based PCs continuing to improve, it becomes increasingly difficult to justify X terminals on the basis of cost per seat.
Moreover, as X terminal product lines evolve, there is more model diversity and complexity to contend with on those platforms. Thus, even though manageability is perhaps the biggest Achilles heel of all for PCs, the prospect of tools to allow central management of PCs means the gap is closing.

o Public vs. personal machines

The desire to exploit the characteristic autonomy of PCs, especially with regard to independence from network resources, leads to the desire to use local storage. Having a hard disk means the PC can store programs locally, which improves performance and availability. However, having local state also means that central management is more challenging, and it can make security more difficult. Personal computers are used in two fundamentally different contexts: a) as machines dedicated to a single user, and b) in lab situations where machines are serially reused by large numbers of users. Our architecture must accommodate both scenarios.

o Security

In the past, the greatest targets of opportunity for computer-age criminals have been large timesharing machines. After all, the legitimate owner of a PC could barely get at her machine via the network, and once there, had a minimal set of exploitable resources... so why bother when much richer targets were so plentiful? However, as PCs become both more capable and also the principal computing platform for growing numbers of people, the risks are changing. That PCs were once almost entirely single-tasking devices with few network daemons running on them provided a degree of "security through incapability" that was reassuring. Now, however, we find people wanting to run all manner of network service daemons on their desktop machines (telnetd, ftpd, smtpd, imapd, gopherd, httpd, etc.). Ah, for the good old days! Combine this trend with advanced applications such as Mosaic, which can be configured to execute arbitrary programs on the desktop machine, and with the traditional vulnerability of desktop machines to computer viruses, and we have a brand new ballgame, threat-wise.

Security concerns are also exacerbated in lab situations where machines are used by many people, of varying honor. When a machine is dedicated to one user, perhaps in a lockable office, the security issues are slightly less alarming than when any lab user can potentially modify the system software on a PC's local hard disk.

o Configuration management

Next to security, by far the scariest aspect of supporting large numbers of personal computers (with local disks) is configuration management: making sure that they all have the correct set of applications, operating system files, and configuration profiles. This issue has traditionally argued for using diskless workstations or X terminals, but our experience with managing clusters of Unix timesharing systems led us to believe that the same techniques we developed for updating large collections of Unix systems could also be applied to PCs. This observation resulted in a project to adapt our "Reference System" technology to the desktop management challenge. It was this Reference System that provided the essential ingredient for managing the unmanageable...

2. ARCHITECTURE

Distributed system architecture has to do with arranging collections of computing hardware and software, all linked via a communication network, in such a way that a particular set of design goals is achieved. We begin this section with a brief discussion of our goals, from both the users' perspective and that of the system manager.
From there, general distributed system design principles are discussed, and the component elements of the system are described. This leads to an overview of a typical UW computing cluster. Finally, two particularly important aspects of the architecture are discussed in more detail: email and the "Reference System" concept.

2.1 System Goals; User View

The following list of goals is not intended to reveal any Great Hidden Truths... they should all be pretty obvious and non-controversial. Nevertheless, for the sake of completeness, the computing environment provided to the user should exhibit the following properties:

 o Function: It does something useful (e.g. provides desired applications).
 o Location-independent access: It doesn't matter where you are.
 o Platform-independent access: It doesn't matter which computer you use.
 o Simplicity: It must be really easy to use.
 o Dependability: It must be reliable, available, and work correctly.
 o Security: Access only by the authorized.
 o Performance: High.
 o Cost: Low.
 o Flexible/Adaptable
 o Autonomy

A note on location- and platform-independence. Location-independent or "remote" access to information has to do with the ability to reach information of interest from anywhere --independent of one's present physical or geographic location. Typically, this goal translates into being able to use dialup access, or network connections at remote sites. Platform-independent access to information is a corollary to location-independence. It means that one can access information from more than one kind of computer; perhaps even from lowest-common-denominator or "dumb" terminals.

The "Autonomy" goal may warrant special mention. Autonomy means the user feels --and in fact has-- significant control over their own computing environment. This is a fundamental characteristic of personal computers, and one that provided much of the fuel for the revolution. An example is being able to purchase an application program and install/use it without assistance --much less approval-- from support staff. Other examples of autonomy relate to performance, such as knowing that the interactive responsiveness of a PC program is not being degraded by hundreds of other users sharing a single CPU. (Of course, the overall performance of network applications often does depend on how many others are using a shared resource.) Note that the "autonomy" goal is often at odds with other goals, e.g. "security" and "dependability", as in the case of a user installing buggy or virus-infested software on their machine.

2.2 System Goals; Manager View

System managers generally share their users' goals (no system manager wants unhappy users!), but in addition they have some other goals that relate to how easy or hard it is to support the computing environment. These include:

 o Centralized system management
 o Centralized software management
 o Centralized file backup
 o Standardization (configs, apps, hardware)
 o Simplicity, explainability, and maintainability
 o Adaptable to changing needs, e.g. portable to other platforms
 o Scalable to many users

Sometimes the two sets of goals conflict. For example, a user's goal of autonomy may be at odds with the manager's desire to standardize on certain software in order to simplify support. The architecture should allow a wide range of possibilities, depending on how such conflicts are resolved within any given group.

2.3 Design Principles

In this section we discuss a series of distributed system design principles that have served us well.
These include:

 o Standard, media-independent Internet protocols
 o Single place to update common files
 o Each CPU has local copies of key executables
 o Network access to less frequently used programs
 o Scaling: dividing the load by population or by function?
 o Single-function servers
 o Integrity checking
 o Replicate servers for availability and scalability
 o Minimize size of "fault zones"
 o Application-specific protocols when appropriate

2.3.1 Standard, media-independent Internet protocols

This design principle has been covered sufficiently in the Background section. Suffice it to say here that use of Internet protocols on all media, including dialup, goes a long way toward achieving the goals of location- and platform-independent access to information.

2.3.2 Single place to update common files

A key ingredient in being able to centrally support a large collection of machines is having a single place to update common files. This principle is complicated by the fact that the definition of "common files" may vary from one group of users to another. Thus, a system for updating PC files must allow for different "equivalence classes" of machines.

2.3.3 Each CPU has local copies of key executables

One may ask: why not keep *all* files on a single file server? If this were done, then there would automatically be a single place to update common files. The principal arguments against using a file server for *all* files are performance and availability. If images of frequently used executables are stored --cached, if you will-- on the local hard disk, then the user sees faster startup times, and there is less load on the network and the file server, thus improving performance for other network or server-based activities. Availability is enhanced when a user can execute key applications even when the file server is unavailable. Of course, this assumes that the application is stand-alone, and can function without access to other network resources. Still, given that personal autonomy has always been the hallmark of personal computing, it seems reasonable that a PC user should always be able to do *some* work (e.g. begin writing a new document) even when the network or servers are misbehaving.

2.3.4 Network access to less frequently used programs

There is now a seemingly infinite number of applications. Storing and continually updating every last one on each PC disk is both infeasible and not required by our goals. Given that the overall performance and availability of network servers can be very good, provided that they are not bogged down continually serving baseline applications, it is sufficient for less-frequently-used applications to be maintained only on a file server.

2.3.5 Scaling: dividing the load by population or by function?

When a single multi-function machine becomes incapable of supporting all of the users using all of the functions on it, the offered load must be split across more than one machine. The question becomes: how should the load be divided? A typical answer is to divide the users of the system into two or more groups, and replicate the multi-function machine. Users would be vectored to their designated machine for service. An alternative approach is to divide the load by function. For example, if a server is providing both home-directory file service and incoming mail service for a group, then you might put the file service function on one machine, and incoming mail service on a different machine, with both machines serving the same (original) population of users. This approach has two potential advantages: first, each system can be "tuned" for optimum performance for a given service; and second, it obviates the need to implement a "mapping" mechanism for vectoring a particular user to their particular server. In practice, it may make sense to use both strategies; that is, divide the load both by function and, if that isn't enough, then further divide by user population. For example, one can split incoming mail service off onto a single machine in some cases, but for a very large group, multiple mail servers would be appropriate, in order both to achieve performance goals and to reduce the size of the population affected by an outage.
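As an illustration of the "vectoring" idea, here is a minimal sketch (Python; the host names are hypothetical) of one way to map users onto their designated server when a population is split across several equivalent machines. A stable hash keeps each user on the same server between sessions; a real deployment would more likely publish the mapping in a directory service or in DNS so that individual users can be moved deliberately.

    # Minimal sketch: vector each user to a designated server by hashing
    # the username into a fixed list of equivalent servers.  Host names
    # are hypothetical.
    import hashlib

    MAIL_SERVERS = [            # equivalence class of identical mail servers
        "mail1.example.edu",
        "mail2.example.edu",
        "mail3.example.edu",
    ]

    def server_for(username: str, servers=MAIL_SERVERS) -> str:
        """Return the server a given user should be vectored to."""
        digest = hashlib.md5(username.lower().encode("ascii")).hexdigest()
        return servers[int(digest, 16) % len(servers)]

    if __name__ == "__main__":
        for user in ("tgray", "bgreer", "jsmith"):
            print("%-8s -> %s" % (user, server_for(user)))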
2.3.6 Single-function servers

To some, it will seem ridiculously extravagant to suggest that an entire CPU be allocated to each of several relatively undemanding network service tasks, yet that is precisely the approach we have used in striving for an extremely high-availability distributed system. We are convinced that separating functions onto different machines has provided significant advantages at reasonable cost. The design principle embodied in this approach can be characterized as "good fences make good neighbors". The idea is that different functions running on a single multi-function platform can sometimes interfere with each other, such that a fault relating to one service can cause other services to (needlessly) fail.

For example, suppose that email forwarding and domain name service are both running on the same system. Now further suppose that a destination host is taken out of service for two days due to air-conditioning problems. This causes a large backlog of mail queued on the mail forwarder. In this situation, it is possible that the mail forwarding function will exhaust certain global resources on the machine (e.g. /tmp file space or swap space), with the result that not only does mail forwarding to *any* host cease, but Domain Name Service also fails. As an alternative, one could allocate separate CPUs to each function, thereby increasing their mutual independence, and therefore, the overall system availability.

In days past, when every CPU was a major cost item, such a strategy was truly unthinkable. Now the incremental cost of a CPU adequate for many such network services is under $5,000, and closing in on $2,000. That's not much if it avoids a critical service outage. Our experience suggests that many failures would have been much worse if critical services had been combined on a single host machine.

2.3.7 Integrity checking

In this context, "integrity checking" refers to verifying that critical elements in the entire distributed system are intact; that is, not modified from their intended state either maliciously or by accident or system failure. Examples of elements that are especially worthy of integrity checking include all executables and configuration files. In a distributed system that includes large numbers of personal computers, verifying the integrity of executables is particularly important, since most of the computer viruses in the world target personal computers. The challenge is further exacerbated by the earlier design principle of having key executables stored locally for performance and autonomy reasons. The architecture must include a means for automated verification of these executables (and config files) against a trusted standard. This can be done either by comparing cryptographic checksums of the target and reference files, or via periodic byte-by-byte comparison. Conventional checksums are no longer adequate, since system attackers may modify a file such that a simple checksum of the corrupted file is identical to the original file's checksum.
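The following sketch (Python; the manifest location and file list are hypothetical) shows the cryptographic-checksum flavor of this check: a trusted manifest of digests is produced on the reference side, and each target machine periodically recomputes its own digests and reports anything that differs.

    # Minimal sketch of integrity checking against a trusted manifest.
    # The reference side records a cryptographic digest for each critical
    # file; targets recompute and compare.  Paths are illustrative only.
    import hashlib
    import json
    import os

    CRITICAL_FILES = ["/etc/hosts", "/etc/resolv.conf"]   # illustrative list

    def digest(path: str) -> str:
        """SHA-256 of a file, read in chunks so large executables are fine."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest(paths, manifest_path="manifest.json"):
        """Run on the reference system: record trusted digests."""
        with open(manifest_path, "w") as out:
            json.dump({p: digest(p) for p in paths if os.path.exists(p)}, out, indent=2)

    def verify(manifest_path="manifest.json"):
        """Run on each target: report files that are missing or altered."""
        with open(manifest_path) as f:
            trusted = json.load(f)
        for path, want in trusted.items():
            if not os.path.exists(path):
                print("MISSING ", path)
            elif digest(path) != want:
                print("MODIFIED", path)

    if __name__ == "__main__":
        build_manifest(CRITICAL_FILES)   # in practice done once, on the reference host
        verify()                         # in practice run periodically on every target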
2.3.8 Replicate servers for availability and scalability

There is nothing unusual in trying to minimize downtime via redundancy: designers of high-availability systems always seek to reduce single points of failure in the system, so replication of critical elements of a system is imperative. In the case of network servers, having more than one resource available also helps in scaling the system to accommodate greater load. Replication can bring with it some challenges, however. For example, a failure in a redundant system may be difficult to detect, because the replication may mask the failure. Also, the mechanism needed to vector requests to a particular server adds some complexity to the system, and may introduce subtle failure modes. The design of such a system must provide for excellent resource monitoring and recovery mechanisms.

2.3.9 Minimize size of "fault zones"

We use the term "fault zone" to refer to the size of an outage, or the number of people affected by a failure in a distributed system. The design goal is to keep the number of folks affected as small as possible. While the probability or frequency of failure for any given individual is not reduced by this design principle, there are obvious management advantages to having any given failure affect the smallest number of people possible. Unfortunately, the cost-per-user of system elements is often --but not always-- inversely proportional to the capacity of the element. In other words, there may be economy-of-scale considerations that lead to large groups being affected by a single outage. For such elements, engineering tradeoffs must be made to balance cost against the size of the "fault zone". This issue may apply more to communications infrastructure than to computing systems. Fortunately, the desktop computing revolution has driven the cost of individual computers down to the point where the cost-per-user of shared systems may be higher than for dedicated personal machines. However, the cost of replicating volatile data may be high, so large shared information servers may be inevitable. In contrast, data that changes slowly lends itself to replication and, accordingly, smaller fault zones.

2.3.10 Application-specific protocols when appropriate

A recurring design question is when to use generic data access protocols, such as NFS (Network File System), and when to use application-specific data access protocols such as SQL (Structured Query Language) or IMAP (Internet Message Access Protocol). Both have their place, but the "obvious" choice of standardizing on a generic file access protocol has several problems:

 o There is no generic file access protocol that is widely available for all types of computers, for either technical or economic reasons.
 o When multiple processes have write access to a file, as when mail is being delivered to a folder that is open in a mail user agent, locking is imperative. Locking via NFS can be problematic, due to implementation bugs and race conditions when file attributes are cached for improved performance. An application-specific protocol allows all the processes desiring write access to be co-located on the same processor, thus simplifying the locking problem.
 o It may be desirable to have the server perform functions beyond merely serving file data across the net. This can be done within the context of an application-specific protocol, but not a generic file access protocol.
 o File access protocols may sometimes be quite a bit less efficient than application-specific protocols. For example, in recent tests, opening a large remote message folder can take twice as long using NFS as it does using IMAP. On the other hand, actually making a complete local copy of the same message folder takes longer via IMAP than via NFS... so the "best" protocol is a function of what kind of operations are going to be performed on the data, how much data there is, etc.
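To make the efficiency point concrete, here is a toy comparison (Python; the folder contents are fabricated and no real NFS or IMAP code is involved) of how many bytes would cross the network if a client fetched an entire message folder and parsed it locally, versus asking a server that understands mail folders for just the Subject lines.

    # Minimal sketch of why an application-specific protocol can move far
    # less data than generic file access.  The "folder" is fabricated.

    def make_folder(n_messages=500, body_lines=40):
        """Fabricate a mail folder as one big string."""
        msgs = []
        for i in range(n_messages):
            body = "\n".join("text " * 10 for _ in range(body_lines))
            msgs.append("From: user%d@example.edu\nSubject: message %d\n\n%s" % (i, i, body))
        return "\n\x01\n".join(msgs)          # arbitrary message separator

    def generic_file_access(folder):
        """Client pulls the whole folder across the net, then parses locally."""
        return len(folder.encode())           # bytes transferred

    def application_specific(folder):
        """Server parses the folder and returns only the Subject lines."""
        subjects = [line for line in folder.splitlines() if line.startswith("Subject:")]
        return len("\n".join(subjects).encode())

    if __name__ == "__main__":
        folder = make_folder()
        print("whole-file transfer:   %8d bytes" % generic_file_access(folder))
        print("subject-only transfer: %8d bytes" % application_specific(folder))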
2.4 Distributed System Elements

In this section we will review the basic elements of a contemporary network computing environment. Each element consists of the hardware and software needed to offer a particular type of service, but multiple services can coexist on the same piece of hardware. In the broadest sense, these system elements are all "servers", but that term is not generally used for the system elements a user interacts with directly to execute applications, e.g. a personal computer. Rather, servers are accessed by "client" processes, usually running on a different computer, often the one being used directly by the user. Certain types of elements are not obviously clients or servers, or they might be both. Examples include "gateways", which transform information from one format to another.

The taxonomy we use for distributed system elements includes the following categories:

 o Network infrastructure services
   -Domain Name System (DNS)
   -IP Address assignment (BOOTP, DHCP)
   -Network Management (SNMP)
 o System support services
   -Time (NTP)
   -Boot (TFTP)
   -Mail forwarding (DNS/SMTP)
   -File (NFS, SMB, AFP)
   -Print (LPR)
   -System Configuration (Ref, X)
   -X Font
   -Archive and Backup
 o Communication and information services
   -Mail servers (IMAP)
   -News servers (NNTP)
   -Information servers (FTP, Gopher, HTTP)
   -Database servers (SQL)
 o Application processing services
   -Shared, general purpose application servers
   -Shared, application-specific servers
   -Personal computers

Personal computers have been discussed already; the other categories will now be considered in turn.

2.4.1 Network infrastructure services

These are systems that are necessary for the correct functioning of the basic communications infrastructure. Primary examples:

 -Domain Name System (DNS)
 -IP Address assignment
 -Network Management

Domain Name System (DNS) servers provide the mapping between friendly host names (e.g. ftp.cac.washington.edu) and their corresponding IP addresses (e.g. 140.142.100.6). Correct functioning of the DNS at all times is essential in a network computing environment. Unfortunately, DNS is vulnerable to bad data that occasionally finds its way into the global database, and even though DNS has been in use for many years now, there is still some DNS software that has bugs.

The IP address assignment function has traditionally been the Achilles heel of TCP/IP. However, it doesn't have to be. For several years we have provided the campus with installation software for PCs that will register a machine in the central database and obtain an IP address for it, without intervention by Network Operations personnel. More recently, an Internet standard called the Dynamic Host Configuration Protocol (DHCP) has been developed and is now being deployed. DHCP provides similar functionality to the home-brew system we have been using, plus the ability to "lease" addresses to a machine for a limited period of time, which is very useful for drop-in labs.

Network management systems provide early notification of outages and tools for debugging faults and anticipating capacity problems. They are essential in a large network.
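Before moving on, a small illustration of the most visible of these infrastructure services: the sketch below (Python; it requires network access, and the example host name is the one used in the text above, which may no longer resolve) performs the forward and reverse DNS lookups that virtually every network application depends on.

    # Minimal sketch of the DNS name-to-address mapping described above.
    # Lookup failures are reported rather than raised, since the example
    # host name is illustrative and may no longer exist.
    import socket

    def lookup(name: str) -> None:
        try:
            address = socket.gethostbyname(name)       # forward lookup (A record)
            print("%s -> %s" % (name, address))
        except socket.gaierror as err:
            print("%s: forward lookup failed (%s)" % (name, err))
            return
        try:
            official, _aliases, _addrs = socket.gethostbyaddr(address)
            print("%s -> %s" % (address, official))    # reverse lookup (PTR record)
        except socket.herror as err:
            print("%s: reverse lookup failed (%s)" % (address, err))

    if __name__ == "__main__":
        lookup("ftp.cac.washington.edu")   # example name from the text
        lookup("www.washington.edu")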
2.4.2 System support services

These include systems that provide "behind the scenes" support to computers that are directly executing a user application. For example:

 -Time (NTP)
 -Boot (TFTP)
 -Mail forwarding (DNS/SMTP)
 -File (NFS, SMB, AFP)
 -Print (LPR)
 -System Configuration (Ref, X)
 -X Font
 -Archive and Backup

Time. Time service is provided via the Network Time Protocol (NTP). The intent of NTP is to synchronize clocks throughout the Internet to within a few milliseconds of the high-accuracy atomic clocks connected to the net.

Boot. A "bootstrap" or "boot" service allows a computing device to get a fresh copy of its operating software from a boot server. A common protocol for retrieving the software image is the Trivial File Transfer Protocol (TFTP).

Mail Forwarding. Mail forwarding chores are supported by mail exchanger (MX) records in the Domain Name System and the Simple Mail Transfer Protocol (SMTP). The MX records in DNS tell a sending host where mail for a particular destination host should be routed.

File. Remote file access is probably the most common system support service, perhaps in large measure due to the millions of Novell servers on LANs throughout the world. File servers exist for any one of several reasons:

 o People want to share information, and a shared server is sometimes simpler or more robust than peer-to-peer file sharing.
 o Administrators want to have a single place to maintain software, rather than having to keep the copies on each desktop computer up to date.
 o Desktop computers intended to be shared by many users (e.g. in a lab) may not have provision for protected personal files.

Unfortunately, there is no single remote file access protocol that all vendors have agreed to use. Examples include:

 o Network File System (NFS), dominant in the Unix world, though also widely used on PCs to allow access to Unix-based files.
 o Server Message Block (SMB), the basis of Microsoft's LAN Manager protocol.
 o Apple Filing Protocol (AFP), used universally to share files among Macs.

Although NFS client software is available for both PCs and Macintoshes, it has never fulfilled the promise of becoming the single ubiquitous remote file access protocol. There are both technical and economic reasons for this, but the result has been that full integration of desktop computers with Unix-based servers requires the server to learn to speak the native protocol of the desktop machine, rather than the client learning to speak the native protocol of the server. This was not even an option until recently, but a Unix-based SMB server, called "Samba", has become available, and the Columbia AppleTalk Package (CAP) --as well as commercial equivalents-- offers AFP services from Unix hosts. By the time you read this, it may even be possible to use AFP over TCP/IP.

Print. Shared print service is sometimes even more important than shared file service. The LPR/LPD protocols from the Unix community have infiltrated the PC world, but as with file access, they may provide only a piece of the solution.
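Of the support services just described, time service is the easiest to illustrate in a few lines. The sketch below (Python; the server name is a public NTP pool host, not a UW machine) sends a minimal SNTP client request and converts the returned timestamp from the NTP epoch (1900) to the Unix epoch (1970). Production clients use the full NTP algorithms to filter network delay and discipline the local clock.

    # Minimal SNTP query: one UDP request, read the "transmit timestamp"
    # from the 48-byte reply.  Illustration only; real NTP clients average
    # several samples and compensate for round-trip delay.
    import socket
    import struct
    import time

    NTP_SERVER = "pool.ntp.org"         # public server pool, not a UW host
    NTP_TO_UNIX = 2208988800            # seconds between the 1900 and 1970 epochs

    def sntp_time(server: str = NTP_SERVER) -> float:
        packet = b"\x1b" + 47 * b"\0"   # LI=0, VN=3, Mode=3 (client)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(5)
            s.sendto(packet, (server, 123))
            reply, _ = s.recvfrom(48)
        seconds = struct.unpack("!I", reply[40:44])[0]   # transmit timestamp, integer part
        return seconds - NTP_TO_UNIX

    if __name__ == "__main__":
        server_now = sntp_time()
        print("server time:", time.ctime(server_now))
        print("local  time:", time.ctime())
        print("difference : %+.1f seconds" % (server_now - time.time()))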
Configuration. System configuration services are invisible to the end-user, but are nonetheless important for keeping large sites from disintegrating into techno-chaos. A configuration service provides a central place to manage a large collection of computers, either desktop machines or "back room" machines. We use two types of configuration servers: one for keeping track of X terminal configurations, and the other --our "Reference System"-- for managing Unix clusters and PC collections.

Finally, there are also servers to provide fonts to X terminals or to computers acting as X display servers, and mass storage servers for archival access or file server backup.

2.4.3 Communication and information services

These services encompass a variety of data repositories, each supporting one or more application-specific access protocols:

 -Mail servers (IMAP)
 -News servers (NNTP)
 -Information servers (FTP, Gopher, HTTP)
 -Database servers (SQL)

Mail should be delivered to machines that have three properties:

 a. They are always up.
 b. They are regularly backed up.
 c. They are sufficiently capable to export the mail folders via an open client-server mail protocol.

These requirements preclude delivering mail to the vast majority of desktop computers; hence the need for mail servers. We believe that the only open client-server mail protocol with sufficient functionality is the Internet Message Access Protocol (IMAP), hence we sometimes refer to our mail servers as IMAP servers.

Second to email, network news (the Internet's distributed bulletin board service) is perhaps the most popular communication or information service. It is based on the Network News Transfer Protocol (NNTP).

The "information servers" group refers to systems designed to export information via Internet protocols such as FTP, Gopher, and HTTP. In contrast, "database servers" are oriented toward transaction processing, using protocols such as SQL.

2.4.4 Application processing services

Although the programs implementing the servers described previously can be considered "applications", we'll reserve that term for programs directly invoked by users. Accordingly, the final category of services encompasses the systems that users interact with directly to run the programs they need to support their work:

 -Shared, general purpose application servers
 -Shared, application-specific servers
 -Personal computers

The first group, "shared, general purpose application servers", are in fact interactive timesharing machines, either stand-alone or part of a cluster. The "application-specific" group includes machines that are dedicated to running (one or more) specific applications. For example, one might dedicate processors to CPU-intensive applications such as CAD/CAM. Or one might incorporate a supercomputer into the architecture with the intended purpose of executing batch simulation jobs. This group also includes information-access gateways and front-ends, such as systems dedicated to running the UW Information Navigator (UWIN), our Willow database query tool, etc.

Although it is generally desirable to run interactive applications on the machine "closest" to the user, it is not always possible. For example, the needed application may not run on a desktop computer, or it may run better on a fast (but shared) machine. Or sometimes a person only has access to a timesharing account.
2.5 A Typical Computing Cluster

A minimal cluster would consist of a single multi-function server and a collection of personal computers. The single server might encompass any or all of the following services:

 o File/Print server
 o Mail server
 o News server
 o Interactive Compute server
 o Application server
 o Reference system

As load on the single server grows to exceed its capacity, those functions can be split across several machines. In a very large cluster, several machines might be allocated to a single function.

In a personal-computer-oriented environment, the "interactive compute" server(s) might not be needed. However, it is useful to have at least one machine through which mail and other basic services can be accessed from lowest-common-denominator media, e.g. "Telnet" or async dialup.

Whether an application server is needed depends on the specific requirements of each group. Examples might include a database server or a machine dedicated to program compilation. Typically, for an application server to be useful, it must have access to each user's home directory --another reason for those to exist on a file server rather than on each desktop machine. The Reference System function will be described subsequently.

2.6 Email

Electronic mail is such a crucial part of any network computing environment that it deserves special attention. In this section we will outline some of the key architectural issues.

2.6.1 Multimedia, using Internet standards

Because the Internet constitutes the largest email system in the world, and continues to grow at a prodigious rate, it can no longer be ignored even by businesses that once thought they didn't care about the Internet. Although X.400 --the only other international standard for email-- continues to receive more attention from some vendors, businesses, and government agencies, in our opinion the fatally flawed addressing structure of X.400 will keep it from making much of a dent in Internet mail growth. Hence we feel more than confident in recommending an Internet-centric approach to email for *any* organization.

Interoperability is the paramount requirement in any messaging system. To fully interoperate in the Internet mail world, it is essential that the local system support the basic Internet mail standards RFC-821 (SMTP) and RFC-822 (message format). Moreover, support for MIME (Multipurpose Internet Mail Extensions) is also mandatory for interoperability, since a growing number of Internet mail users are sending multipart/multimedia messages using the MIME standard.
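To illustrate what "supporting MIME" means in practice, here is a minimal Python sketch that builds a two-part MIME message and hands it to an SMTP relay. The addresses and relay name are hypothetical placeholders; the point is simply that a MIME message is ordinary RFC-822 mail whose body is divided into typed parts.

    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    # Hypothetical addresses and relay host, for illustration only.
    msg = MIMEMultipart("alternative")
    msg["From"] = "sender@example.edu"
    msg["To"] = "recipient@example.edu"
    msg["Subject"] = "A multipart (MIME) test message"

    # Two typed body parts: plain text plus an HTML rendition of the same content.
    msg.attach(MIMEText("This is the plain-text part.", "plain"))
    msg.attach(MIMEText("<p>This is the <b>HTML</b> part.</p>", "html"))

    with smtplib.SMTP("mailhost.example.edu") as relay:
        relay.send_message(msg)

A MIME-aware reader picks whichever part it can display best; a plain RFC-822 reader still sees a legal, if less pretty, text message.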
2.6.2 Freedom from gateways

There are two main approaches to interfacing a local distributed email system to Internet mail: (a) native support for Internet standards in the local mailers, and (b) email gateways. An email gateway is a process, sometimes running on a dedicated computer, which translates messages between two different formats and/or re-transmits them via two different messaging protocols. (An extension of the email gateway is the "message switch" that understands many different email formats and protocols.) A typical gateway scenario would be to have a proprietary LAN-oriented email system and a dedicated gateway to Internet mail.

Email gateways and switches make sense when one has no alternative but to live amongst conflicting or proprietary email approaches. However, when developing a distributed system architecture, the downsides of email gateways should be carefully considered:

 -Email gateways are responsible for a disproportionate number of email failures in the Internet.
 -Some of the proprietary gateways common as of this writing are of notoriously poor quality.
 -Translating between message formats often means there will be a loss of information in one direction or the other.

System architects for an organization may have a difficult challenge in this area. The commercial email solutions that are *not* based on Internet standards may have many attractive characteristics, and will always promise full Internet interoperability via their add-on gateways. However, the problems with gateways (both historic and inherent) affect users only indirectly, so they are usually not given sufficient weight during email software evaluations. When there is not already a large installed base of vendor-specific mail software, we strongly believe that choosing Internet-based mail software (which obviates the need for a gateway) will be in everyone's best long-term interest.

2.6.3 Mail stored on always-up hosts

As mentioned in a previous section, the local disk of a desktop computer is not necessarily the best place to deliver email, since the machine may not be turned on 24 hours per day, may not be regularly backed up, etc. Email should be delivered to an "always up" server, then accessed by the user's computer via a client-server network protocol.

One's primary desktop computer could also be a mail server, so that mail transferred to its local hard disk could be accessed remotely. As true multitasking operating systems become more common on the desktop, this scenario will be more realistic, but for the same reasons that desktop hard disks are not the best place to deliver mail in the first place, they probably are not the best place to try to get at mail from other machines. Hence, we would argue that email servers that are not also acting as someone's desktop computer are the best place for delivering and storing incoming mail messages.

2.6.4 Open client-server protocol

While there are a number of commercial systems embracing the client-server email model, several do so via protocols that are not open (at least not open in the sense of Internet protocols). There are, however, several *open* client-server protocol choices:

 -A generic file transfer protocol (e.g. FTP)
 -An application-specific mail folder transfer protocol (e.g. POP)
 -A generic file access protocol (e.g. NFS)
 -An application-specific mail folder access protocol (e.g. IMAP)

The *transfer* protocols are appropriate only when the user is going to use a single computer for reading mail --ever. That is, if there is any likelihood that the user may need to access mail from more than one computer, then the mail should be left on a mail server and accessed via a generic file-access protocol or a mail-specific message access protocol. In the case of email, the generic choice (NFS) is not the best choice. IMAP (Internet Message Access Protocol) is a good bet for a robust distributed mail architecture. It offers performance and robustness advantages over NFS, and allows certain functions (e.g. MIME parsing) to be handled by the server.
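For a sense of what mail *access* (as opposed to transfer) looks like to a client, here is a minimal Python sketch using the standard imaplib module: the messages stay on the server, and the client asks only for the pieces it needs. The server name, account, and password are hypothetical placeholders, and a real client would of course handle errors and credentials more carefully.

    import imaplib

    # Hypothetical server and account, for illustration only.
    M = imaplib.IMAP4("imapserver.example.edu")
    M.login("username", "password")

    # Open the inbox read-only and find messages not yet seen.
    M.select("INBOX", readonly=True)
    typ, data = M.search(None, "UNSEEN")

    # Fetch just the From: and Subject: headers; the message bodies stay on the server.
    for num in data[0].split():
        typ, msg_data = M.fetch(num, "(BODY[HEADER.FIELDS (FROM SUBJECT)])")
        print(msg_data[0][1].decode(errors="replace"))

    M.logout()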
2.6.5 Access from multiple computers

In some situations, a user has a single personal computer that is used exclusively, from wherever that user happens to be. When the person moves, the computer comes along, and all of that person's files are on that computer's hard disk. In this scenario, a mail *transfer* protocol such as POP is sufficient.

However, more and more people are finding that they need to use more than one computer. Perhaps one in the office, another in a lab, a third at home. Or perhaps they use someone else's machine on occasion, while visiting. For our constituency, we consider it essential that the architecture accommodate the general need for multi-platform access to email (at least to one's incoming message folders). This calls for use of a mail *access* protocol rather than a transfer protocol. Accordingly, we have chosen to use IMAP as the basis of our distributed email infrastructure at UW.

2.7 Reference System Concepts

As the computing world moved from everyone sharing a single CPU to everyone having (at least one) CPU of their own, with associated memory, disk, and personalized configuration, several people noticed that managing many machines was harder than managing one machine. Since returning to the Good Old Days of Mainframes didn't even seem desirable, much less possible, we spent some time thinking about how a large collection of machines could be made as easy to manage as a single machine. The result of this process was the "Reference System" model. The Reference System is a crucial part of UW's PC integration and management strategy. Details will follow in Section 3, but here is a brief overview.

2.7.1 Single-image to update

The key to managing many systems is to arrange for them to look like a single system, from the perspective of the person who has to update files on them. Normal system maintenance includes installing or replacing files associated with new versions of software, changes in configuration files, etc. Clearly, making these changes in one place is much easier than making them on a zillion different boxes. Achieving the goal of updating multiple systems by changing files on a single machine is the primary objective of the Reference System. The Ref System holds the "master" copy of all executables and configuration files for all of the target machines under its purview. When a change is made to a file on the Ref System, it is propagated to the target machines without human intervention.

2.7.2 Source and target directories

In a System Manager's perfect world, all of the target machines would have identical configurations. Alas, that is not our experience. There may be different hardware architectures in use, requiring different executables, and even Intel-based PC users may choose to license different sets of applications. As a result, the Ref System must allow for multiple target "equivalence classes", and also allow for some individual variation in configuration. The Ref System keeps the master files in directories that relate to where they came from. For example, all of the files associated with Microsoft's Windows for Workgroups release would be in one place, as would be the files that are part of DEC's Ultrix distribution. The target directories for the various equivalence classes then have links to the appropriate source directories.

2.7.3 Propagation rules

At periodic intervals, the system compares what is on the target system with what the Ref System claims is supposed to be there, and adds, deletes, or replaces files as necessary. Exception reports are generated.

2.7.4 File integrity checker

Even when the Ref System sees the correct file names on the target system, provision has been made for validating the integrity of each file by comparing it with the Ref System copy. This provides considerable peace of mind in a world of ever-more-plentiful viruses and computer criminals.
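The actual propagation machinery is rdist-based and is outlined in Section 3; the Python sketch below is only meant to make the add/delete/replace logic of 2.7.3 and the content comparison of 2.7.4 concrete. The directory paths are hypothetical, and real-world concerns (equivalence-class links, permissions, and routing the exception report to a log or to the operators) are omitted.

    import hashlib
    import os
    import shutil

    REF_ROOT = "/ref/targets/win-lab"      # hypothetical master tree on the Ref System
    TARGET_ROOT = "/export/pc/lab42"       # hypothetical target tree to be synchronized

    def checksum(path, chunk=65536):
        """Checksum of a file's contents, read in chunks."""
        h = hashlib.md5()
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                h.update(block)
        return h.hexdigest()

    def relative_files(root):
        """All file paths under root, relative to root."""
        found = set()
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                found.add(os.path.relpath(os.path.join(dirpath, name), root))
        return found

    ref_files = relative_files(REF_ROOT)
    target_files = relative_files(TARGET_ROOT)

    # Add or replace anything that is missing, or whose contents differ from the master copy.
    for rel in sorted(ref_files):
        src = os.path.join(REF_ROOT, rel)
        dst = os.path.join(TARGET_ROOT, rel)
        if rel not in target_files or checksum(src) != checksum(dst):
            print("UPDATE", rel)                     # exception report entry
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)

    # Delete anything the master says should not be there.
    for rel in sorted(target_files - ref_files):
        print("DELETE", rel)                         # exception report entry
        os.remove(os.path.join(TARGET_ROOT, rel))

Comparing checksums rather than just names and dates is what gives the integrity-checking property: a file that has been tampered with on the target is simply replaced on the next pass.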
3. IMPLEMENTATION

3.1 The Reference System in detail

 o Purpose: facilitate building new workstations and keeping old ones in sync
 o Directory structures
 o The "linksrc" tool
 o Ref uptime is not critical (i.e. not 7x24)
 o The rdisting process
 o Backup of the data
   -disk shadowing
   -backup ref host
   -tape

3.2 Unix/X clusters

 o Distributed + Centralized Management
 o Ref is master
 o Obscurity buys some security
 o DNS randomization
 o Unix commands in a cluster environment (e.g. ps)
 o Password/Group synchronization
 o Printing Model
 o Backup Model
 o Account/Group location, usage, management
 o The tkplog and auto-pilot notifier
 o Terminal installation
 o Printer installation

3.3 PC clusters

 o Security - NFS export of homedir/groupdir only to certain PCs
 o The Management Agent for MS-Windows
   -software integrity checking
   -system hardware reporting (disk usage, RAM, etc.)
 o New machine installation
   -ref system setup script (newpc)
   -Install disk
   -IP number and host data all that is needed
   -auto-detection of supported hardware
 o Software updates
 o Time synchronization
 o Network printer access
 o Network CD-ROM access
 o Dialup model
 o PC-Pine versus Telnet to Unix Pine
 o File locking issues
 o User-installed software
   -Users perceive they need this
   -We can minimize it if we provide the correct software
   -Design allows for it, but the risk is unknown
 o Local disk for sensitive data

3.4 PC Hardware Assumptions

 o Ethernet card
 o Mouse
 o Video
 o Disk
 o CD-ROM optional
 o Sound optional

3.5 System Limitations

 o Windows designed for a single user on each PC
 o Implications for labs

4. CONCLUSIONS

 o The Reference System model works well
 o Limitations of DOS/Windows greatly complicate the problem
 o Variations in PC hardware greatly complicate the problem
 o The Windows API and PC networking are more of a moving target than Unix/X
 o X not sufficient on the desktop (no single GUI for applications)
 o Other lessons learned...