Diffstat (limited to 'en_US.ISO8859-1/books/design-44bsd/book.sgml')
-rw-r--r-- | en_US.ISO8859-1/books/design-44bsd/book.sgml | 2858 |
1 files changed, 0 insertions, 2858 deletions
diff --git a/en_US.ISO8859-1/books/design-44bsd/book.sgml b/en_US.ISO8859-1/books/design-44bsd/book.sgml deleted file mode 100644 index 6c44b5a92a..0000000000 --- a/en_US.ISO8859-1/books/design-44bsd/book.sgml +++ /dev/null @@ -1,2858 +0,0 @@ -<!-- $FreeBSD: doc/en_US.ISO_8859-1/books/design-44bsd/book.sgml,v 1.2 2001/03/07 07:21:23 sheldonh Exp $ --> -<!-- FreeBSD Documentation Project --> - -<!DOCTYPE book PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [ -<!ENTITY % man PUBLIC "-//FreeBSD//ENTITIES DocBook Manual Page Entities//EN"> -%man; -]> - -<book> - <bookinfo> - <title>The Design and Implementation of the 4.4BSD Operating System</title> - - <authorgroup> - <author> - <firstname>Marshall</firstname> - <othername>Kirk</othername> - <surname>McKusick</surname> - </author> - - <author> - <firstname>Keith</firstname> - <surname>Bostic</surname> - </author> - - <author> - <firstname>Michael</firstname> - <othername>J.</othername> - <surname>Karels</surname> - </author> - - <author> - <firstname>John</firstname> - <othername>S.</othername> - <surname>Quarterman</surname> - </author> - </authorgroup> - - <copyright> - <year>1996</year> - <holder>Addison-Wesley Longman, Inc</holder> - </copyright> - -<!-- I seem to recall the editor wanting this notice to be bold. In html, I'd - use the _strong_ tag. What should I use instead? --> - - <legalnotice> - <para>The second chapter of the book, <citetitle>The Design and - Implementation of the 4.4BSD Operating System</citetitle> is - excerpted here with the permission of the publisher. No part of it - may be further reproduced or distributed without the publisher's - express written - <ulink url="mailto:peter.gordon@awl.com">permission</ulink>. The - rest of - <ulink url="http://cseng.aw.com/book/0,3828,0201549794,00.html">the - book</ulink> explores the concepts introduced in this chapter in - incredible detail and is an excellent reference for anyone with an - interest in BSD UNIX. 
More information about this book is available - from the publisher, with whom you can also sign up to receive news - of <ulink url="mailto:curt.johnson@awl.com">related titles</ulink>. - Information about <ulink url="http://www.mckusick.com/courses/">BSD - courses</ulink> is available from Kirk McKusick.</para> - </legalnotice> - </bookinfo> - - <chapter label="2"> - <title>Design Overview of 4.4BSD</title> - - <sect1> - <title>4.4BSD Facilities and the Kernel</title> - - <para>The 4.4BSD kernel provides four basic facilities: - processes, - a filesystem, - communications, and - system startup. - This section outlines where each of these four basic services - is described in this book.</para> - - <orderedlist> - <listitem> - <para>Processes constitute a thread of control in an address space. - Mechanisms for creating, terminating, and otherwise - controlling processes are described in - Chapter 4. - The system multiplexes separate virtual-address spaces - for each process; - this memory management is discussed in - Chapter 5.</para> - </listitem> - - <listitem> - <para>The user interface to the filesystem and devices is similar; - common aspects are discussed in - Chapter 6. - The filesystem is a set of named files, organized in a tree-structured - hierarchy of directories, and of operations to manipulate them, - as presented in - Chapter 7. - Files reside on physical media such as disks. - 4.4BSD supports several organizations of data on the disk, - as set forth in - Chapter 8. - Access to files on remote machines is the subject of - Chapter 9. - Terminals are used to access the system; their operation is - the subject of - Chapter 10.</para> - </listitem> - - <listitem> - <para>Communication mechanisms provided by traditional UNIX systems include - simplex reliable byte streams between related processes (see pipes, - Section 11.1), - and notification of exceptional events (see signals, - Section 4.7). 
- 4.4BSD also has a general interprocess-communication facility. - This facility, described in - Chapter 11, - uses access mechanisms distinct from those of the filesystem, - but, once a connection is set up, a process can access it - as though it were a pipe. - There is a general networking framework, - discussed in - Chapter 12, - that is normally used as a layer underlying the - IPC - facility. - Chapter 13 - describes a particular networking implementation in detail.</para> - </listitem> - - <listitem> - <para>Any real operating system has operational issues, such as how to - start it running. - Startup and operational issues are described in - Chapter 14.</para> - </listitem> - </orderedlist> - - <para>Sections 2.3 through 2.14 present introductory - material related to Chapters 3 through 14. - We shall define terms, mention basic system calls, - and explore historical developments. - Finally, we shall give the reasons for many major design decisions.</para> - - <sect2> - <title>The Kernel</title> - - <para>The - <emphasis>kernel</emphasis> - is the part of the system that runs in protected mode and mediates - access by all user programs to the underlying hardware (e.g., - CPU, - disks, terminals, network links) - and software constructs - (e.g., filesystem, network protocols). - The kernel provides the basic system facilities; - it creates and manages processes, - and provides functions to access the filesystem - and communication facilities. - These functions, called - <emphasis>system calls</emphasis>, - appear to user processes as library subroutines. - These system calls are the only interface - that processes have to these facilities. 
- Details of the system-call mechanism are given in - Chapter 3, - as are descriptions of several kernel mechanisms that do not execute - as the direct result of a process doing a system call.</para> - - <para>A - <emphasis>kernel</emphasis>, - in traditional operating-system terminology, - is a small nucleus of software that - provides only the minimal facilities necessary for implementing - additional operating-system services. - In contemporary research operating systems -- such as - Chorus - <xref linkend="biblio-rozier">, - Mach - <xref linkend="biblio-accetta">, - Tunis - <xref linkend="biblio-ewens">, - and the - V Kernel - <xref linkend="biblio-cheriton"> -- - this division of functionality is more than just a logical one. - Services such as filesystems and networking protocols are - implemented as client application processes of the nucleus or kernel.</para> - - <para>The - 4.4BSD kernel is not partitioned into multiple processes. - This basic design decision was made in the earliest versions of UNIX. - The first two implementations by - Ken Thompson had no memory mapping, - and thus made no hardware-enforced distinction - between user and kernel space - <xref linkend="biblio-ritchie">. - A message-passing system could have been implemented as readily - as the actually implemented model of kernel and user processes. - The monolithic kernel was chosen for simplicity and performance. - And the early kernels were small; - the inclusion of facilities such as networking - into the kernel has increased its size. - The current trend in operating-systems research - is to reduce the kernel size by placing - such services in user space.</para> - - <para>Users ordinarily interact with the system through a command-language - interpreter, called a - <emphasis>shell</emphasis>, - and perhaps through additional user application programs. - Such programs and the shell are implemented with processes. 
- Details of such programs are beyond the scope of this book, - which instead concentrates almost exclusively on the kernel.</para> - - <para>Sections 2.3 and 2.4 - describe the services provided by the 4.4BSD kernel, - and give an overview of the latter's design. - Later chapters describe the detailed design and implementation of these - services as they appear in 4.4BSD.</para> - </sect2> - </sect1> - - <sect1> - <title>Kernel Organization</title> - - <para>In this section, we view the organization of the 4.4BSD - kernel in two ways:</para> - - <orderedlist> - <listitem> - <para>As a static body of software, - categorized by the functionality offered by the modules - that make up the kernel</para> - </listitem> - - <listitem> - <para>By its dynamic operation, - categorized according to the services provided to users</para> - </listitem> - </orderedlist> - - <para>The largest part of the kernel implements - the system services that applications access through system calls. - In 4.4BSD, this software has been organized according to the following:</para> - - <itemizedlist> - <listitem> - <para>Basic kernel facilities: - timer and system-clock handling, - descriptor management, and process management</para> - </listitem> - - <listitem> - <para>Memory-management support: - paging and swapping</para> - </listitem> - - <listitem> - <para>Generic system interfaces: - the I/O, - control, and multiplexing operations performed on descriptors</para> - </listitem> - - <listitem> - <para>The filesystem: - files, directories, pathname translation, file locking, - and I/O buffer management</para> - </listitem> - - <listitem> - <para>Terminal-handling support: - the terminal-interface driver and terminal - line disciplines</para> - </listitem> - - <listitem> - <para>Interprocess-communication facilities: - sockets</para> - </listitem> - - <listitem> - <para>Support for network communication: - communication protocols and - generic network facilities, such as routing</para> - 
</listitem> - </itemizedlist> - - <table frame="none" id="table-mach-indep"> - <title>Machine-independent software in the 4.4BSD kernel</title> - <tgroup cols="3"> - <thead> - <row> - <entry>Category</entry> - <entry>Lines of code</entry> - <entry>Percentage of kernel</entry> - </row> - </thead> - - <tfoot> - <row> - <entry>total machine independent</entry> - <entry>162,617</entry> - <entry>80.4</entry> - </row> - </tfoot> - - <tbody> - <row> - <entry>headers</entry> - <entry>9,393</entry> - <entry>4.6</entry> - </row> - - <row> - <entry>initialization</entry> - <entry>1,107</entry> - <entry>0.6</entry> - </row> - - <row> - <entry>kernel facilities</entry> - <entry>8,793</entry> - <entry>4.4</entry> - </row> - - <row> - <entry>generic interfaces</entry> - <entry>4,782</entry> - <entry>2.4</entry> - </row> - - <row> - <entry>interprocess communication</entry> - <entry>4,540</entry> - <entry>2.2</entry> - </row> - - <row> - <entry>terminal handling</entry> - <entry>3,911</entry> - <entry>1.9</entry> - </row> - - <row> - <entry>virtual memory</entry> - <entry>11,813</entry> - <entry>5.8</entry> - </row> - - <row> - <entry>vnode management</entry> - <entry>7,954</entry> - <entry>3.9</entry> - </row> - - <row> - <entry>filesystem naming</entry> - <entry>6,550</entry> - <entry>3.2</entry> - </row> - - <row> - <entry>fast filestore</entry> - <entry>4,365</entry> - <entry>2.2</entry> - </row> - - <row> - <entry>log-structured filestore</entry> - <entry>4,337</entry> - <entry>2.1</entry> - </row> - - <row> - <entry>memory-based filestore</entry> - <entry>645</entry> - <entry>0.3</entry> - </row> - - <row> - <entry>cd9660 filesystem</entry> - <entry>4,177</entry> - <entry>2.1</entry> - </row> - - <row> - <entry>miscellaneous filesystems (10)</entry> - <entry>12,695</entry> - <entry>6.3</entry> - </row> - - <row> - <entry>network filesystem</entry> - <entry>17,199</entry> - <entry>8.5</entry> - </row> - - <row> - <entry>network communication</entry> - <entry>8,630</entry> - 
<entry>4.3</entry> - </row> - - <row> - <entry>internet protocols</entry> - <entry>11,984</entry> - <entry>5.9</entry> - </row> - - <row> - <entry>ISO protocols</entry> - <entry>23,924</entry> - <entry>11.8</entry> - </row> - - <row> - <entry>X.25 protocols</entry> - <entry>10,626</entry> - <entry>5.3</entry> - </row> - - <row> - <entry>XNS protocols</entry> - <entry>5,192</entry> - <entry>2.6</entry> - </row> - </tbody> - </tgroup> - </table> - - <para>Most of the software in these categories is machine independent - and is portable across different hardware architectures.</para> - - <para>The machine-dependent aspects of the kernel - are isolated from the mainstream code. - In particular, none of the machine-independent code contains - conditional code for specific architecture. - When an architecture-dependent action is needed, - the machine-independent code calls an architecture-dependent - function that is located in the machine-dependent code. - The software that is machine dependent includes</para> - - <itemizedlist> - <listitem> - <para>Low-level system-startup actions</para> - </listitem> - - <listitem> - <para>Trap and fault handling</para> - </listitem> - - <listitem> - <para>Low-level manipulation of the run-time context of a - process</para> - </listitem> - - <listitem> - <para>Configuration and initialization of hardware devices</para> - </listitem> - - <listitem> - <para>Run-time support for I/O devices</para> - </listitem> - </itemizedlist> - - <table frame="none" id="table-mach-dep"> - <title>Machine-dependent software for the HP300 in the 4.4BSD - kernel</title> - - <tgroup cols="3"> - <thead> - <row> - <entry>Category</entry> - <entry>Lines of code</entry> - <entry>Percentage of kernel</entry> - </row> - </thead> - - <tfoot> - <row> - <entry>total machine dependent</entry> - <entry>39,634</entry> - <entry>19.6</entry> - </row> - </tfoot> - - <tbody> - <row> - <entry>machine dependent headers</entry> - <entry>1,562</entry> - <entry>0.8</entry> - 
</row> - - <row> - <entry>device driver headers</entry> - <entry>3,495</entry> - <entry>1.7</entry> - </row> - - <row> - <entry>device driver source</entry> - <entry>17,506</entry> - <entry>8.7</entry> - </row> - - <row> - <entry>virtual memory</entry> - <entry>3,087</entry> - <entry>1.5</entry> - </row> - - <row> - <entry>other machine dependent</entry> - <entry>6,287</entry> - <entry>3.1</entry> - </row> - - <row> - <entry>routines in assembly language</entry> - <entry>3,014</entry> - <entry>1.5</entry> - </row> - - <row> - <entry>HP/UX compatibility</entry> - <entry>4,683</entry> - <entry>2.3</entry> - </row> - </tbody> - </tgroup> - </table> - - <para><xref linkend="table-mach-indep"> summarizes the machine-independent software that constitutes the - 4.4BSD kernel for the HP300. - The numbers in column 2 are for lines of C source code, - header files, and assembly language. - Virtually all the software in the kernel is written in the C - programming language; - less than 2 percent is written in - assembly language. - As the statistics in <xref linkend="table-mach-dep"> show, - the machine-dependent software, excluding - HP/UX - and device support, - accounts for a minuscule 6.9 percent of the kernel.</para> - - <para>Only a small part of the kernel is devoted to - initializing the system. - This code is used when the system is - <emphasis>bootstrapped</emphasis> - into operation and is responsible for setting up the kernel hardware - and software environment - (see - Chapter 14). - Some operating systems (especially those with limited physical memory) - discard or - <emphasis>overlay</emphasis> - the software that performs these functions after that software has - been executed. - The 4.4BSD kernel does not reclaim the memory used by the - startup code because that memory space is barely 0.5 percent - of the kernel resources used on a typical machine. 
- Also, the startup code does not appear in one place in the kernel -- it is - scattered throughout, and it usually appears - in places logically associated with what is being initialized.</para> - </sect1> - - <sect1> - <title>Kernel Services</title> - - <para>The boundary between the kernel- and user-level code is enforced by - hardware-protection facilities provided by the underlying hardware. - The kernel operates in a separate address space that is inaccessible to - user processes. - Privileged operations -- such as starting I/O - and halting the central processing unit - (CPU) -- - are available to only the kernel. - Applications request services from the kernel with - <emphasis>system calls</emphasis>. - System calls are used to cause the kernel to execute complicated - operations, such as writing data to secondary storage, - and simple operations, such as returning the current time of day. - All system calls appear - <emphasis>synchronous</emphasis> - to applications: - The application does not run while the kernel does the actions associated - with a system call. - The kernel may finish some operations associated with a system call - after it has returned. - For example, a - <emphasis>write</emphasis> - system call will copy the data to be written - from the user process to a kernel buffer while the process waits, - but will usually return from the system call - before the kernel buffer is written to the disk.</para> - - <para>A system call usually is implemented as a hardware trap that changes the - CPU's - execution mode and the current address-space mapping. - Parameters supplied by users in system calls are validated by the kernel - before being used. - Such checking ensures the integrity of the system. - All parameters passed into the kernel are copied into the - kernel's address space, - to ensure that validated parameters are not changed - as a side effect of the system call. 
- System-call results are returned by the kernel, - either in hardware registers or by their values - being copied to user-specified memory addresses. - Like parameters passed into the kernel, - addresses used for - the return of results must be validated to ensure that they are - part of an application's address space. - If the kernel encounters an error while processing a system call, - it returns an error code to the user. - For the - C programming language, this error code - is stored in the global variable - <emphasis>errno</emphasis>, - and the function that executed the system call returns the value -1.</para> - - <para>User applications and the kernel operate - independently of each other. - 4.4BSD does not store I/O control blocks or other - operating-system-related - data structures in the application's address space. - Each user-level application is provided an independent address space in - which it executes. - The kernel makes most state changes, - such as suspending a process while another is running, - invisible to the processes involved.</para> - </sect1> - - <sect1> - <title>Process Management</title> - - <para>4.4BSD supports a multitasking environment. - Each task or thread of execution is termed a - <emphasis>process</emphasis>. - The - <emphasis>context</emphasis> - of a 4.4BSD process consists of user-level state, - including the contents of its address space - and the run-time environment, and kernel-level state, - which includes - scheduling parameters, - resource controls, - and identification information. - The context includes everything - used by the kernel in providing services for the process. - Users can create processes, control the processes' execution, - and receive notification when the processes' execution status changes. - Every process is assigned a unique value, termed a - <emphasis>process identifier</emphasis> - (PID). 
- This value is used by the kernel to identify a process when reporting - status changes to a user, and by a user when referencing a process - in a system call.</para> - - <para>The kernel creates a process by duplicating the context of another process. - The new process is termed a - <emphasis>child process</emphasis> - of the original - <emphasis>parent process</emphasis>. - The context duplicated in process creation includes - both the user-level execution state of the process and - the process's system state managed by the kernel. - Important components of the kernel state are described in - Chapter 4.</para> - - <figure id="fig-process-lifecycle"> - <title>Process lifecycle</title> - - <mediaobject> - <imageobject> - <imagedata fileref="fig1" format="EPS"> - </imageobject> - - <textobject> - <literallayout class="monospaced">+----------------+ wait +----------------+ -| parent process |--------------------------------->| parent process |---> -+----------------+ +----------------+ - | ^ - | fork | - V | -+----------------+ execve +----------------+ wait +----------------+ -| child process |------->| child process |------->| zombie process | -+----------------+ +----------------+ +----------------+</literallayout> - </textobject> - - <textobject> - <phrase>Process-management system calls</phrase> - </textobject> - </mediaobject> - </figure> - - <para>The process lifecycle is depicted in <xref linkend="fig-process-lifecycle">. - A process may create a new process that is a copy of the original - by using the - <emphasis>fork</emphasis> - system call. - The - <emphasis>fork</emphasis> - call returns twice: once in the parent process, where - the return value is the process identifier of the child, - and once in the child process, where the return value is 0. - The parent-child relationship induces a hierarchical structure on - the set of processes in the system. 
- The new process shares all its parent's resources, such as - file descriptors, signal-handling status, and memory layout.</para> - - <para>Although there are occasions when the new process is intended - to be a copy of the parent, - the loading and execution of a different program is - a more useful and typical action. - A process can overlay itself with the memory image of another program, - passing to the newly created image a set of parameters, - using the system call - <emphasis>execve</emphasis>. - One parameter is the name of a file whose contents are - in a format recognized by the system -- either a binary-executable file - or a file that causes - the execution of a specified interpreter program to process its contents.</para> - - <para>A process may terminate by executing an - <emphasis>exit</emphasis> - system call, sending 8 bits of - exit status to its parent. - If a process wants to communicate more than a single byte of - information with its parent, - it must either set up an interprocess-communication channel - using pipes or sockets, - or use an intermediate file. - Interprocess communication is discussed extensively in - Chapter 11.</para> - - <para>A process can suspend execution until any of its child processes terminate - using the - <emphasis>wait</emphasis> - system call, which returns the - PID - and - exit status of the terminated child process. - A parent process can arrange to be notified by a signal when - a child process exits or terminates abnormally. - Using the - <emphasis>wait4</emphasis> - system call, the parent can retrieve information about - the event that caused termination of the child process - and about resources consumed by the process during its lifetime. 
- If a process is orphaned because its parent exits before it is finished, - then the kernel arranges for the child's exit status to be passed back - to a special system process - <!-- FIXME, the emphasis is wrong --> - (<emphasis>init</emphasis>: - see Sections 3.1 and 14.6).</para> - - <para>The details of how the kernel creates and destroys processes are given in - Chapter 5.</para> - - <para>Processes are scheduled for execution according to a - <emphasis>process-priority</emphasis> - parameter. - This priority is managed by a kernel-based scheduling algorithm. - Users can influence the scheduling of a process by specifying - a parameter - (<emphasis>nice</emphasis>) - that weights the overall scheduling priority, - but are still obligated to share the underlying - CPU - resources according to the kernel's scheduling policy.</para> - - <sect2> - <title>Signals</title> - - <para>The system defines a set of - <emphasis>signals</emphasis> - that may be delivered to a process. - Signals in 4.4BSD are modeled after hardware interrupts. - A process may specify a user-level subroutine to be a - <emphasis>handler</emphasis> - to which a signal should be delivered. - When a signal is generated, - it is blocked from further occurrence while it is being - <emphasis>caught</emphasis> - by the handler. - Catching a signal involves saving the current process context - and building a new one in which to run the handler. - The signal is then delivered to the handler, which can either abort - the process or return to the executing process - (perhaps after setting a global variable). - If the handler returns, the signal is unblocked - and can be generated (and caught) again.</para> - - <para>Alternatively, a process may specify that a signal is to be - <emphasis>ignored</emphasis>, - or that a default action, as determined by the kernel, is to be taken. - The default action of certain signals is to terminate the process. 
- This termination may be accompanied by creation of a - <emphasis>core file</emphasis> - that contains the current memory image of the process for use - in postmortem debugging.</para> - - <para>Some signals cannot be caught or ignored. - These signals include - <emphasis>SIGKILL</emphasis>, - which kills runaway processes, - and the - job-control signal - <emphasis>SIGSTOP</emphasis>.</para> - - <para>A process may choose to have signals delivered on a - special stack so that sophisticated software stack manipulations - are possible. - For example, a language supporting - coroutines needs to provide a stack for each coroutine. - The language run-time system can allocate these stacks - by dividing up the single stack provided by 4.4BSD. - If the kernel does not support a separate signal stack, - the space allocated for each coroutine must be expanded by the - amount of space required to catch a signal.</para> - - <para>All signals have the same <emphasis>priority</emphasis>. - If multiple signals are pending simultaneously, the order in which - signals are delivered to a process is implementation specific. - Signal handlers execute with the signal that caused their - invocation to be blocked, but other signals may yet occur. - Mechanisms are provided so that processes can protect critical sections - of code against the occurrence of specified signals.</para> - - <para>The detailed design and implementation of signals is described in - Section 4.7.</para> - </sect2> - - <sect2> - <title>Process Groups and Sessions</title> - - <para>Processes are organized into - <emphasis>process groups</emphasis>. - Process groups are used to control access to terminals - and to provide a means of distributing signals to collections of - related processes. - A process inherits its process group from its parent process. - Mechanisms are provided by the kernel to allow a process to - alter its process group or the process group of its descendents. 
- Creating a new process group is easy; - the value of a new process group is ordinarily the - process identifier of the creating process.</para> - - <para>The group of processes in a process group is sometimes - referred to as a - <emphasis>job</emphasis> - and is manipulated by high-level system software, such as the shell. - A common kind of job created by a shell is a - <emphasis>pipeline</emphasis> - of several processes connected by pipes, such that the output of the first - process is the input of the second, the output of the second is the - input of the third, and so forth. - The shell creates such a job by forking a - process for each stage of the pipeline, - then putting all those processes into a separate process group.</para> - - <para>A user process can send a signal to each process in - a process group, as well as to a single process. - A process in a specific process group may receive - software interrupts affecting the group, causing the group to - suspend or resume execution, or to be interrupted or terminated.</para> - - <para>A terminal has a process-group identifier assigned to it. - This identifier is normally set to the identifier of a process group - associated with the terminal. - A job-control shell may create a number of process groups - associated with the same terminal; the terminal is the - <emphasis>controlling terminal</emphasis> - for each process in these groups. - A process may read from a descriptor for its controlling terminal - only if the terminal's process-group identifier - matches that of the process. - If the identifiers do not match, - the process will be blocked if it attempts to read from the terminal. - By changing the process-group identifier of the terminal, - a shell can arbitrate a terminal among several different jobs. 
- This arbitration is called - <emphasis>job control</emphasis> - and is described, with process groups, in - Section 4.8.</para> - - <para>Just as a set of related processes can be collected into a process group, - a set of process groups can be collected into a - <emphasis>session</emphasis>. - The main uses for sessions are to create an isolated environment for a - daemon process and its children, - and to collect together a user's login shell - and the jobs that that shell spawns.</para> - </sect2> - </sect1> - - <sect1> - <title>Memory Management</title> - - <para>Each process has its own private address space. - The address space is initially divided into three logical segments: - <emphasis>text</emphasis>, - <emphasis>data</emphasis>, - and - <emphasis>stack</emphasis>. - The text segment is read-only and contains the machine - instructions of a program. - The data and stack segments are both readable and writable. - The data segment contains the - initialized and uninitialized data portions of a program, whereas - the stack segment holds the application's run-time stack. - On most machines, the stack segment is extended automatically - by the kernel as the process executes. - A process can expand or contract its data segment by making a system call, - whereas a process can change the size of its text segment - only when the segment's contents are overlaid with data from the - filesystem, or when debugging takes place. - The initial contents of the segments of a child process - are duplicates of the segments of a parent process.</para> - - <para>The entire contents of a process address space do not need to be resident - for a process to execute. - If a process references a part of its address space that is not - resident in main memory, the system - <emphasis>pages</emphasis> - the necessary information into memory. - When system resources are scarce, the system uses a two-level - approach to maintain available resources. 
- If a modest amount of memory is available, the system will take - memory resources away from processes if these resources have not been - used recently. - Should there be a severe resource shortage, the system will resort to - <emphasis>swapping</emphasis> - the entire context of a process to secondary storage. - The - <emphasis>demand paging</emphasis> - and - <emphasis>swapping</emphasis> - done by the system are effectively transparent to processes. - A process may, however, advise the system - about expected future memory utilization as a performance aid.</para> - - <sect2> - <title>BSD Memory-Management Design Decisions</title> - - <para>The support of large sparse address spaces, mapped files, - and shared memory was a requirement for 4.2BSD. - An interface was specified, called - <emphasis>mmap</emphasis>, - that allowed unrelated processes to request a shared - mapping of a file into their address spaces. - If multiple processes mapped the same file into their address spaces, - changes to the file's portion of an address space - by one process would be reflected - in the area mapped by the other processes, as well as in the file itself. - Ultimately, 4.2BSD was shipped without the - <emphasis>mmap</emphasis> - interface, because of pressure to make other features, such as - networking, available.</para> - - <para>Further development of the - <emphasis>mmap</emphasis> - interface continued during the work on 4.3BSD. - Over 40 companies and research groups participated - in the discussions leading to the revised architecture - that was described in the Berkeley Software Architecture Manual - <xref linkend="biblio-mckusick-1">. - Several of the companies have implemented the revised interface - <xref linkend="biblio-gingell">.</para> - - <para>Once again, time pressure prevented 4.3BSD from providing an - implementation of the interface. 
- Although the interface could have been built into the existing - 4.3BSD virtual-memory system, - the developers decided not to put it in because - that implementation was nearly 10 years old. - Furthermore, the original virtual-memory design was based - on the assumption that computer - memories were small and expensive, whereas disks were - locally connected, fast, large, and inexpensive. - Thus, the virtual-memory system was designed to be frugal - with its use of memory at the expense of generating extra disk traffic. - In addition, the - 4.3BSD implementation was riddled with - VAX - memory-management hardware dependencies that impeded its portability - to other computer architectures. - Finally, the virtual-memory system was not designed - to support the tightly coupled - multiprocessors that are becoming - increasingly common and important today.</para> - - <para>Attempts to improve the old implementation incrementally - seemed doomed to failure. - A completely new design, - on the other hand, - could take advantage of large memories, - conserve disk transfers, - and have the potential to run on multiprocessors. - Consequently, the virtual-memory system was completely replaced in 4.4BSD. - The 4.4BSD virtual-memory system - is based on the Mach 2.0 VM system - <xref linkend="biblio-tevanian">, - with updates from Mach 2.5 and Mach 3.0. - It features - efficient support for sharing, - a clean separation of machine-independent and machine-dependent features, - and (currently unused) multiprocessor support. - Processes can map files anywhere in their address space. - They can share parts of their address space by - doing a shared mapping of the same file. - Changes made by one process are visible in the address space of - the other process, and also are written back to the file itself. 
- Processes can also request private mappings of a file, which prevents - any changes that they make from being visible to other processes - mapping the file or being written back to the file itself.</para> - - <para>Another issue with the virtual-memory system is the way that - information is passed into the kernel when a system call is made. - 4.4BSD always copies data from the process address space - into a buffer in the kernel. - For read or write operations - that are transferring large quantities of data, - doing the copy can be time consuming. - An alternative to doing the copying is to remap the - process memory into the kernel. - The 4.4BSD kernel always copies the data for several reasons:</para> - - <itemizedlist> - <listitem> - <para>Often, the user data are not page aligned and are not a multiple of - the hardware page length.</para> - </listitem> - - <listitem> - <para>If the page is taken away from the process, - it will no longer be able to reference that page. - Some programs depend on the data remaining in the - buffer even after those data have been written.</para> - </listitem> - - <listitem> - <para>If the process is allowed to keep a copy of the page - (as it is in current 4.4BSD semantics), - the page must be made - <emphasis>copy-on-write</emphasis>. - A copy-on-write page is one that is protected against being written - by being made read-only. - If the process attempts to modify the page, - the kernel gets a write fault. - The kernel then makes a copy of the page that the process can modify. - Unfortunately, the typical process will immediately - try to write new data to its output buffer, - forcing the data to be copied anyway.</para> - </listitem> - - <listitem> - <para>When pages are remapped to new virtual-memory addresses, - most memory-management hardware requires that the hardware - address-translation cache be purged selectively. - The cache purges are often slow. 
- The net effect is that remapping is slower than - copying for blocks of data smaller than 4 to 8 Kbytes.</para> - </listitem> - </itemizedlist> - - <para>The biggest incentives for memory mapping are the needs to - access big files and to pass large quantities of data - between processes. - The - <emphasis>mmap</emphasis> - interface provides a way for both of these tasks - to be done without copying.</para> - </sect2> - - <sect2> - <title>Memory Management Inside the Kernel</title> - - <para>The kernel often allocates memory that is - needed for only the duration of a single system call. - In a user process, such short-term - memory would be allocated on the run-time stack. - Because the kernel has a limited run-time stack, - it is not feasible to allocate even moderate-sized blocks of memory on it. - Consequently, such memory must be allocated - through a more dynamic mechanism. - For example, - when the system must translate a pathname, - it must allocate a 1-Kbyte buffer to hold the name. - Other blocks of memory must be more persistent than a single system call, - and thus could not be allocated on the stack even if there were space. - An example is protocol-control blocks that remain throughout - the duration of a network connection.</para> - - <para>Demands for dynamic memory allocation in the kernel have increased - as more services have been added. - A generalized memory allocator reduces the complexity - of writing code inside the kernel. - Thus, the 4.4BSD kernel has a single memory allocator that can be - used by any part of the system. - It has an interface similar to the C library routines - <emphasis>malloc</emphasis> - and - <emphasis>free</emphasis> - that provide memory allocation to application programs - <xref linkend="biblio-mckusick-2">. - Like the C library interface, - the allocation routine takes a parameter specifying the - size of memory that is needed. 
- The range of sizes for memory requests is not constrained; - however, physical memory is allocated and is not paged. - The free routine takes a pointer to the storage being freed, - but does not require the size - of the piece of memory being freed.</para> - </sect2> - </sect1> - - <sect1> - <title>I/O System</title> - - <para>The basic model of the UNIX - I/O system is a sequence of bytes - that can be accessed either randomly or sequentially. - There are no - <emphasis>access methods</emphasis> - and no - <emphasis>control blocks</emphasis> - in a typical UNIX user process.</para> - - <para>Different programs expect various levels of structure, - but the kernel does not impose structure on I/O. - For instance, the convention for text files is lines of - ASCII - characters separated by a single newline character - (the - ASCII - line-feed character), - but the kernel knows nothing about this convention. - For the purposes of most programs, - the model is further simplified to being a stream of data bytes, - or an - <emphasis>I/O stream</emphasis>. - It is this single common data form that makes the - characteristic UNIX tool-based approach work - <xref linkend="biblio-kernighan">. - An I/O stream from one program can be fed as input - to almost any other program. - (This kind of traditional UNIX - I/O stream should not be confused with the - Eighth Edition stream I/O system or with the - System V, Release 3 - STREAMS, - both of which can be accessed as traditional I/O streams.)</para> - - <sect2> - <title>Descriptors and I/O</title> - - <para>UNIX processes use - <emphasis>descriptors</emphasis> - to reference I/O streams. - Descriptors are small unsigned integers obtained from the - <emphasis>open</emphasis> - and - <emphasis>socket</emphasis> - system calls. - The - <emphasis>open</emphasis> - system call takes as arguments the name of a file and - a permission mode to - specify whether the file should be open for reading or for writing, - or for both. 
- This system call also can be used to create a new, empty file. - A - <emphasis>read</emphasis> - or - <emphasis>write</emphasis> - system call can be applied to a descriptor to transfer data. - The - <emphasis>close</emphasis> - system call can be used to deallocate any descriptor.</para> - - <para>Descriptors represent underlying objects supported by the kernel, - and are created by system calls specific to the type of object. - In 4.4BSD, three kinds of objects can be represented by descriptors: - files, pipes, and sockets.</para> - - <itemizedlist> - <listitem> - <para>A - <emphasis>file</emphasis> - is a linear array of bytes with at least one name. - A file exists until all its names are deleted explicitly - and no process holds a descriptor for it. - A process acquires a descriptor for a file - by opening that file's name with the - <emphasis>open</emphasis> - system call. - I/O devices are accessed as files.</para> - </listitem> - - <listitem> - <para>A - <emphasis>pipe</emphasis> - is a linear array of bytes, as is a file, but it is used solely - as an I/O stream, and it is unidirectional. - It also has no name, - and thus cannot be opened with - <emphasis>open</emphasis>. - Instead, it is created by the - <emphasis>pipe</emphasis> - system call, which returns two descriptors, - one of which accepts input that is sent to the other descriptor reliably, - without duplication, and in order. - The system also supports a named pipe or - FIFO. - A - FIFO - has properties identical to a pipe, except that it appears - in the filesystem; - thus, it can be opened using the - <emphasis>open</emphasis> - system call. - Two processes that wish to communicate each open the - FIFO: - One opens it for reading, the other for writing.</para> - </listitem> - - <listitem> - <para>A - <emphasis>socket</emphasis> - is a transient object that is used for - interprocess communication; - it exists only as long as some process holds a descriptor - referring to it. 
- A socket is created by the - <emphasis>socket</emphasis> - system call, which returns a descriptor for it. - There are different kinds of sockets that support various communication - semantics, such as reliable delivery of data, preservation of - message ordering, and preservation of message boundaries.</para> - </listitem> - </itemizedlist> - - <para>In systems before 4.2BSD, pipes were implemented using the filesystem; - when sockets were introduced in 4.2BSD, - pipes were reimplemented as sockets.</para> - - <para>The kernel keeps for each process a - <emphasis>descriptor table</emphasis>, - which is a table that the kernel uses - to translate the external representation - of a descriptor into an internal representation. - (The descriptor is merely an index into this table.) - The descriptor table of a process is inherited from that process's parent, - and thus access to the objects - to which the descriptors refer also is inherited. - The main ways that a process can obtain a descriptor are by - opening or creation of an object, - and by inheritance from the parent process. - In addition, socket - IPC - allows passing of descriptors in messages between unrelated processes - on the same machine.</para> - - <para>Every valid descriptor has an associated - <emphasis>file offset</emphasis> - in bytes from the beginning of the object. - Read and write operations start at this offset, which is - updated after each data transfer. - For objects that permit random access, - the file offset also may be set with the - <emphasis>lseek</emphasis> - system call. - Ordinary files permit random access, and some devices do, as well. - Pipes and sockets do not.</para> - - <para>When a process terminates, the kernel - reclaims all the descriptors that were in use by that process. 
- If the process was holding the final reference to an object, - the object's manager is notified so that it can do any - necessary cleanup actions, such as final deletion of a file - or deallocation of a socket.</para> - </sect2> - - <sect2> - <title>Descriptor Management</title> - - <para>Most processes expect three descriptors to be open already - when they start running. - These descriptors are 0, 1, 2, more commonly known as - <emphasis>standard input</emphasis>, - <emphasis>standard output</emphasis>, - and - <emphasis>standard error</emphasis>, - respectively. - Usually, all three are associated with the user's terminal - by the login process - (see - Section 14.6) - and are inherited through - <emphasis>fork</emphasis> - and - <emphasis>exec</emphasis> - by processes run by the user. - Thus, a program can read what the user types by reading standard - input, and the program can send output to the user's screen by - writing to standard output. - The standard error descriptor also is open for writing and is - used for error output, whereas standard output is used for ordinary output.</para> - - <para>These (and other) descriptors can be mapped to objects other than - the terminal; - such mapping is called - <emphasis>I/O redirection</emphasis>, - and all the standard shells permit users to do it. - The shell can direct the output of a program to a file - by closing descriptor 1 (standard output) and opening - the desired output file to produce a new descriptor 1. - It can similarly redirect standard input to come from a file - by closing descriptor 0 and opening the file.</para> - - <para>Pipes allow the output of one program to be input to another program - without rewriting or even relinking of either program. - Instead of descriptor 1 (standard output) - of the source program being set up to write to the terminal, - it is set up to be the input descriptor of a pipe. 
- Similarly, descriptor 0 (standard input) - of the sink program is set up to reference the output of the pipe, - instead of the terminal keyboard. - The resulting set of two processes and the connecting pipe is known as a - <emphasis>pipeline</emphasis>. - Pipelines can be arbitrarily long series of processes connected by pipes.</para> - - <para>The - <emphasis>open</emphasis>, - <emphasis>pipe</emphasis>, - and - <emphasis>socket</emphasis> - system calls produce new descriptors with the lowest unused number - usable for a descriptor. - For pipelines to work, - some mechanism must be provided to map such descriptors into 0 and 1. - The - <emphasis>dup</emphasis> - system call creates a copy of a descriptor that - points to the same file-table entry. - The new descriptor is also the lowest unused one, - but if the desired descriptor is closed first, - <emphasis>dup</emphasis> - can be used to do the desired mapping. - Care is required, however: If descriptor 1 is desired, - and descriptor 0 happens also to have been closed, descriptor 0 - will be the result. - To avoid this problem, the system provides the - <emphasis>dup2</emphasis> - system call; - it is like - <emphasis>dup</emphasis>, - but it takes an additional argument specifying - the number of the desired descriptor - (if the desired descriptor was already open, - <emphasis>dup2</emphasis> - closes it before reusing it).</para> - </sect2> - - <sect2> - <title>Devices</title> - - <para>Hardware devices have filenames, and may be - accessed by the user via the same system calls used for regular files. - The kernel can distinguish a - <emphasis>device special file</emphasis> - or - <emphasis>special file</emphasis>, - and can determine to what device it refers, - but most processes do not need to make this determination. - Terminals, printers, and tape drives are all accessed as though they - were streams of bytes, like 4.4BSD disk files. 
- Thus, device dependencies and peculiarities are kept in the kernel - as much as possible, and even in the kernel most of them are segregated - in the device drivers.</para> - - <para>Hardware devices can be categorized as either - <emphasis>structured</emphasis> - or - <emphasis>unstructured</emphasis>; - they are known as - <emphasis>block</emphasis> - or - <emphasis>character</emphasis> - devices, respectively. - Processes typically access devices through - <emphasis>special files</emphasis> - in the filesystem. - I/O operations to these files are handled by - kernel-resident software modules termed - <emphasis>device drivers</emphasis>. - Most network-communication hardware devices are accessible through only - the interprocess-communication facilities, - and do not have special files in the filesystem name space, - because the - <emphasis>raw-socket</emphasis> - interface provides a more natural interface than does a special file.</para> - - <para>Structured or block devices are typified by disks and magnetic tapes, - and include most random-access devices. - The kernel supports read-modify-write-type buffering actions - on block-oriented structured devices to allow the latter - to be read and written in a - totally random byte-addressed fashion, like regular files. - Filesystems are created on block devices.</para> - - <para>Unstructured devices are those devices that do not support a block - structure. - Familiar unstructured devices are communication lines, raster - plotters, and unbuffered magnetic tapes and disks. - Unstructured devices typically support large block I/O transfers.</para> - - <para>Unstructured files are called - <emphasis>character devices</emphasis> - because the first of these to be implemented were terminal device drivers. 
- The kernel interface to the driver for these devices proved convenient - for other devices that were not block structured.</para> - - <para>Device special files are created by the - <emphasis>mknod</emphasis> - system call. - There is an additional system call, - <emphasis>ioctl</emphasis>, - for manipulating the underlying device parameters of special files. - The operations that can be done differ for each device. - This system call allows the special characteristics of devices to - be accessed, rather than overloading the semantics of other system calls. - For example, there is an - <emphasis>ioctl</emphasis> - on a tape drive to write an end-of-tape mark, - instead of there being a special or modified version of - <emphasis>write</emphasis>.</para> - </sect2> - - <sect2> - <title>Socket IPC</title> - - <para>The 4.2BSD kernel introduced an - IPC - mechanism more flexible than pipes, based on - <emphasis>sockets</emphasis>. - A socket is an endpoint of communication referred to by - a descriptor, just like a file or a pipe. - Two processes can each create a socket, and then connect those - two endpoints to produce a reliable byte stream. - Once connected, the descriptors for the sockets can be read or written - by processes, just as the latter would do with a pipe. - The transparency of sockets allows the kernel to redirect the output - of one process to the input of another process residing on another machine. - A major difference between pipes and sockets is that - pipes require a common parent process to set up the - communications channel. - A connection between sockets can be set up by two unrelated processes, - possibly residing on different machines.</para> - - <para>System V provides local interprocess communication through - FIFOs - (also known as - <emphasis>named pipes</emphasis>). - FIFOs - appear as an object in the filesystem that unrelated - processes can open and send data through in the same - way as they would communicate through a pipe. 
- Thus, - FIFOs - do not require a common parent to set them up; - they can be connected after a pair of processes are up and running. - Unlike sockets, - FIFOs - can be used on only a local machine; - they cannot be used to communicate between processes on different machines. - FIFOs - are implemented in 4.4BSD only because they are required by the - POSIX.1 - standard. - Their functionality is a subset of the socket interface.</para> - - <para>The socket mechanism requires extensions to the traditional UNIX - I/O system calls to provide the associated naming and connection semantics. - Rather than overloading the existing interface, - the developers used the existing interfaces to the extent that - the latter worked without being changed, - and designed new interfaces to handle the added semantics. - The - <emphasis>read</emphasis> - and - <emphasis>write</emphasis> - system calls were used for byte-stream type connections, - but six new system calls were added - to allow sending and receiving addressed messages - such as network datagrams. - The system calls for writing messages include - <emphasis>send</emphasis>, - <emphasis>sendto</emphasis>, - and - <emphasis>sendmsg</emphasis>. - The system calls for reading messages include - <emphasis>recv</emphasis>, - <emphasis>recvfrom</emphasis>, - and - <emphasis>recvmsg</emphasis>. - In retrospect, the first two in each class are special cases of the others; - <emphasis>recvfrom</emphasis> - and - <emphasis>sendto</emphasis> - probably should have been added as library interfaces to - <emphasis>recvmsg</emphasis> - and - <emphasis>sendmsg</emphasis>, - respectively.</para> - </sect2> - - <sect2> - <title>Scatter/Gather I/O</title> - - <para>In addition to the traditional - <emphasis>read</emphasis> - and - <emphasis>write</emphasis> - system calls, 4.2BSD introduced the ability to do scatter/gather I/O. 
- Scatter input uses the - <emphasis>readv</emphasis> - system call to allow a single read - to be placed in several different buffers. - Conversely, the - <emphasis>writev</emphasis> - system call allows several different buffers - to be written in a single atomic write. - Instead of passing a single buffer and length parameter, as is done with - <emphasis>read</emphasis> - and - <emphasis>write</emphasis>, - the process passes in a pointer to an array of buffers and lengths, - along with a count describing the size of the array.</para> - - <para>This facility allows buffers in different parts of a process - address space to be written atomically, without the - need to copy them to a single contiguous buffer. - Atomic writes are necessary in the case where the underlying - abstraction is record based, such as tape drives that output a - tape block on each write request. - It is also convenient to be able to read a single request into - several different buffers (such as a record header into one place - and the data into another). - Although an application can simulate the ability to scatter data - by reading the data into a large buffer and then copying the pieces - to their intended destinations, - the cost of memory-to-memory copying in such cases often - would more than double the running time of the affected application.</para> - - <para>Just as - <emphasis>send</emphasis> - and - <emphasis>recv</emphasis> - could have been implemented as library interfaces to - <emphasis>sendto</emphasis> - and - <emphasis>recvfrom</emphasis>, - it also would have been possible to simulate - <emphasis>read</emphasis> - with - <emphasis>readv</emphasis> - and - <emphasis>write</emphasis> - with - <emphasis>writev</emphasis>. 
- However, - <emphasis>read</emphasis> - and - <emphasis>write</emphasis> - are used so much more frequently that the added cost - of simulating them would not have been worthwhile.</para> - </sect2> - - <sect2> - <title>Multiple Filesystem Support</title> - - <para>With the expansion of network computing, - it became desirable to support both local and remote filesystems. - To simplify the support of multiple filesystems, - the developers added a new virtual node, or - <emphasis>vnode</emphasis>, - interface to the kernel. - The set of operations exported from the vnode interface - appears much like the filesystem operations previously supported - by the local filesystem. - However, they may be supported by a wide range of filesystem types:</para> - - <itemizedlist> - <listitem> - <para>Local disk-based filesystems</para> - </listitem> - - <listitem> - <para>Files imported using a variety of remote filesystem protocols</para> - </listitem> - - <listitem> - <para>Read-only - CD-ROM - filesystems</para> - </listitem> - - <listitem> - <para>Filesystems providing special-purpose interfaces -- for example, the - <filename>/proc</filename> - filesystem</para> - </listitem> - </itemizedlist> - - <para>A few variants of 4.4BSD, such as FreeBSD, - allow filesystems to be loaded dynamically - when the filesystems are first referenced by the - <emphasis>mount</emphasis> - system call. - The vnode interface is described in - Section 6.5; - its ancillary support routines are described in - Section 6.6; - several of the special-purpose filesystems are described in - Section 6.7.</para> - </sect2> - </sect1> - - <sect1> - <title>Filesystems</title> - - <para>A regular file is a linear array of bytes, - and can be read and written starting at any byte in the file. - The kernel distinguishes no record boundaries in regular files, although - many programs recognize line-feed characters as distinguishing - the ends of lines, and other programs may impose other structure. 
- No system-related information about a file is kept in the file itself, - but the filesystem stores a small amount of ownership, protection, - and usage information with each file.</para> - - <para>A - <emphasis>filename</emphasis> - component is a string of up to 255 characters. - These filenames are stored in a type of file called a - <emphasis>directory</emphasis>. - The information in a directory about a file is called a - <emphasis>directory entry</emphasis> - and includes, in addition to the filename, - a pointer to the file itself. - Directory entries may refer to other directories, as well as to plain files. - A hierarchy of directories and files is thus formed, and is called a - <emphasis>filesystem</emphasis>; - a small one is shown in <xref linkend="fig-small-fs">.</para> - - <figure id="fig-small-fs"> - <title>A small filesystem</title> - - <mediaobject> - <imageobject> - <imagedata fileref="fig2" format="EPS"> - </imageobject> - - <textobject> - <literallayout class="monospaced"> +-------+ - | | - +-------+ - / \ - usr / \ vmunix - |/ \| - +-------+ +-------+ - | | | | - +-------+ +-------+ - / | \ - staff / | \ bin - |/ | tmp \| - +-------+ V +-------+ - | | +-------+ | | - +-------+ | | +-------+ - / | \ +-------+ / | \ - mckusick / | \| |/ | \ ls - |/ | karels | vi \| -+-------+ V V +-------+ -| | +-------+ +-------+ | | -+-------+ | | | | +-------+ - +-------+ +-------+</literallayout> - </textobject> - - <textobject> - <phrase>A small filesystem tree</phrase> - </textobject> - </mediaobject> - </figure> - - <para>Directories may contain subdirectories, and there is no inherent - limit on the depth to which directories may be nested. - To protect the consistency of the filesystem, the kernel - does not permit processes to write directly into directories. 
- A filesystem may include not only plain files and directories, - but also references to other objects, such as devices and sockets.</para> - - <para>The filesystem forms a tree, the beginning of which is the - <emphasis>root directory</emphasis>, - sometimes referred to by the name - <emphasis>slash</emphasis>, - spelled with a single solidus character (/). - The root directory contains files; in our example in <xref linkend="fig-small-fs">, it contains - <filename>vmunix</filename>, - a copy of the kernel-executable object file. - It also contains directories; in this example, it contains the - <filename>usr</filename> - directory. - Within the - <filename>usr</filename> - directory is the - <filename>bin</filename> - directory, which mostly contains executable object code of programs, - such as the files - <!-- FIXME --> - <filename>ls</filename> - and - <filename>vi</filename>.</para> - - <para>A process identifies a file by specifying that file's - <emphasis>pathname</emphasis>, - which is a string composed of zero or more - filenames separated by slash (/) characters. - The kernel associates two directories with each process for use - in interpreting pathnames. - A process's - <emphasis>root directory</emphasis> - is the topmost point in the filesystem that the process can access; - it is ordinarily set to the root directory of the entire filesystem. - A pathname beginning with a slash is called an - <emphasis>absolute pathname</emphasis>, - and is interpreted by the kernel starting with the process's root directory.</para> - - <para>A pathname that does not begin with a slash is called a - <emphasis>relative pathname</emphasis>, - and is interpreted relative to the - <emphasis>current working directory</emphasis> - of the process. - (This directory also is known by the shorter names - <emphasis>current directory</emphasis> - or - <emphasis>working directory</emphasis>.) 
- The current directory itself may be referred to directly by the name - <emphasis>dot</emphasis>, - spelled with a single period - (<filename>.</filename>). - The filename - <emphasis>dot-dot</emphasis> - (<filename>..</filename>) - refers to a directory's parent directory. - The root directory is its own parent.</para> - - <para>A process may set its root directory with the - <emphasis>chroot</emphasis> - system call, - and its current directory with the - <emphasis>chdir</emphasis> - system call. - Any process may do - <emphasis>chdir</emphasis> - at any time, but - <emphasis>chroot</emphasis> - is permitted only to a process with superuser privileges. - <emphasis>Chroot</emphasis> - is normally used to set up restricted access to the system.</para> - - <para>Using the filesystem shown in <xref linkend="fig-small-fs">, - if a process has the root of the filesystem as its root directory, and has - <filename>/usr</filename> - as its current directory, it can refer to the file - <filename>vi</filename> - either from the root with the absolute pathname - <filename>/usr/bin/vi</filename>, - or from its current directory with the relative pathname - <filename>bin/vi</filename>.</para> - - <para>System utilities and databases are kept in certain well-known directories. - Part of the well-defined hierarchy includes a directory that contains the - <emphasis>home directory</emphasis> - for each user -- for example, - <filename>/usr/staff/mckusick</filename> - and - <filename>/usr/staff/karels</filename> - in <xref linkend="fig-small-fs">. - When users log in, - the current working directory of their shell is set to the - home directory. - Within their home directories, - users can create directories as easily as they can regular files. - Thus, a user can build arbitrarily complex subhierarchies.</para> - - <para>The user usually knows of only one filesystem, but the system may - know that this one virtual filesystem - is really composed of several physical - filesystems, each on a different device. 
- A physical filesystem may not span multiple hardware devices. - Since most physical disk devices are divided into several logical devices, - there may be more than one filesystem per physical device, - but there will be no more than one per logical device. - One filesystem -- the filesystem that - anchors all absolute pathnames -- is called the - <emphasis>root filesystem</emphasis>, - and is always available. - Others may be mounted; - that is, they may be integrated into the - directory hierarchy of the root filesystem. - References to a directory that has a filesystem mounted on it - are converted transparently by the kernel - into references to the root directory of the mounted filesystem.</para> - - <para>The - <emphasis>link</emphasis> - system call takes the name of an existing file and another name - to create for that file. - After a successful - <emphasis>link</emphasis>, - the file can be accessed by either filename. - A filename can be removed with the - <emphasis>unlink</emphasis> - system call. - When the final name for a file is removed (and the final process that - has the file open closes it), the file is deleted.</para> - - <para>Files are organized hierarchically in - <emphasis>directories</emphasis>. - A directory is a type of file, - but, in contrast to regular files, - a directory has a structure imposed on it by the system. - A process can read a directory as it would an ordinary file, - but only the kernel is permitted to modify a directory. - Directories are created by the - <emphasis>mkdir</emphasis> - system call and are removed by the - <emphasis>rmdir</emphasis> - system call. - Before 4.2BSD, the work of - <emphasis>mkdir</emphasis> - and - <emphasis>rmdir</emphasis> - was done by a series of - <emphasis>link</emphasis> - and - <emphasis>unlink</emphasis> - system calls. 
- There were three reasons for adding system calls - explicitly to create and delete directories:</para> - - <orderedlist> - <listitem> - <para>The operation could be made atomic. - If the system crashed, - the directory would not be left half-constructed, - as could happen when a series of link operations was used.</para> - </listitem> - <listitem> - <para>When a - networked filesystem is being run, - the creation and deletion of files and directories need to be - specified atomically so that they can be serialized.</para> - </listitem> - <listitem> - <para>When supporting non-UNIX filesystems, such as an - MS-DOS - filesystem, on another partition of the disk, - the other filesystem may not support link operations. - Although other filesystems might support the concept of directories, - they probably would not create and delete the directories with links, - as the UNIX filesystem does. - Consequently, they could create and delete directories only - if explicit directory create and delete requests were presented.</para> - </listitem> - </orderedlist> - - <para>The - <emphasis>chown</emphasis> - system call sets the owner and group of a file, and - <emphasis>chmod</emphasis> - changes protection attributes. - <emphasis>Stat</emphasis> - applied to a filename can be used to read back such properties of a file. - The - <emphasis>fchown</emphasis>, - <emphasis>fchmod</emphasis>, - and - <emphasis>fstat</emphasis> - system calls are applied to a descriptor, instead of - to a filename, to do the same set of operations. - The - <emphasis>rename</emphasis> - system call can be used to give a file a new name in the filesystem, - replacing one of the file's old names. - Like the directory-creation and directory-deletion operations, the - <emphasis>rename</emphasis> - system call was added to 4.2BSD - to provide atomicity to name changes in the local filesystem.
- Later, it proved useful explicitly to - export renaming operations to foreign filesystems and over the network.</para> - - <para>The - <emphasis>truncate</emphasis> - system call was added to 4.2BSD to allow files to be shortened - to an arbitrary offset. - The call was added primarily in support of the Fortran - run-time library, - which has the semantics such that the end of a random-access - file is set to be wherever the program most recently accessed that file. - Without the - <emphasis>truncate</emphasis> - system call, the only way to shorten a file was to - copy the part that was desired to a new file, to delete the old file, - then to rename the copy to the original name. - As well as this algorithm being slow, - the library could potentially fail on a full filesystem.</para> - - <para>Once the filesystem had the ability to shorten files, - the kernel took advantage of that ability - to shorten large empty directories. - The advantage of shortening empty directories is that it reduces the - time spent in the kernel searching them - when names are being created or deleted.</para> - - <para>Newly created files are assigned the user identifier of the process - that created them and the group identifier of the directory - in which they were created. - A three-level access-control mechanism is provided for - the protection of files. - These three levels specify the accessibility of a file to</para> - - <orderedlist> - <listitem> - <para>The user who owns the file</para> - </listitem> - <listitem> - <para>The group that owns the file</para> - </listitem> - <listitem> - <para>Everyone else</para> - </listitem> - </orderedlist> - - <para>Each level of access has separate indicators for read permission, - write permission, and execute permission.</para> - - <para>Files are created with zero length, and may grow when they are written. 
- While a file is open, the system maintains a pointer into - the file indicating the current location in - the file associated with the descriptor. - This pointer can be moved about in the file in a random-access fashion. - Processes sharing a file descriptor through a - <emphasis>fork</emphasis> - or - <emphasis>dup</emphasis> - system call share the current location pointer. - Descriptors created by separate - <emphasis>open</emphasis> - system calls have separate current location pointers. - Files may have - <emphasis>holes</emphasis> - in them. - Holes are void areas in the linear extent of the file where data have - never been written. - A process can create these holes by positioning - the pointer past the current end-of-file and writing. - When read, holes are treated by the system as zero-valued bytes.</para> - - <para>Earlier UNIX systems had a limit of 14 characters per filename component. - This limitation was often a problem. - For example, - in addition to the natural desire of users - to give files long descriptive names, - a common way of forming filenames is as - <filename><replaceable>basename</replaceable>.<replaceable>extension</replaceable></filename>, - where the extension (indicating the kind of file, such as - <literal>.c</literal> - for C source or - <literal>.o</literal> - for intermediate binary object) - is one to three characters, - leaving 10 to 12 characters for the basename. - Source-code-control systems and editors usually take up another - two characters, either as a prefix or a suffix, for their purposes, - leaving eight to 10 characters. - It is easy to use 10 or 12 characters in a single - English word as a basename (e.g., ``multiplexer'').</para> - - <para>It is possible to keep within these limits, - but it is inconvenient or even dangerous, because other UNIX - systems accept strings longer than the limit when creating files, - but then - <emphasis>truncate</emphasis> - to the limit.
- A C language source file named - <filename>multiplexer.c</filename> - (already 13 characters) might have a source-code-control file - with - <literal>s.</literal> - prepended, producing a filename - <filename>s.multiplexer</filename> - that is indistinguishable from the source-code-control file for - <filename>multiplexer.ms</filename>, - a file containing - <!-- FIXME --> - <literal>troff</literal> - source for documentation for the C program. - The contents of the two original files could easily get confused - with no warning from the source-code-control system. - Careful coding can detect this problem, but the - long filenames - first introduced in 4.2BSD practically eliminate it.</para> - </sect1> - - <sect1> - <title>Filestores</title> - - <para>The operations defined for local filesystems are divided into two parts. - Common to all local filesystems are hierarchical naming, - locking, quotas, attribute management, and protection. - These features are independent of how the data will be stored. - 4.4BSD has a single implementation to provide these semantics.</para> - - <para>The other part of the local filesystem is the organization - and management of the data on the storage media. - Laying out the contents of files on the storage media is - the responsibility of the filestore. - 4.4BSD supports three different filestore layouts:</para> - - <itemizedlist> - <listitem> - <para>The traditional Berkeley Fast Filesystem</para> - </listitem> - <listitem> - <para>The log-structured filesystem, - based on the Sprite operating-system design - <xref linkend="biblio-rosenblum"></para> - </listitem> - <listitem> - <para>A memory-based filesystem</para> - </listitem> - </itemizedlist> - - <para>Although the organizations of these filestores are completely different, - these differences are indistinguishable - to the processes using the filestores.</para> - - <para>The Fast Filesystem organizes data into cylinder groups. 
- Files that are likely to be accessed together, - based on their locations in the filesystem hierarchy, - are stored in the same cylinder group. - Files that are not expected to be accessed together are moved into - different cylinder groups. - Thus, files written at the same time may be placed far apart on the - disk.</para> - - <para>The log-structured filesystem organizes data as a log. - All data being written at any point in time are gathered together, - and are written at the same disk location. - Data are never overwritten; - instead, a new copy of the file is written that replaces the old one. - The old files are reclaimed by a garbage-collection process that runs - when the filesystem becomes full and additional free space is needed.</para> - - <para>The memory-based filesystem is designed to store data in virtual memory. - It is used for filesystems that need to support - fast but temporary data, such as - <filename>/tmp</filename>. - The goal of the memory-based filesystem is to keep - the storage packed as compactly as possible to minimize - the usage of virtual-memory resources.</para> - </sect1> - - <sect1> - <title>Network Filesystem</title> - - <para>Initially, networking was used - to transfer data from one machine to another. - Later, it evolved to allow users to log in remotely to another machine. - The next logical step was to bring the data to the user, - instead of having the user go to the data -- - and network filesystems were born. - Users working locally - do not experience the network delays on each keystroke, - so they have a more responsive environment.</para> - - <para>Bringing the filesystem to a local machine was among the first - of the major client-server applications. - The - <emphasis>server</emphasis> - is the remote machine that exports one or more of its filesystems. - The - <emphasis>client</emphasis> - is the local machine that imports those filesystems.
- From the local client's point of view, - a remotely mounted filesystem appears in the file-tree name space - just like any other locally mounted filesystem. - Local clients can change into directories on the remote filesystem, - and can read, write, and execute binaries within that remote filesystem - identically to the way that they can do these operations - on a local filesystem.</para> - - <para>When the local client does an operation on a remote filesystem, - the request is packaged and is sent to the server. - The server does the requested operation and - returns either the requested information or an error - indicating why the request was denied. - To get reasonable performance, - the client must cache frequently accessed data. - The complexity of remote filesystems lies in maintaining cache - consistency between the server and its many clients.</para> - - <para>Although many remote-filesystem protocols - have been developed over the years, - the most pervasive one in use among UNIX - systems is the Network Filesystem - (NFS), - whose protocol and most widely used implementation were - done by Sun Microsystems. - The 4.4BSD kernel supports the - NFS - protocol, although the implementation was done independently - from the protocol specification - <xref linkend="biblio-macklem">. - The - NFS - protocol is described in - Chapter 9. - </para> - </sect1> - - <sect1> - <title>Terminals</title> - - <para>Terminals support the standard system I/O operations, as well - as a collection of terminal-specific operations to control input-character - editing and output delays. - At the lowest level are the terminal device drivers that control - the hardware terminal ports. 
- Terminal input is handled according to the underlying communication - characteristics, such as baud rate, - and according to a set of software-controllable - parameters, such as parity checking.</para> - - <para>Layered above the terminal device drivers are line disciplines - that provide various degrees of character processing. - The default line discipline is selected when a port is being - used for an interactive login. - The line discipline is run in - <emphasis>canonical mode</emphasis>; - input is processed to provide standard line-oriented editing functions, - and input is presented to a process on a line-by-line basis.</para> - - - <para>Screen editors and programs that communicate with other computers - generally run in - <emphasis>noncanonical mode</emphasis> - (also commonly referred to as - <emphasis>raw mode</emphasis> - or - <emphasis>character-at-a-time mode</emphasis>). - In this mode, input is passed through to the reading process immediately - and without interpretation. - All special-character input processing is disabled, - no erase or other line editing processing is done, - and all characters are passed to the program - that is reading from the terminal.</para> - - - <para>It is possible to configure the terminal in thousands - of combinations between these two extremes. 
- For example, - a screen editor that wanted to receive user interrupts asynchronously - might enable the special characters that - generate signals and enable output flow control, - but otherwise run in noncanonical mode; - all other characters would be passed through to the process uninterpreted.</para> - - <para>On output, the terminal handler provides simple formatting services, - including</para> - - - <itemizedlist> - <listitem> - <para>Converting the line-feed character - to the two-character carriage-return-line-feed sequence</para> - </listitem> - - <listitem> - <para>Inserting delays after certain standard control characters</para> - </listitem> - - <listitem> - <para>Expanding tabs</para> - </listitem> - - <listitem> - <para>Displaying echoed nongraphic - ASCII - characters as a two-character sequence of the - form ``^C'' - (i.e., the - ASCII - caret character followed by the - ASCII - character that is the character's value offset from the - ASCII - ``@'' character).</para> - </listitem> - </itemizedlist> - - <para>Each of these formatting services can be disabled individually by - a process through control requests.</para> - - </sect1> - - <sect1> - <title>Interprocess Communication</title> - - <para>Interprocess communication in 4.4BSD is organized in - <emphasis>communication domains</emphasis>. - Domains currently supported include the - <emphasis>local domain</emphasis>, - for communication between processes executing on the same machine; the - <emphasis>internet domain</emphasis>, - for communication between processes using the - TCP/IP - protocol suite (perhaps within the Internet); the - ISO/OSI - protocol family for communication between sites required to run them; - and the - <emphasis>XNS domain</emphasis>, - for communication between processes using the - XEROX - Network Systems - (XNS) - protocols.</para> - - <para>Within a domain, communication takes place between communication - endpoints known as - <emphasis>sockets</emphasis>. 
- As mentioned in - Section 2.6, - the - <emphasis>socket</emphasis> - system call creates a socket and returns a descriptor; - other - IPC - system calls are described in - Chapter 11. - Each socket has a type that defines its communications semantics; - these semantics include properties such as reliability, ordering, - and prevention of duplication of messages.</para> - - <para>Each socket has associated with it a - <emphasis>communication protocol</emphasis>. - This protocol provides the semantics required - by the socket according to the latter's type. - Applications may request a specific protocol when creating a socket, or - may allow the system to select a protocol that is appropriate for the type - of socket being created.</para> - - <para>Sockets may have addresses bound to them. - The form and meaning of socket addresses are dependent on the - communication domain in which the socket is created. - Binding a name to a socket in the - local domain causes a file to be created in the filesystem.</para> - - <para>Normal data transmitted and received through sockets are untyped. - Data-representation issues are the responsibility of libraries built - on top of the interprocess-communication facilities. - In addition to transporting normal data, communication domains may - support the transmission and reception of specially typed data, termed - <emphasis>access rights</emphasis>. - The local domain, for example, - uses this facility to pass descriptors between processes.</para> - - <para>Networking implementations on UNIX before 4.2BSD - usually worked by overloading the character-device interfaces. - One goal of the socket interface was for naive - programs to be able to work without change on stream-style connections. - Such programs can work only if the - <emphasis>read</emphasis> - and - <emphasis>write</emphasis> - system calls are unchanged. - Consequently, the original interfaces were left intact, - and were made to work on stream-type sockets.
- A new interface was added for more complicated sockets, - such as those used to send datagrams, with which a destination address - must be presented with each - <emphasis>send</emphasis> - call.</para> - - <para>Another benefit is that the new interface is highly portable. - Shortly after a test release was available from Berkeley, - the socket interface had been ported to System III - by a UNIX vendor - (although AT&T did not support the socket interface - until the release of System V Release 4, - deciding instead to use the - Eighth Edition stream mechanism). - The socket interface was also ported to run in many - Ethernet boards by vendors, such as Excelan and Interlan, that were - selling into the PC market, where the machines were - too small to run networking in the main processor. - More recently, the socket interface was used as the basis for - Microsoft's Winsock networking interface for Windows.</para> - </sect1> - - <sect1> - <title>Network Communication</title> - - <para>Some of the communication domains supported by the - <emphasis>socket</emphasis> - IPC - mechanism provide access to network protocols. - These protocols are implemented as a separate software - layer logically below the socket software in the kernel. - The kernel provides many ancillary services, such as - buffer management, message routing, standardized interfaces - to the protocols, and interfaces to the network interface drivers - for the use of the various network protocols.</para> - - <para>At the time that 4.2BSD was being implemented, - there were many networking protocols in use or under development, - each with its own strengths and weaknesses. - There was no clearly superior protocol or protocol suite. - By supporting multiple protocols, 4.2BSD - could provide interoperability and resource sharing - among the diverse set of machines that was available - in the Berkeley environment. - Multiple-protocol support also provides for future changes. 
- Today's protocols designed for 10- to 100-Mbit-per-second - Ethernets are likely to be inadequate for - tomorrow's 1- to 10-Gbit-per-second fiber-optic networks. - Consequently, the network-communication layer is - designed to support multiple protocols. - New protocols are added to the kernel without - the support for older protocols being affected. - Older applications can continue to operate using the old protocol - over the same physical network as is used by newer applications - running with a newer network protocol.</para> - </sect1> - - <sect1> - <title>Network Implementation</title> - - <para>The first protocol suite implemented in 4.2BSD was - DARPA's - Transmission Control Protocol/Internet Protocol - (TCP/IP). - The - CSRG - chose - TCP/IP - as the first network to incorporate into the socket - IPC - framework, - because a 4.1BSD-based implementation was publicly available from a - DARPA-sponsored - project at Bolt, Beranek, and Newman - (BBN). - That was an influential choice: - The 4.2BSD implementation - is the main reason for the extremely widespread use of this protocol suite. - Later performance and capability improvements to the - TCP/IP - implementation have also been widely adopted. - The - TCP/IP - implementation is described in detail in - Chapter 13.</para> - - <para>The release of 4.3BSD added the Xerox Network Systems - (XNS) - protocol suite, - partly building on work done at the - University of Maryland and at - Cornell University. - This suite was needed to connect - isolated machines that could not communicate using - TCP/IP.</para> - - <para>The release of 4.4BSD added the - ISO - protocol suite because of the latter's increasing - visibility both within and outside the United States. - Because of the somewhat different semantics defined for the - ISO - protocols, some minor changes were required in the socket interface - to accommodate these semantics. 
- The changes were made such that they were invisible to clients - of other existing protocols. - The - ISO - protocols also required extensive addition to the two-level routing - tables provided by the kernel in 4.3BSD. - The greatly expanded routing capabilities of 4.4BSD include - arbitrary levels of routing with variable-length addresses and - network masks.</para> - </sect1> - - <sect1> - <title>System Operation</title> - - <para>Bootstrapping mechanisms are used to start the system running. - First, the 4.4BSD - kernel must be loaded into the main memory of the processor. - Once loaded, it must go through an initialization phase to - set the hardware into a known state. - Next, the kernel must do - autoconfiguration, a process that finds - and configures the peripherals that are attached to the processor. - The system begins running in single-user mode while a start-up script does - disk checks and starts the accounting and quota checking. - Finally, the start-up script starts the general system services - and brings up - the system to full multiuser operation.</para> - - <para>During multiuser operation, processes wait for login requests - on the terminal lines and network ports that have been configured - for user access. - When a login request is detected, - a login process is spawned and user validation is done. - When the login validation is successful, a - login shell is created from which - the user can run additional processes.</para> - </sect1> - - <bibliography> - <title>References</title> - - <biblioentry id="biblio-accetta"> - <abbrev>Accetta et al, 1986</abbrev> - - <biblioset relation="article"> - <title>Mach: A New Kernel Foundation for UNIX Development</title> - - <authorgroup> - <author> - <firstname>M. 
</firstname> - <surname>Accetta</surname> - </author> - <author> - <firstname>R.</firstname> - <surname>Baron</surname> - </author> - <author> - <firstname>W.</firstname> - <surname>Bolosky</surname> - </author> - <author> - <firstname>D.</firstname> - <surname>Golub</surname> - </author> - <author> - <firstname>R.</firstname> - <surname>Rashid</surname> - </author> - <author> - <firstname>A.</firstname> - <surname>Tevanian</surname> - </author> - <author> - <firstname>M.</firstname> - <surname>Young</surname> - </author> - </authorgroup> - - <pagenums>93-113</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>USENIX Association Conference Proceedings</title> - <publishername>USENIX Association</publishername> - <pubdate>June 1986</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-cheriton"> - <abbrev>Cheriton, 1988</abbrev> - - <biblioset relation="article"> - <title>The V Distributed System</title> - - <author> - <firstname>D. R.</firstname> - <surname>Cheriton</surname> - </author> - - <pagenums>314-333</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>Comm ACM, 31, 3</title> - - <pubdate>March 1988</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-ewens"> - <abbrev>Ewens et al, 1985</abbrev> - - <biblioset relation="article"> - <title>Tunis: A Distributed Multiprocessor Operating System</title> - - <authorgroup> - <author> - <firstname>P.</firstname> - <surname>Ewens</surname> - </author> - - <author> - <firstname>D. R.</firstname> - <surname>Blythe</surname> - </author> - - <author> - <firstname>M.</firstname> - <surname>Funkenhauser</surname> - </author> - - <author> - <firstname>R. 
C.</firstname> - <surname>Holt</surname> - </author> - </authorgroup> - - <pagenums>247-254</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>USENIX Association Conference Proceedings</title> - <publishername>USENIX Association</publishername> - <pubdate>June 1985</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-gingell"> - <abbrev>Gingell et al, 1987</abbrev> - - <biblioset relation="article"> - <title>Virtual Memory Architecture in SunOS</title> - - <authorgroup> - <author> - <firstname>R.</firstname> - <surname>Gingell</surname> - </author> - - <author> - <firstname>J.</firstname> - <surname>Moran</surname> - </author> - - <author> - <firstname>W.</firstname> - <surname>Shannon</surname> - </author> - </authorgroup> - - <pagenums>81-94</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>USENIX Association Conference Proceedings</title> - <publishername>USENIX Association</publishername> - <pubdate>June 1987</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-kernighan"> - <abbrev>Kernighan & Pike, 1984</abbrev> - - <title>The UNIX Programming Environment</title> - - <authorgroup> - <author> - <firstname>B. 
W.</firstname> - <surname>Kernighan</surname> - </author> - - <author> - <firstname>R.</firstname> - <surname>Pike</surname> - </author> - </authorgroup> - - <publisher> - <publishername>Prentice-Hall</publishername> - <address> - <city>Englewood Cliffs</city> - <state>NJ</state> - </address> - </publisher> - - <pubdate>1984</pubdate> - </biblioentry> - - <biblioentry id="biblio-macklem"> - <abbrev>Macklem, 1994</abbrev> - - <biblioset relation="chapter"> - <title>The 4.4BSD NFS Implementation</title> - - <author> - <firstname>R.</firstname> - <surname>Macklem</surname> - </author> - - <pagenums>6:1-14</pagenums> - </biblioset> - - <biblioset relation="book"> - <title>4.4BSD System Manager's Manual</title> - - <publisher> - <publishername>O'Reilly & Associates, Inc.</publishername> - <address> - <city>Sebastopol</city> - <state>CA</state> - </address> - </publisher> - - <pubdate>1994</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-mckusick-2"> - <abbrev>McKusick & Karels, 1988</abbrev> - - <biblioset relation="article"> - <title>Design of a General Purpose Memory Allocator for the 4.3BSD - UNIX Kernel</title> - - <authorgroup> - <author> - <firstname>M. K.</firstname> - <surname>McKusick</surname> - </author> - - <author> - <firstname>M. J.</firstname> - <surname>Karels</surname> - </author> - </authorgroup> - - <pagenums>295-304</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>USENIX Association Conference Proceedings</title> - <publishername>USENIX Association</publishername> - <pubdate>June 1988</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-mckusick-1"> - <abbrev>McKusick et al, 1994</abbrev> - - <biblioset relation="manual"> - <title>Berkeley Software Architecture Manual, 4.4BSD Edition</title> - - <authorgroup> - <author> - <firstname>M. K.</firstname> - <surname>McKusick</surname> - </author> - - <author> - <firstname>M. 
J.</firstname> - <surname>Karels</surname> - </author> - - <author> - <firstname>S. J.</firstname> - <surname>Leffler</surname> - </author> - - <author> - <firstname>W. N.</firstname> - <surname>Joy</surname> - </author> - - <author> - <firstname>R. S.</firstname> - <surname>Fabry</surname> - </author> - </authorgroup> - - <pagenums>5:1-42</pagenums> - </biblioset> - - <biblioset relation="book"> - <title>4.4BSD Programmer's Supplementary Documents</title> - - <publisher> - <publishername>O'Reilly & Associates, Inc.</publishername> - <address> - <city>Sebastopol</city> - <state>CA</state> - </address> - </publisher> - - <pubdate>1994</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-ritchie"> - <abbrev>Ritchie, 1988</abbrev> - - <title>Early Kernel Design</title> - <subtitle>private communication</subtitle> - - <author> - <firstname>D. M.</firstname> - <surname>Ritchie</surname> - </author> - - <pubdate>March 1988</pubdate> - </biblioentry> - - <biblioentry id="biblio-rosenblum"> - <abbrev>Rosenblum & Ousterhout, 1992</abbrev> - - <biblioset relation="article"> - <title>The Design and Implementation of a Log-Structured File - System</title> - - <authorgroup> - <author> - <firstname>M.</firstname> - <surname>Rosenblum</surname> - </author> - - <author> - <firstname>J.</firstname> - <surname>Ousterhout</surname> - </author> - </authorgroup> - - <pagenums>26-52</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>ACM Transactions on Computer Systems, 10, 1</title> - - <publishername>Association for Computing Machinery</publishername> - <pubdate>February 1992</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-rozier"> - <abbrev>Rozier et al, 1988</abbrev> - - <biblioset relation="article"> - <title>Chorus Distributed Operating Systems</title> - - <authorgroup> - <author> - <firstname>M.</firstname> - <surname>Rozier</surname> - </author> - - <author> - <firstname>V.</firstname> - <surname>Abrossimov</surname> - 
</author> - - <author> - <firstname>F.</firstname> - <surname>Armand</surname> - </author> - - <author> - <firstname>I.</firstname> - <surname>Boule</surname> - </author> - - <author> - <firstname>M.</firstname> - <surname>Gien</surname> - </author> - - <author> - <firstname>M.</firstname> - <surname>Guillemont</surname> - </author> - - <author> - <firstname>F.</firstname> - <surname>Herrmann</surname> - </author> - - <author> - <firstname>C.</firstname> - <surname>Kaiser</surname> - </author> - - <author> - <firstname>S.</firstname> - <surname>Langlois</surname> - </author> - - <author> - <firstname>P.</firstname> - <surname>Leonard</surname> - </author> - - <author> - <firstname>W.</firstname> - <surname>Neuhauser</surname> - </author> - </authorgroup> - - <pagenums>305-370</pagenums> - </biblioset> - - <biblioset relation="journal"> - <title>USENIX Computing Systems, 1, 4</title> - <pubdate>Fall 1988</pubdate> - </biblioset> - </biblioentry> - - <biblioentry id="biblio-tevanian"> - <abbrev>Tevanian, 1987</abbrev> - - <title>Architecture-Independent Virtual Memory Management for Parallel - and Distributed Environments: The Mach Approach</title> - <subtitle>Technical Report CMU-CS-88-106,</subtitle> - - <author> - <firstname>A.</firstname> - <surname>Tevanian</surname> - </author> - - <publisher> - <publishername>Department of Computer Science, Carnegie-Mellon - University</publishername> - - <address> - <city>Pittsburgh</city> - <state>PA</state> - </address> - </publisher> - - <pubdate>December 1987</pubdate> - </biblioentry> - </bibliography> - </chapter> -</book> |