Diffstat (limited to 'en_US.ISO8859-1/captions/2006/mckusick-kernelinternals/mckusick-kernelinternals-1.sbv')
-rw-r--r-- | en_US.ISO8859-1/captions/2006/mckusick-kernelinternals/mckusick-kernelinternals-1.sbv | 4300 |
1 files changed, 0 insertions, 4300 deletions
diff --git a/en_US.ISO8859-1/captions/2006/mckusick-kernelinternals/mckusick-kernelinternals-1.sbv b/en_US.ISO8859-1/captions/2006/mckusick-kernelinternals/mckusick-kernelinternals-1.sbv deleted file mode 100644 index 16dcca9e84..0000000000 --- a/en_US.ISO8859-1/captions/2006/mckusick-kernelinternals/mckusick-kernelinternals-1.sbv +++ /dev/null @@ -1,4300 +0,0 @@ -0:00:09.469,0:00:11.309 -Hello my name is Marshall Kirk McKusick - -0:00:11.309,0:00:15.389 -and I've been around as long as dinosaurs -and mainframes have ruled the world - -0:00:15.389,0:00:18.429 -which is to say the sixties and seventies - -0:00:18.429,0:00:22.460 -however by 1970s a new breed of mammals had begun to show up -on the scene - -0:00:22.460,0:00:24.240 -known as mini computers - -0:00:24.240,0:00:28.230 -although they were just toys in the 1970s they would soon grow - -0:00:28.230,0:00:31.689 -and take over most of the computing market - -0:00:31.689,0:00:33.150 -In 1970 - -0:00:33.150,0:00:37.910 -at AT&T Bell laboratories two researchers Ken -Thompson and Dennis Ritchie began developing the - -0:00:37.910,0:00:39.900 -UNIX operating system - -0:00:39.900,0:00:42.040 -Ken Thompson who had been an alumnus at Berkeley - -0:00:42.040,0:00:46.100 -came back on a sabbatical in 1975 bringing UNIX -with him - -0:00:46.100,0:00:47.539 -In the year that he was there - -0:00:47.539,0:00:51.330 -he managed to get a number of graduate students interested -in UNIX - -0:00:51.330,0:00:53.940 -and by the time he left in 1976 - - -0:00:53.940,0:00:56.829 -Bill Joy has taken over in running the UNIX system - -0:00:56.829,0:01:00.470 -and in fact continuing to develop software for it. 
- -0:01:00.470,0:01:04.339 -Bill began packaging up the software that had -been developed under Berkeley UNIX and - -0:01:04.339,0:01:05.779 -and distributing it - -0:01:05.779,0:01:08.040 -as the Berkeley Software Distributions - -0:01:08.040,0:01:12.310 -whose name was quickly shortened to simply BSD - -0:01:12.310,0:01:16.330 -BSD continued to be distributed with -yearly distributions for almost fifteen - -0:01:16.330,0:01:17.490 -years - -0:01:17.490,0:01:21.920 -initially under Bill Joy and later under others including -yours truly. - -0:01:21.920,0:01:24.860 -By the late 1980s interest had began to grow - -0:01:24.860,0:01:27.400 -in freely redistributable software - -0:01:27.400,0:01:30.170 -so a number of us at Berkeley began separating -out - -0:01:30.170,0:01:32.649 -the AT&T proprietary bits of BSD - -0:01:32.649,0:01:35.710 -from those parts that were freely redistributable. - -0:01:35.710,0:01:40.590 -By the time of the final distribution at BSD -in 1992 - -0:01:40.590,0:01:43.620 -the entire distribution was freely redistributable. - -0:01:43.620,0:01:45.909 -I live in a capsule history here - -0:01:45.909,0:01:48.009 -but if you're interested in the entire story - -0:01:48.009,0:01:50.789 -I have this three-and-an-half hour epic - -0:01:50.789,0:01:54.590 -which is available from my website www.mckusick.com - -0:01:54.590,0:01:58.200 -that gives the entire history of Berkeley. - -0:01:58.200,0:02:00.239 -Following the final distribution from Berkeley - -0:02:00.239,0:02:01.450 -two groups sprung up - -0:02:01.450,0:02:03.600 -to continue supporting BSD - -0:02:03.600,0:02:08.080 -the first of this was the NetBSD whose primary -goal was to support - -0:02:08.080,0:02:10.459 -as many different architectures as possible - -0:02:10.459,0:02:14.769 -everything from your microwave oven all way -upto your cray XMP - -0:02:14.769,0:02:19.409 -In fact today NetBSD supports nearly -sixty architectures. 
- -0:02:19.409,0:02:22.419 -The other group that sprang up was FreeBSD. - -0:02:22.419,0:02:28.239 -Their goal was to bring up BSD and support -as wide a set of devices as possible on the - -0:02:28.239,0:02:29.719 -PC architecture. - -0:02:29.719,0:02:36.549 -They also had a goal of trying to make the - system as easy to install as possible to - -0:02:36.549,0:02:39.309 -attract by a wide group of developers - -0:02:39.309,0:02:42.319 -I chose to work primarily with the FreeBSD -group - -0:02:42.319,0:02:43.740 -both doing software - -0:02:43.740,0:02:46.140 -and also together with George Neville Neil - -0:02:46.140,0:02:51.069 -writing this book "The Design and Implementation -of the FreeBSD Operating System". - -0:02:51.069,0:02:52.060 -Together with this book - -0:02:52.060,0:02:53.959 -I developed a course - -0:02:53.959,0:02:56.500 -which runs for twelve chapters - -0:02:56.500,0:02:58.179 -and thirty hours. - -0:02:58.179,0:02:59.749 -The purpose of this video - -0:02:59.749,0:03:01.089 -is to give you a taste - -0:03:01.089,0:03:02.819 -of that course. - -0:03:02.819,0:03:07.249 -What follows are excerpts from the first lecture -of the course - -0:03:07.249,0:03:11.139 -which of course you can also get from my website -www.mckusick.com. - -0:03:11.139,0:03:13.069 - - -0:03:13.069,0:03:17.739 -Enjoy. 
- -0:03:17.739,0:03:22.239 -This class is nominally about FreeBSD -because well - -0:03:22.239,0:03:26.379 -that's what I know best and that's what -the textbook is organized around - -0:03:26.379,0:03:29.979 -but the fact of the matter is that it's really - -0:03:29.979,0:03:32.339 -a class about your UNIX and that - -0:03:32.339,0:03:36.539 -really covers sort of the broad range of things -in the open source arena as its FreeBSD - -0:03:36.539,0:03:37.689 -in Linux - -0:03:37.689,0:03:38.899 -which of course - -0:03:38.899,0:03:41.159 -you use a lot out - -0:03:41.159,0:03:41.550 -and - -0:03:41.550,0:03:44.349 -it also covers a commercial systems - -0:03:44.349,0:03:46.950 -%uh Solaris, HP-UX, - -0:03:46.950,0:03:49.279 -AIX and so on. - -0:03:49.279,0:03:52.419 -I am going to tend more towards the open -side - -0:03:52.419,0:03:56.389 -open source side of things.So it's really -going to be more FreeBSD in Linux than it's - -0:03:56.389,0:03:57.579 -going to be - -0:03:57.579,0:04:00.849 -Solaris and HP-UX and so on. - -0:04:00.849,0:04:06.959 -For the most part at the level of this course -we're dealing with the interfaces to the system - -0:04:06.959,0:04:07.329 -and - -0:04:07.329,0:04:11.599 -the fact that the matter is a those interfaces are highly -standardized at this point - -0:04:11.599,0:04:12.060 -and - -0:04:12.060,0:04:15.280 -whether it's FreeBSD or Linux or Solaris -or whatever - -0:04:15.280,0:04:19.460 -the Socket system call has to do the same -thing, it has to have the same arguments - -0:04:19.460,0:04:20.150 -in that, - -0:04:20.150,0:04:23.909 -it has to have the same effect - -0:04:23.909,0:04:27.319 -and so until you get down to the really nitty -details - -0:04:27.319,0:04:29.600 -of how they actually go about implementing -that - -0:04:29.600,0:04:31.960 -the differences are relatively minor. 
- -0:04:31.960,0:04:35.830 -So I would say that sixty to seventy percent -of the material that I'm covering - -0:04:35.830,0:04:40.779 -is just as true for FreeBSD as it would -be for Linux - -0:04:40.779,0:04:42.580 -or for Solaris - -0:04:42.580,0:04:44.659 -%uh AIX is a little bit - -0:04:44.659,0:04:45.629 -sort of off in the weeds - -0:04:45.629,0:04:48.709 -%uh as is HP-UX - -0:04:48.709,0:04:51.099 -but luckily we don't have to worry too much about -that. - -0:04:51.099,0:04:54.569 -Okay so - -0:04:54.569,0:04:59.279 -the other thing is that I'm going to assume that -all of you have used the system. I get - -0:04:59.279,0:05:00.910 -really sort of worried when people - -0:05:00.910,0:05:04.249 -you know raise the hands and "Hey, what's a Shell?" - -0:05:04.249,0:05:07.990 -or I don't -put a lot of code up but a one piece of code and someone said "Why - -0:05:07.990,0:05:11.819 -are there two pipe symbols in the middle of -that that If statement?". - -0:05:11.819,0:05:15.740 -No we're not programming the Shell we're programming -in C. - -0:05:15.740,0:05:19.970 -So hopefully you can tell the difference between -Shell scripts and C code. - -0:05:19.970,0:05:21.990 -so okay but I am but am gonna assume - -0:05:21.990,0:05:24.610 -you haven't really looked inside the system. - -0:05:24.610,0:05:28.289 -So I gonna start everything to at a very -high level. 
- -0:05:28.289,0:05:32.969 -The problem is I have already discovered you come -from a lot of different sort of - -0:05:32.969,0:05:33.819 -backgrounds - -0:05:33.819,0:05:35.180 -and - -0:05:35.180,0:05:36.280 -levels of knowledge - -0:05:36.280,0:05:37.900 -and so - -0:05:37.900,0:05:42.620 -the way that I find works best to sort of -be useful to everybody is that three pass - -0:05:42.620,0:05:43.860 -algorithm - -0:05:43.860,0:05:49.060 -so what I will do is start the first pass a very -broad brush high level - -0:05:49.060,0:05:50.569 -description of what's going on - -0:05:50.569,0:05:54.719 -and then I will go back and I'll go through the -same material again but at a lower level of - -0:05:54.719,0:05:55.300 -detail - -0:05:55.300,0:05:59.939 -then I finally go back and go through a very nittily -low-level of detail - -0:05:59.939,0:06:04.649 -and the fact of this is if you are learning new stuff -as I'm doing the high-level thing - -0:06:04.649,0:06:08.649 -you are gonna be utterly washed by the time I get to -low level niggly details - -0:06:08.649,0:06:10.699 -but since I'm going to do it topic by topic - -0:06:10.699,0:06:14.190 -when I get to the end of one of those nearly -low level niggly details - -0:06:14.190,0:06:17.900 -I'll give you a clue as I will say "Brain -reset, I'm starting a new topic" so even if - -0:06:17.900,0:06:19.330 -you're completely lost - -0:06:19.330,0:06:23.530 -you can now start listening again plus I'm gonna get -the broad brush up again. - -0:06:23.530,0:06:27.059 -okay and for those of you that know a lot of -this stuff already - -0:06:27.059,0:06:31.770 -you'll probably find the broad brush rather boring - -0:06:31.770,0:06:35.759 -but by the time we get down to nearly low level -details I think you'll actually - -0:06:35.759,0:06:37.860 -pick up some things that you will find - -0:06:37.860,0:06:39.710 -useful and interesting. 
- -0:06:39.710,0:06:43.759 -So in this way hopefully everybody will -get some - -0:06:43.759,0:06:47.699 -useful percentage of material out of the course. - -0:06:47.699,0:06:49.599 -I am gonna start out by just - -0:06:49.599,0:06:53.089 -walking through and giving you the - -0:06:53.089,0:06:56.919 -outline of what we're going to try and do here -here - -0:06:56.919,0:07:01.169 -As I said we're going to go roughly - -0:07:01.169,0:07:03.270 -just about two-and-an-half hours of lecture - -0:07:03.270,0:07:04.729 -about two hours forty minutes - -0:07:04.729,0:07:06.499 -per week - -0:07:06.499,0:07:07.619 -and - -0:07:07.619,0:07:11.770 -so we will start off this week with an introduction. - -0:07:11.770,0:07:13.860 -This is as I said we're going to start from the -top - -0:07:13.860,0:07:15.749 -and then just start working our way down - -0:07:15.749,0:07:19.350 -so the general thing I'm going to do is -to talk about the interface - -0:07:19.350,0:07:21.439 -%uh which is something that you - -0:07:21.439,0:07:25.319 -are presumably fairly familiar with since -you've worked with that system - -0:07:25.319,0:07:27.249 -and then - -0:07:27.249,0:07:29.739 -you have to sort of layout terminology - -0:07:29.739,0:07:32.080 -although we use normal English words - -0:07:32.080,0:07:34.419 -they have - -0:07:34.419,0:07:38.580 -sometimes rather bizarre meanings compared to their -common usage - -0:07:38.580,0:07:39.220 -and - -0:07:39.220,0:07:42.330 -so I will just sort of lay out the terminology -lay out the - -0:07:42.330,0:07:45.750 -the way we talk about how the system is structured - -0:07:45.750,0:07:50.780 -and this week we will also talk about the -basic services "What is it that the kernel is - -0:07:50.780,0:07:52.929 -providing for us?" - -0:07:52.929,0:07:54.060 -and then of course - -0:07:54.060,0:07:58.499 -we'll proceed to dive down in and and see how -that is done - -0:07:58.499,0:07:59.970 -so here in - -0:07:59.970,0:08:01.400 -Week number 2 - 
-0:08:01.400,0:08:05.450 -we're gonna look at the system from the -perspective of - -0:08:05.450,0:08:07.039 -something that - -0:08:07.039,0:08:08.720 -manages processes. - -0:08:08.720,0:08:12.170 -One way of looking at the kernel is it's really -just a - -0:08:12.170,0:08:16.440 -the resource manager and the resource that -its managing are things going to do with processes - -0:08:16.440,0:08:19.460 -So we'll look at a process, what the structure of -it is - -0:08:19.460,0:08:20.649 -and - -0:08:20.649,0:08:23.559 -talk about the different ways that they can -be structured. - -0:08:23.559,0:08:28.379 -Process can for example be an address space -and can have one thread running in it can have - -0:08:28.379,0:08:29.749 -multiple threads running in it. - -0:08:29.749,0:08:34.620 -so we'll talk about the different ways -that we think a process is. - -0:08:34.620,0:08:38.480 -We will look at the management of those processes - - -0:08:38.480,0:08:39.239 -we've got - -0:08:39.239,0:08:42.020 -to lay out the bits and pieces that -need to be managed - -0:08:42.020,0:08:44.660 -and then talk about - -0:08:44.660,0:08:47.190 -how we do that. - -0:08:47.190,0:08:51.740 -we'll talk about jails.. this is something -that you currently find only in FreeBSD - -0:08:51.740,0:08:55.060 -hasn't made it into - -0:08:55.060,0:08:56.320 -Linux yet although - -0:08:56.320,0:09:01.630 -the concept is being actively worked -on so my guess is that you'll see that - -0:09:01.630,0:09:03.500 -fairly soon. - -0:09:03.500,0:09:06.360 -we'll also then talk about scheduling - -0:09:06.360,0:09:10.579 -which is in essence how we decide what gets -to run, when it gets to run, how long it gets - -0:09:10.579,0:09:13.500 -to run, etc. - -0:09:13.500,0:09:14.330 -okay - -0:09:14.330,0:09:19.020 -The week after that we will go into virtual -memory. 
- -0:09:19.020,0:09:23.800 -Signals aren't really part of virtual memory -but they didn't fit into next week's - -0:09:23.800,0:09:26.400 -material so I just would dropped that at the -beginning - -0:09:26.400,0:09:29.850 -but the bulk of Week 3 is going to -be - -0:09:29.850,0:09:32.019 -the management of Virtual Memory. So we've got - -0:09:32.019,0:09:35.119 -a bunch of physical memory, a bunch of -processes that are - -0:09:35.119,0:09:37.940 -trying to use their address spaces - -0:09:37.940,0:09:39.590 -and we will talk about - -0:09:39.590,0:09:41.410 -essentially how you will make that all work - -0:09:41.410,0:09:43.510 -It's called a virtual memory because it's - -0:09:43.510,0:09:47.420 -sort of a cheat. We promise you the world and -then we deliver you - -0:09:47.420,0:09:51.480 -as small number of pages as we think we -can get away with. - -0:09:51.480,0:09:56.420 -Okay. So the first three weeks then essentially -get us through - -0:09:56.420,0:09:58.340 -looking at the world as if it was all - -0:09:58.340,0:10:00.560 -all about processes. - -0:10:00.560,0:10:03.880 -Then in Week 4 we change gears. we say -okay well you know - -0:10:03.880,0:10:07.570 -the kernel isn't just all about processes. You can sort of -look at it orthogonally and you can - -0:10:07.570,0:10:10.000 -say it's really just a giant I/O switch - -0:10:10.000,0:10:12.910 -it's just like a traffic cop that's just managing -these - -0:10:12.910,0:10:14.860 -I/O streams - -0:10:14.860,0:10:15.450 -and - -0:10:15.450,0:10:18.610 -so let's look at it from that perspective. 
- -0:10:18.610,0:10:19.310 -And - -0:10:19.310,0:10:24.740 -we'll start with special files, again this -sort of the interface when you talk about UNIX - -0:10:24.740,0:10:25.880 -systems, when you talk about - -0:10:25.880,0:10:27.950 -what's normally /dev - -0:10:27.950,0:10:34.170 -interface that gets you access -to the various I/O streams that are available - -0:10:34.170,0:10:37.220 -and we'll look at how that's organized and -the structure of it - -0:10:37.220,0:10:41.840 -which used to be fairly simple but in the -last decade has gotten - -0:10:41.840,0:10:43.670 -incredibly complicated. - -0:10:43.670,0:10:48.540 -We will also talk about pseudo terminals in -job control - -0:10:48.540,0:10:53.330 -this is about as interesting as watching the -grass grow but unfortunately it's - -0:10:53.330,0:10:55.490 -a major component of the system - -0:10:55.490,0:10:59.520 -and especially people that deal with system -administration have to know far more about - -0:10:59.520,0:11:06.520 -this than they probably ever thought they -wanted to. - -0:11:06.900,0:11:11.430 -Okay we will then continue in Week 5 with -the kernel I/O structure, - -0:11:11.430,0:11:16.090 -We will start with multiplexing of I/O. The -kernel of course has done this - -0:11:16.090,0:11:17.360 -always - -0:11:17.360,0:11:22.110 -but we're really talking more about how do -we export I/O multiplexing to - -0:11:22.110,0:11:25.970 -user applications. - -0:11:25.970,0:11:29.250 -We will then move into auto configuration strategy - -0:11:29.250,0:11:31.370 -Auto configuration - -0:11:31.370,0:11:32.770 -is what happens - -0:11:32.770,0:11:36.619 -typically or historically I guess you -could say as the system boots. 
- -0:11:36.619,0:11:39.500 -so all that stuff that comes out about - -0:11:39.500,0:11:40.810 -what - -0:11:40.810,0:11:43.550 -hardwares are on the machine and how it's all -interconnected - -0:11:43.550,0:11:47.350 -all of that is tied up in auto configuration - -0:11:47.350,0:11:50.040 -and that used to happen just once it boots - -0:11:50.040,0:11:52.000 -but in modern systems today - -0:11:52.000,0:11:55.839 -it's an ongoing process. It happens at boot -but it also happens - -0:11:55.839,0:12:00.550 -anytime you plug a new I/O device, a -PCMCIA card, - -0:12:00.550,0:12:03.680 -or you remove a disk or you put in a new disk. - -0:12:03.680,0:12:07.010 -or any sort of activity that changes the I/O - -0:12:07.010,0:12:08.360 -structure of the machine - -0:12:08.360,0:12:10.870 -auto configuration has to get fired back up - -0:12:10.870,0:12:13.050 -and figure out what's disappeared - -0:12:13.050,0:12:18.330 -and cleanup and figure out what new has arrived -to configure it in. - -0:12:18.330,0:12:19.320 -and then we'll talk - -0:12:19.320,0:12:23.870 -a little bit about the configuration of the -device driver - -0:12:23.870,0:12:27.390 -this actually gets into an area that - -0:12:27.390,0:12:28.660 -is - -0:12:28.660,0:12:33.440 -one well let me just give it as a bit -of advice to the class especially those of - -0:12:33.440,0:12:36.780 -you who work in system administration. 
- -0:12:36.780,0:12:42.010 -You really want to be careful that -you don't learn too much about device drivers - -0:12:42.010,0:12:44.670 -because there is really these three things that - -0:12:44.670,0:12:48.580 -it's not good to learn about and if you do -learn about it it's really good to keep it - -0:12:48.580,0:12:49.740 -to yourself - -0:12:49.740,0:12:51.949 -because if you become an expert or - -0:12:51.949,0:12:54.960 -viewed as an expert in any of these areas - -0:12:54.960,0:12:59.370 -you will become the designated stuccy for -that and your site you'll never get to do - -0:12:59.370,0:13:01.760 -anything - -0:13:01.760,0:13:02.610 -but that - -0:13:02.610,0:13:07.360 -so The three things that I highly -recommend not learning very much about are - -0:13:07.360,0:13:09.060 -device drivers, - -0:13:09.060,0:13:12.320 -send mail configuration files - -0:13:12.320,0:13:13.970 -or anything having to do - -0:13:13.970,0:13:19.350 -with LDAP or anything in -that general domain - -0:13:19.350,0:13:22.660 -because as I say - -0:13:22.660,0:13:24.900 -that will become your life's work - -0:13:24.900,0:13:25.920 -and - -0:13:25.920,0:13:32.920 -there's other things that you might find more interesting. -"Do you have a question?" - -0:13:33.870,0:13:36.659 -so one of my students empathizes with my point - -0:13:36.659,0:13:39.640 -I believe you said you worked on that mail -system - -0:13:39.640,0:13:43.120 -so you you might know something about -Sendmail configuration files but you don't - -0:13:43.120,0:13:47.850 -have to answer that - -0:13:47.850,0:13:52.100 -okay so we're going to talk about what a device -driver does and really just sort of the entry - -0:13:52.100,0:13:53.170 -points to it - -0:13:53.170,0:13:57.180 -but we're not going to talk about how you -write such a thing, how you debug such a thing - -0:13:57.180,0:14:01.490 -or much of anything about it. 
I actually used -to teach an entire class believe it or not - -0:14:01.490,0:14:02.720 -about device drivers - -0:14:02.720,0:14:05.849 -but then I realized the error of my ways and I have -since - -0:14:05.849,0:14:12.580 - gone through and made a point of forgetting -every slide in that talk. - -0:14:12.580,0:14:16.860 -okay so then we will move on to File system - -0:14:16.860,0:14:21.540 -and as always we'll start at the high level -talk about the interface what is it that is - -0:14:21.540,0:14:23.020 -exported out of the system - -0:14:23.020,0:14:27.840 -and then we will start diving down in the C and -how do we go about implementing that - -0:14:27.840,0:14:29.010 -so - -0:14:29.010,0:14:31.010 -we'll start with the - -0:14:31.010,0:14:32.560 -so called - -0:14:32.560,0:14:33.680 -Block I/O system - -0:14:33.680,0:14:36.140 -it's historically been called buffer -cache - -0:14:36.140,0:14:38.590 -and you still hear it called that periodically - -0:14:38.590,0:14:42.720 -and the fact of the matter is that there isn't really -about buffer cache anymore, there is just one big - -0:14:42.720,0:14:44.620 -cache in it.Its the VM cache - -0:14:44.620,0:14:47.810 -and the Filesystem has a view into it -and - -0:14:47.810,0:14:50.829 -the processes have a view into it but at -the end of the day - -0:14:50.829,0:14:54.660 -you really don't want the same information -on two different - -0:14:54.660,0:14:56.030 -pages of memory - -0:14:56.030,0:14:59.390 -because that just leads to trouble. 
- -0:14:59.390,0:15:03.390 -But Filesystems think they have buffers and so -there's this maneuver where we make - -0:15:03.390,0:15:06.149 -these things that look like what historically -were buffers - -0:15:06.149,0:15:08.830 -that really just map into VM system - -0:15:08.830,0:15:11.720 -but they're still managed in the way that -they have been - -0:15:11.720,0:15:15.020 -managed historically - -0:15:15.020,0:15:20.670 -okay We will then get down into Filesystem implementation -the local file system if you will - -0:15:20.670,0:15:23.400 -and into also - -0:15:23.400,0:15:25.730 -soft updates and snapshots. - -0:15:25.730,0:15:26.440 - this - -0:15:26.440,0:15:31.100 -for the time being is something that you see -only in FreeBSD - -0:15:31.100,0:15:35.310 -the alternative to soft updates is journalling -which is %uh more commonly used - -0:15:35.310,0:15:39.630 -for example what is used by ext3 - -0:15:39.630,0:15:41.179 -and so I'll go through soft updates and - -0:15:41.179,0:15:45.260 -a lot of the issues in soft updates are the -same issues that you have to deal with journalling - -0:15:45.260,0:15:48.370 -what is it that we're protecting and how do we -go about doing that - -0:15:48.370,0:15:51.150 -and the difference is in the detail. - -0:15:51.150,0:15:54.630 -There is actually a paper in the back to your -notes if this is something that interests - -0:15:54.630,0:15:55.240 -you - -0:15:55.240,0:15:59.930 -it's a comparison of journalling versus -soft updates that was done - -0:15:59.930,0:16:02.120 -about five or eight years ago. 
- -0:16:02.120,0:16:08.460 -and not to spoil the punch line but the answers -they both work about are the same - -0:16:08.460,0:16:12.500 -Okay snapshots again is something that -if - -0:16:12.500,0:16:15.920 -you've worked with things like the network -appliance box you're probably quite - -0:16:15.920,0:16:19.640 -aware of what snapshots are and how they do -or don't work for you - -0:16:19.640,0:16:21.959 -this is the same functionality - -0:16:21.959,0:16:27.380 -in the Filesystem implemented in a -somewhat different way - -0:16:27.380,0:16:28.449 -okay so this - -0:16:28.449,0:16:31.940 -Week 6 is really going to be the local -file system - -0:16:31.940,0:16:34.750 -the disk connected to the machine -that we are dealing with. - -0:16:34.750,0:16:39.140 -Week 7 then we get into multiple -Filesystem support so how do we abstract out that - -0:16:39.140,0:16:41.190 -Filesystem layer - -0:16:41.190,0:16:46.430 -and support Multiple Filesystems at the -same time so for example in FreeBSD - -0:16:46.430,0:16:50.199 -you can of course run with their traditional -fast Filesystem - -0:16:50.199,0:16:54.540 -but if you happen to like the Linux Filesystem -better or you have to share a disk - -0:16:54.540,0:16:55.690 -with a Linux machine - -0:16:55.690,0:16:58.310 -you can run the ext2 or ext3 - -0:16:58.310,0:17:01.020 -and it will perfectly happily do that - -0:17:01.020,0:17:01.620 -so - -0:17:01.620,0:17:05.589 -we will have to look then at how do we provide -interface so that we can plug in all these different - -0:17:05.589,0:17:09.260 -Filesystems that we want to support - -0:17:09.260,0:17:12.250 -another area of which there's been a great - -0:17:12.250,0:17:15.309 -deal of growth at least in code complexity - - -0:17:15.309,0:17:17.840 -is so-called Volume Management - -0:17:17.840,0:17:19.370 -so in the - -0:17:19.370,0:17:24.480 -good old days a Filesystem lived on a disk or -piece of disk and that was that - -0:17:24.480,0:17:26.130 -but in this day and age 
- -0:17:26.130,0:17:31.150 -that won't do any more so we aggregate disks -together by striping them or RAID - -0:17:31.150,0:17:31.980 -arraying them - -0:17:31.980,0:17:33.380 -or various other things - -0:17:33.380,0:17:39.210 -and we need a whole layer in the system just to -manage those disks - -0:17:39.210,0:17:44.280 -we'll then get to the as an example of an alternative -Filesystem we're going to talk about the - -0:17:44.280,0:17:46.530 -Network Filesystem or NFS - -0:17:46.530,0:17:48.500 -but that's not because this is - -0:17:48.500,0:17:51.090 -the world's best remote file system - -0:17:51.090,0:17:55.240 -or the cleanest design or any of the -properties you might hope that - -0:17:55.240,0:17:57.049 -such a class as this one would have - -0:17:57.049,0:17:58.600 -but it's ubiquitous - -0:17:58.600,0:18:00.210 -very widely used - -0:18:00.210,0:18:01.350 -and - -0:18:01.350,0:18:06.850 -so we're going to talk about that one - -0:18:06.850,0:18:07.740 -okay we'll - -0:18:07.740,0:18:10.970 -then once again switch gears in Week 8 - -0:18:10.970,0:18:17.120 -and turn our attention to of Networking and -Interprocess communication - -0:18:17.120,0:18:18.200 -and - -0:18:18.200,0:18:23.210 -again we'll start from the very top so we'll -go through, we'll go with concepts, the terminology - -0:18:23.210,0:18:24.450 -that gets used - -0:18:24.450,0:18:30.230 -and what's the difference between domain -based addressing and an address domain you know - -0:18:30.230,0:18:30.910 -we'll go through - -0:18:30.910,0:18:34.910 - what the basic IPC services are, - -0:18:34.910,0:18:39.080 -essentially what are all the system calls that -have anything to do with networking - -0:18:39.080,0:18:40.590 -and - -0:18:40.590,0:18:43.720 -just sort of describe what each of them are -and I'm going to go through - -0:18:43.720,0:18:45.830 -a somewhat contrived example - -0:18:45.830,0:18:49.840 -that makes use of every one of those interfaces - -0:18:49.840,0:18:52.860 -and just 
to sort of show how they all connect -together - -0:18:52.860,0:18:54.169 -and for those of you that work - -0:18:54.169,0:18:57.400 -in networking or had done any kind of network -programming - -0:18:57.400,0:19:00.480 -if you're looking for a week to miss and the -Week 8 is the one to miss that's 'cause that's - -0:19:00.480,0:19:02.780 -the sort of most basic - -0:19:02.780,0:19:04.210 -lecture that I'm going to give - -0:19:04.210,0:19:07.910 -If you are not sure whether or not you need to -go through that, there is - -0:19:07.910,0:19:09.540 -one of the papers in the back - -0:19:09.540,0:19:12.620 -it is an introduction to Interprocess communication - -0:19:12.620,0:19:18.279 -read that paper if you say yeah yeah yeah -yeah yeah you are done with Week 8. - -0:19:18.279,0:19:20.590 -on the other hand if you don't come to Week -8 - -0:19:20.590,0:19:22.790 -and then in Week 9 I say - -0:19:22.790,0:19:26.860 -I call on you and say alright what is it - -0:19:26.860,0:19:30.560 -that listen system call does and you -can't tell me - -0:19:30.560,0:19:32.610 -you're gonna get a demerit - -0:19:32.610,0:19:34.340 -okay - -0:19:34.340,0:19:37.770 -then in Week 9 we will get into the actual - -0:19:37.770,0:19:41.419 -networking implementation itself, we go -through system layers as we did - -0:19:41.419,0:19:43.310 -in all the other areas - -0:19:43.310,0:19:44.130 -and - -0:19:44.130,0:19:48.330 -we will spend a significant portion of that -class talking about routing - -0:19:48.330,0:19:50.230 -routing - -0:19:50.230,0:19:53.610 -for those of you that haven't had the pleasure -of dealing with it - -0:19:53.610,0:19:55.540 -is a black art - -0:19:55.540,0:19:58.050 -or at least a dark science - -0:19:58.050,0:19:59.170 -and - -0:19:59.170,0:19:59.930 -so - -0:19:59.930,0:20:02.490 -we'll talk about it - -0:20:02.490,0:20:06.270 -from the perspective first of all of what -do we do locally within the machine - -0:20:06.270,0:20:10.090 -and then what are some of the 
bigger strategies -that we can use for doing routing - -0:20:10.090,0:20:11.910 -enterprise - -0:20:11.910,0:20:14.840 -wide routing or - -0:20:14.840,0:20:20.190 -area wide routing something like throughout the -state of California or throughout the US whatever - -0:20:20.190,0:20:25.379 -this again like device drivers is really -just sort of a nickel - -0:20:25.379,0:20:26.480 -tour through the - -0:20:27.800,0:20:31.820 -what the choices are what that the basic -strategies are that are used - -0:20:31.820,0:20:33.989 -If you're thinking you're going to walk out -of here - -0:20:33.989,0:20:36.110 -knowing how to set up a routing well sorry - -0:20:36.110,0:20:38.430 -we are not going to get that far - -0:20:38.430,0:20:41.559 -but you should at least have a pretty good idea -of what the issues are - -0:20:41.559,0:20:44.430 -and what the general solutions are - -0:20:44.430,0:20:48.950 -okay then finally in Week 10 well not finally -but next few weeks and - -0:20:48.950,0:20:52.380 -we will go through the Internet Protocols - -0:20:52.380,0:20:54.320 -primarily TCP/IP - -0:20:54.320,0:20:56.560 -and this is - -0:20:56.560,0:20:58.809 -what are the algorithms that are used - -0:20:58.809,0:21:01.030 -and I'm putting a particular emphasis - -0:21:01.030,0:21:03.050 -for this particular class - -0:21:03.050,0:21:05.080 -on - -0:21:05.080,0:21:07.730 -changes that have been made in the protocols - -0:21:07.730,0:21:14.310 -to deal with a lot of the sort of attacks that -we've been seeing the SYN attacks and - -0:21:14.310,0:21:16.880 -that sort of thing - -0:21:16.880,0:21:19.440 -rather than just a straight - -0:21:19.440,0:21:22.440 -iteration of what the actual protocols -are - -0:21:22.440,0:21:24.940 -I'll talk primarily about IPv4 - -0:21:24.940,0:21:31.940 -but I will also try and talk a bit about -IPv6 as well - -0:21:33.510,0:21:35.850 -all right so the first ten weeks are - -0:21:35.850,0:21:38.100 -sort of the kernel course - -0:21:38.100,0:21:40.800 -now 
we tack two weeks on at the end - -0:21:40.800,0:21:42.010 -to talk about - -0:21:42.010,0:21:43.990 -sort of the bigger picture of - -0:21:43.990,0:21:48.240 -system tuning, crash dump analysis, that level of -thing - -0:21:48.240,0:21:52.940 -The idea is to really consolidate what -we figured out or talked about in the first - -0:21:52.940,0:21:54.710 -ten weeks and - -0:21:54.710,0:21:58.760 -how that applies to tools that we have available -to us to - -0:21:58.760,0:22:00.760 -look at what the system is doing, - -0:22:00.760,0:22:02.649 -analyze what the system is doing - -0:22:02.649,0:22:03.650 -and hopefully - -0:22:03.650,0:22:04.720 -improve - -0:22:04.720,0:22:07.130 -the performance of what the system is doing - -0:22:07.130,0:22:07.750 -and - -0:22:07.750,0:22:12.169 -for the most part the kind of tuning that I'm -talking about is not - -0:22:12.169,0:22:14.740 -going in and hack hack hacking your kernel - -0:22:14.740,0:22:16.510 -because the fact of the matter is - -0:22:16.510,0:22:18.600 -most of the time you can't do that anyway - -0:22:18.600,0:22:22.340 -so it's more looking at it from the perspective -of saying - -0:22:22.340,0:22:26.390 -is this system running badly because it doesn't -have enough memory on it? - -0:22:26.390,0:22:29.470 -or is it running badly because there isn't enough -I/O capacity? - -0:22:29.470,0:22:33.549 -or is it running badly because it's got -enough I/O capacity but - -0:22:33.549,0:22:35.940 -certain drives are being overloaded - -0:22:35.940,0:22:37.309 -or is it - -0:22:37.309,0:22:42.220 -being overrun because we're simply trying -to do too much on this machine? etc. 
- -0:22:42.220,0:22:45.440 -so that's the sort of level of thing that we're -looking at - -0:22:45.440,0:22:47.080 -but tied into - -0:22:47.080,0:22:52.130 -a lot of concepts that we talked about before so we can talk -about active virtual memory - -0:22:52.130,0:22:53.710 -and what that means - -0:22:53.710,0:22:55.120 -and - -0:22:55.120,0:22:58.750 -essentially measure what it is and hopefully -then you will understand in the context of what - -0:22:58.750,0:23:00.690 -we talked about in the VM section - -0:23:00.690,0:23:03.990 -what that really means - -0:23:03.990,0:23:07.460 -the crash dump analysis is one of these -topics that - -0:23:07.460,0:23:08.730 -you are gonna love or hate - -0:23:08.730,0:23:12.530 -if you actually have to deal with crash -dumps - -0:23:12.530,0:23:13.679 -people find it invaluable - -0:23:13.679,0:23:15.580 -and if you don't have to deal with crash dumps - -0:23:15.580,0:23:18.790 -it's an incredible mass of boring detail - -0:23:18.790,0:23:23.240 -the only good part of it is that the -whole session is only about an hour long - -0:23:23.240,0:23:25.529 -If it interests you, listen closely - -0:23:25.529,0:23:28.950 -and if it bores you, well, it's only an hour long - -0:23:28.950,0:23:32.880 -okay lastly we'll talk a little bit about -security issues - -0:23:32.880,0:23:36.250 -again this is really more about the tools that -are available - -0:23:36.250,0:23:40.750 -to deal with security stuff as opposed to a -complete tutorial on - -0:23:40.750,0:23:45.120 -how to implement security so for those of you -that deal with security - -0:23:45.120,0:23:48.400 -this is just gonna be sort of security one oh -one - -0:23:48.400,0:23:50.029 -for those of you - -0:23:50.029,0:23:51.500 -that - -0:23:51.500,0:23:54.399 -have to deal with it but haven't really -thought about it - -0:23:54.399,0:23:58.549 -it'll probably scare you to death and -you'll wonder how to keep the machines from - -0:23:58.549,0:24:02.840 -being 
hijacked everyday - -0:24:02.840,0:24:08.030 -Okay so that's in essence what we're going -to try and do here - -0:24:08.030,0:24:15.030 -anybody have any comments, questions, thoughts. -No? All right well. - -0:24:16.130,0:24:17.840 -Let's get started - -0:24:17.840,0:24:22.180 -we will be begin on page fifteen with an -overview of the kernel. - -0:24:22.180,0:24:26.040 -Hopefully nobody's lost yet. - -0:24:26.040,0:24:29.310 -What's a kernel? All right. - -0:24:29.310,0:24:31.370 -so starting at the very top - -0:24:31.370,0:24:33.070 -the big broad brush - -0:24:33.070,0:24:35.140 -what we have is - -0:24:35.140,0:24:38.330 -a UNIX virtual machine and - -0:24:38.330,0:24:41.660 - virtual machines are actually something -that has been around - -0:24:41.660,0:24:44.539 -as a concept since the sixties - -0:24:44.539,0:24:48.919 - difference is really just sort of the level -of the interface that people have dealt with - -0:24:48.919,0:24:51.360 -when they talk about Virtual Machines - -0:24:51.360,0:24:53.610 -in the 1960s - -0:24:53.610,0:24:56.770 -computers were these enormous things you would -have - -0:24:56.770,0:24:58.870 -your computer room would be something that'd be - -0:24:58.870,0:25:01.909 -three times the size of this conference -room if you had - -0:25:01.909,0:25:03.230 -a computer - -0:25:03.230,0:25:05.530 -the computer itself was - -0:25:05.530,0:25:07.840 -tall as a refrigerator freezer - -0:25:07.840,0:25:08.950 -imagine - -0:25:08.950,0:25:13.909 -five or eight or ten of these units -side by side that itself made up the computer - -0:25:13.909,0:25:16.080 -that would be one big - -0:25:16.080,0:25:20.030 -for the core processor and the one which -should be the floating point unit and several - -0:25:20.030,0:25:24.080 -of them that would be the memory the core memory -literally the core memory - -0:25:24.080,0:25:29.110 -and then they'd be other rows of these -disk drives which were about the size of the washing - -0:25:29.110,0:25:29.660 
-machine - -0:25:29.660,0:25:34.169 -and then behind that since you couldn't store -everything on disks so - -0:25:34.169,0:25:36.300 -then you had rows of tape drives - -0:25:36.300,0:25:37.880 -and then you had this little - -0:25:37.880,0:25:39.610 -set of sort of - -0:25:39.610,0:25:43.330 -munchkins that would run around and and tend -to the machine and they'd mount tapes and take - -0:25:43.330,0:25:46.710 -off tapes and mount disc packs and remove disc packs -because - -0:25:46.710,0:25:49.760 -the drives themselves were very expensive and -so - -0:25:49.760,0:25:53.110 -you wouldn't just as today we have a - - -0:25:53.110,0:25:56.090 -one spindle that was dedicated just to one set -of platters - -0:25:56.090,0:25:57.130 -you could take out a - -0:25:57.130,0:25:59.460 -set of platters and put in another - -0:25:59.460,0:26:02.540 -hundred megabytes set of platters and these are -platters that are - -0:26:02.540,0:26:05.280 -this big around and it's like six or eight -of them and - -0:26:05.280,0:26:09.140 - giant head assemblies they comes rumbling in and -out - -0:26:09.140,0:26:12.440 -anyway one of these giant giant machines - -0:26:12.440,0:26:17.380 -that costs many millions of dollars would run -at about ten - -0:26:17.380,0:26:21.120 -million instructions per second, 10 mips - -0:26:21.120,0:26:21.630 -and 10 mips - -0:26:21.630,0:26:28.330 - was more computing power than anybody -could possibly imagine using in a single application - -0:26:28.330,0:26:28.880 -just - -0:26:28.880,0:26:31.050 -by contrast you know this - -0:26:31.050,0:26:34.070 -four-year-old laptop here is probably on -the order of - -0:26:34.070,0:26:36.440 -one or two hundred mips - -0:26:36.440,0:26:37.140 -but anyway - -0:26:37.140,0:26:40.760 -people couldn't really view what we would -do with a lot of computing power - -0:26:40.760,0:26:44.640 -and the other thing was that you didn't have -a notion of sort of an operating system that had - -0:26:44.640,0:26:45.890 
-applications running on it - -0:26:45.890,0:26:46.760 -because - -0:26:46.760,0:26:50.160 -everybody wanted to write straight to -the raw hardware - -0:26:50.160,0:26:51.750 -and so - -0:26:51.750,0:26:55.900 -what IBM who was a big manufacturer -of machines in those days - -0:26:55.900,0:26:59.060 -did was they came up with this thing called -VM - -0:26:59.060,0:27:00.770 -and this was a little - -0:27:00.770,0:27:02.549 -you'd hardly call it an operating system really - -0:27:02.549,0:27:05.130 -but what it did is it cloned - -0:27:05.130,0:27:09.270 -independent copies of the machine that worked just -like the original machines so you could boot - -0:27:09.270,0:27:11.769 -something that you thought of as an operating -system - -0:27:11.769,0:27:13.380 -on top of VM - -0:27:13.380,0:27:16.750 -so you'd take one of these ten mips machines and -it would clone - -0:27:16.750,0:27:20.050 -six identical one mips copies - -0:27:20.050,0:27:22.030 -and then you could boot - -0:27:22.030,0:27:24.700 -whatever you wanted on each one of those machines -so - -0:27:24.700,0:27:29.510 -if you were doing database stuff you would boot your -database because the database ran on the raw hardware - -0:27:29.510,0:27:32.920 -or if you're doing payroll you would boot up the payroll -program - -0:27:32.920,0:27:37.950 -or if you actually tried to service -users you could boot a time sharing batch thing - -0:27:37.950,0:27:40.790 -that would read card images and print -stuff out - -0:27:40.790,0:27:44.460 -or they even had TSO the Time Sharing -Option where you could interactively sit - -0:27:44.460,0:27:45.559 -and type and send - -0:27:45.559,0:27:47.560 -stuff in and get answers back - -0:27:47.560,0:27:48.570 -and - -0:27:48.570,0:27:51.429 -also you could boot TSO so whatever set -of - -0:27:51.429,0:27:52.219 - - -0:27:52.219,0:27:55.339 -things you need you could boot them and they ran -independently as if they were running on their - -0:27:55.339,0:27:56.470 -own machine - 
-0:27:56.470,0:28:03.150 -but all the VM did was it gave you an exact -raw copy of the hardware - -0:28:03.150,0:28:04.529 -so when UNIX came along - -0:28:04.529,0:28:07.350 -they sort of liked the notion of - -0:28:07.350,0:28:11.509 -providing the concept of independent -things that you could operate in - -0:28:11.509,0:28:13.610 -but they wanted it at a higher level - -0:28:13.610,0:28:15.610 -so they were looking really to do it - -0:28:15.610,0:28:17.480 -instead of at the raw hardware level - -0:28:17.480,0:28:19.679 -to do it at a process level - -0:28:19.679,0:28:23.799 -and the idea then was that the interface you -would program to would be what we think of as - -0:28:23.799,0:28:26.090 -a system call interface today - -0:28:26.090,0:28:27.849 -and the idea then was that - -0:28:27.849,0:28:30.740 -you would be given a process or set of processes - -0:28:30.740,0:28:34.990 -and those were independent. Your process -couldn't affect - -0:28:34.990,0:28:38.830 -the address space of another process. 
You couldn't reach -over and mess around with their addresses, - -0:28:38.830,0:28:41.030 -you couldn't mess around with their I/O -channels - -0:28:41.030,0:28:43.179 -you could slow them down by - -0:28:43.179,0:28:44.299 -being a pig but - -0:28:44.299,0:28:47.980 -that was about the only way that you could affect -other processes - -0:28:47.980,0:28:48.480 -and - -0:28:48.480,0:28:49.830 -so - -0:28:49.830,0:28:52.669 -the interface that they had there - -0:28:52.669,0:28:58.660 -was one that had these characteristics: -it had a paged virtual address space - -0:28:58.660,0:29:02.980 -so you didn't have to know as in the old days how much physical -memory is on the machine and make your application - -0:29:02.980,0:29:04.740 -fit into that amount of memory - -0:29:04.740,0:29:07.950 -you just had what looked like a large - -0:29:07.950,0:29:11.710 -uniform address space even if the underlying -hardware had segments or some other - -0:29:11.710,0:29:13.580 -hardware brain damage - -0:29:13.580,0:29:17.390 -it looked to you like you just had a big uniform -address space and - -0:29:17.390,0:29:21.070 -the size of your address space was independent -of the amount of memory that was on your machine - -0:29:21.070,0:29:23.900 -your address space could be bigger than the amount of -physical memory - -0:29:23.900,0:29:26.499 -because we sort of move pages around underneath - -0:29:26.499,0:29:29.320 -whatever part of the address space was actually -active - -0:29:29.320,0:29:34.260 -and there's obviously limits to this if -you are trying to run a 1 gigabyte - -0:29:34.260,0:29:35.630 -application on top of - -0:29:35.630,0:29:37.240 -ten megabytes of memory - -0:29:37.240,0:29:40.880 -it's probably going to bring new meaning to -same day service - -0:29:40.880,0:29:45.519 -but if you're willing to wait long enough it -will eventually move the pages around and you will - -0:29:45.519,0:29:49.740 -progress through getting your application run - -0:29:49.740,0:29:53.890 
-another thing was dealing with software -interrupts - -0:29:53.890,0:29:55.789 -in the old days - -0:29:55.789,0:29:58.749 -you had to understand how the hardware worked - -0:29:58.749,0:30:03.900 -in order to deal with exceptional conditions -so for example if you did a divide by zero - -0:30:03.900,0:30:08.170 -the hardware would jump through some -vector location or - -0:30:08.170,0:30:08.630 -something - -0:30:08.630,0:30:12.799 -and you had to know how that worked and make -sure that you had your program - -0:30:12.799,0:30:16.510 -usually some little bit of assembly language -set up to deal with that - -0:30:16.510,0:30:19.870 -and UNIX said let's get away -from the hardware here - -0:30:19.870,0:30:22.080 -and so they did this thing called signals - -0:30:22.080,0:30:25.700 -and so they just defined a set of signals so that -if you do a divide by zero - -0:30:25.700,0:30:29.529 -you simply register a routine you -want to have called you don't have to know - -0:30:29.529,0:30:31.220 -how the hardware figured it out - -0:30:31.220,0:30:36.740 -you just know that that routine is going to get -called and you can deal with it at that point - -0:30:36.740,0:30:40.960 -we've got a set of timers and counters to keep -track of what we're doing, this is really more - -0:30:40.960,0:30:43.490 -for accounting than anything else but - -0:30:43.490,0:30:46.970 -applications may want to have access to that. - -0:30:46.970,0:30:51.720 -we have a set of identifiers that we're -going to use for things like accounting, - -0:30:51.720,0:30:54.830 -protection and scheduling and so on - -0:30:54.830,0:30:55.820 -and one of the - -0:30:55.820,0:31:00.320 -early philosophies of UNIX was to try -and keep it simple. 
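The signal mechanism described above (register a routine, and it gets called without your knowing how the hardware noticed the event) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the lecture: it assumes a UNIX-like system and Python's standard `os` and `signal` modules, it uses SIGUSR1 sent to the process itself as a stand-in for a hardware-raised condition such as divide by zero, and the name `run_with_handler` is hypothetical.

```python
import os
import signal


def run_with_handler(signum=signal.SIGUSR1):
    """Register a handler for signum, send that signal to ourselves,
    and return the list of signal numbers the handler saw."""
    seen = []

    def handler(received_signum, frame):
        # We are only told which signal arrived; no hardware vector
        # or assembly language is involved on our side.
        seen.append(received_signum)

    # Register the routine we want to have called.
    previous = signal.signal(signum, handler)
    try:
        # Stand-in for the exceptional event (assumption: self-sent signal).
        os.kill(os.getpid(), signum)
    finally:
        # Put back whatever disposition was in place before.
        signal.signal(signum, previous)
    return seen
```

Calling `run_with_handler()` from the main thread of a program returns a list containing the one signal number that was delivered.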
- -0:31:00.320,0:31:02.630 -operating systems had gotten very baroque - -0:31:02.630,0:31:04.490 -in particular the thing that - -0:31:04.490,0:31:07.350 -predated UNIX was a thing called -Multics - -0:31:07.350,0:31:12.820 -Multics was a joint project between -Honeywell, a big computer manufacturer of the - -0:31:12.820,0:31:15.740 -time - -0:31:15.740,0:31:17.129 -AT&T Bell laboratories - -0:31:17.129,0:31:19.750 -the big industrial laboratory at that time - -0:31:19.750,0:31:21.380 -and MIT - -0:31:21.380,0:31:23.430 -a big university then and - -0:31:23.430,0:31:24.690 -still today - -0:31:24.690,0:31:29.259 -and those three organizations got -together to try and build this - -0:31:29.259,0:31:31.400 -time sharing operating system - -0:31:31.400,0:31:32.280 -and it - -0:31:32.280,0:31:33.770 -it just got bigger and - -0:31:33.770,0:31:37.160 -more grandiose and more complex and never -finished - -0:31:37.160,0:31:38.979 -because as soon as they would sort of see - -0:31:38.979,0:31:42.709 -oh we know how to do that but we could -do this other thing too and so then they would tear it - -0:31:42.709,0:31:43.429 -apart and - -0:31:43.429,0:31:46.440 -they never really got to something that - -0:31:46.440,0:31:48.210 -could be put into production - -0:31:48.210,0:31:49.919 -and so - -0:31:49.919,0:31:50.570 -AT&T - -0:31:50.570,0:31:54.340 -Bell laboratories decided to pull out of -that project - -0:31:54.340,0:31:55.940 -and - -0:31:55.940,0:32:00.000 -two of the people that had been working on -that project, Ken Thompson and Dennis Ritchie - -0:32:00.000,0:32:04.390 -were sort of bummed because they were now -back to typing cards and putting them through - -0:32:04.390,0:32:05.259 -card readers and - -0:32:05.259,0:32:07.960 -they had gotten used to the idea that you could -actually - -0:32:07.960,0:32:11.559 -sit at an ASR-33 teletype and interact -with your computer - -0:32:11.559,0:32:13.440 -and so - -0:32:13.440,0:32:18.230 -they found an old PDP-8 
sitting off in -the corner that had been abandoned - -0:32:18.230,0:32:22.120 -and started working on this little tiny operating -system which they called UNIX - -0:32:22.120,0:32:26.549 -which eventually moved to the PDP-11 and -became what we have today - -0:32:26.549,0:32:28.050 -but because - -0:32:28.050,0:32:32.120 -they were coming first of all from Multics -where everything had been done - -0:32:32.120,0:32:34.110 -in great grandiose detail - -0:32:34.110,0:32:37.549 -and because there fundamentally were two -of them working on it and they wanted to get something - -0:32:37.549,0:32:38.370 -done - -0:32:38.370,0:32:40.130 -within a year or so - -0:32:40.130,0:32:41.529 -one of their philosophies was - -0:32:41.529,0:32:44.099 -let's find the one way of doing things - -0:32:44.099,0:32:48.180 -let's not have eight ways from Sunday let's just -get the one way - -0:32:48.180,0:32:53.860 -and that's what we will provide. So what is -the sort of core set of things that we need? 
- -0:32:53.860,0:32:58.620 -well first thing is when it comes to identifiers, -let's not have you know - -0:32:58.620,0:33:00.430 -eighty thousand different identifiers - -0:33:00.430,0:33:03.140 -so they came up with process identifiers, - -0:33:03.140,0:33:09.620 -user identifier and at that time a single group -identifier, later expanded - -0:33:09.620,0:33:14.200 -and they used that set of identifiers for everything -so it's used for accounting, used for making - -0:33:14.200,0:33:17.410 -protection decisions, used for scheduling -decisions - -0:33:17.410,0:33:19.470 -and - -0:33:19.470,0:33:24.279 -again it was the simplicity of the thing which -was what was driving their decision - -0:33:24.279,0:33:28.840 -but there were really sort of two key ideas -that they had - -0:33:28.840,0:33:30.880 -that really made the difference - -0:33:30.880,0:33:32.539 -that's what set them apart - -0:33:32.539,0:33:34.749 -from what everybody else had done before them - -0:33:34.749,0:33:35.450 -and which - -0:33:35.450,0:33:39.740 -in retrospect is something that has been pervasive -more or less ever since - -0:33:39.740,0:33:41.869 -the first of these was the notion - -0:33:41.869,0:33:44.840 -that we have a uniform descriptor space - -0:33:44.840,0:33:46.289 -that is - -0:33:46.289,0:33:51.250 -given a descriptor it can reference -any I/O device - -0:33:51.250,0:33:53.650 -or even any kind of I/O channel - -0:33:53.650,0:33:58.270 -so you can have a descriptor for a terminal -or descriptor for a file or descriptor for - -0:33:58.270,0:34:02.240 -a disk or descriptor for a pipe or descriptor -for a socket - -0:34:02.240,0:34:03.500 -and - -0:34:03.500,0:34:04.790 -you don't need to know - -0:34:04.790,0:34:07.940 -what it references in order to be able to read -and write that thing - -0:34:07.940,0:34:11.290 -so if I hand you a descriptor -you can read from that descriptor or you can write - -0:34:11.290,0:34:13.259 -to that descriptor - -0:34:13.259,0:34:15.189 -and - 
-0:34:15.189,0:34:17.359 -the correct thing will happen - -0:34:17.359,0:34:19.089 -and you'd say well - -0:34:19.089,0:34:23.629 -that's so obvious I mean how else could you -possibly think of doing it? - -0:34:23.629,0:34:25.179 -well predating UNIX - -0:34:25.179,0:34:28.059 -everything was done with - -0:34:28.059,0:34:29.379 -a little subsystem - -0:34:29.379,0:34:33.419 -that would open a file, read a file, write a -file, close a file - -0:34:33.419,0:34:37.429 -and there was another set of system calls which -would open a terminal, read a terminal, write a terminal, - -0:34:37.429,0:34:38.089 -close a terminal - -0:34:38.089,0:34:39.210 -and yet another one - -0:34:39.210,0:34:42.409 -which was create a pipe, read a pipe, -write a pipe and so on. - -0:34:42.409,0:34:47.699 -so if you are just a drop dead stupid -program like say cat - -0:34:47.699,0:34:51.579 -you would have to have code in there to say was -my input a terminal in which case I need to - -0:34:51.579,0:34:53.159 -use the read terminal - -0:34:53.159,0:34:57.419 -or is it a file in which case I need -to use read file or is it a pipe in which case - -0:34:57.419,0:34:59.189 -I need to use read pipe - -0:34:59.189,0:35:01.860 -and so the program itself had to have all -this - -0:35:01.860,0:35:02.859 -coding in it - -0:35:02.859,0:35:04.409 -whereas when they went to - -0:35:04.409,0:35:07.159 -the uniform descriptor space - -0:35:07.159,0:35:09.630 -cat doesn't know, it doesn't need to know -it just says - -0:35:09.630,0:35:10.819 -read my input, - -0:35:10.819,0:35:13.979 -write the output - -0:35:13.979,0:35:17.059 -and it works and we add a new type of descriptor - -0:35:17.059,0:35:17.600 -and - -0:35:17.600,0:35:21.700 -cat just continues to work just as it always -did. 
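The uniform descriptor idea above can be sketched as a cat-like copy loop that never asks what kind of object a descriptor refers to. This is a minimal sketch, not the real cat: it assumes a UNIX-like system and Python's standard `os.read` and `os.write` calls, which are thin wrappers over the read and write system calls, and the name `copy_descriptor` is hypothetical.

```python
import os


def copy_descriptor(in_fd, out_fd, chunk_size=4096):
    """Copy bytes from in_fd to out_fd until end of input.

    The same loop serves a file, a terminal, a pipe, or a socket,
    because read and write are the only operations it uses.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        data = os.read(in_fd, chunk_size)
        if not data:
            # An empty read means end of input on any kind of descriptor.
            return total
        # os.write may accept fewer bytes than offered, so loop until done.
        while data:
            written = os.write(out_fd, data)
            total += written
            data = data[written:]
```

With the conventional descriptor numbers, `copy_descriptor(0, 1)` copies standard input to standard output, whatever each of them happens to be connected to.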
- -0:35:21.700,0:35:24.199 -So this proved to be a very powerful construct - -0:35:24.199,0:35:27.019 -and pretty much every operating system after -UNIX - -0:35:27.019,0:35:28.659 -did that, there's - -0:35:28.659,0:35:30.210 -one exception of a - -0:35:30.210,0:35:32.549 -large company in the Pacific Northwest - -0:35:32.549,0:35:35.830 -that still has a not quite uniform descriptor -space - -0:35:35.830,0:35:38.380 -but that's part of their legacy that really - -0:35:38.380,0:35:39.900 -they're working on that. - -0:35:39.900,0:35:42.009 -Longhorn will be here. - -0:35:42.009,0:35:43.939 -and anyway - -0:35:43.939,0:35:46.190 -this set of facilities then - -0:35:46.190,0:35:50.150 -makes up the UNIX virtual machine - -0:35:50.150,0:35:51.559 -and - -0:35:51.559,0:35:55.559 -in some sense we still see virtual machines -being used today in fact we're seeing sort - -0:35:55.559,0:35:56.749 -of a reversion - -0:35:56.749,0:36:01.429 -back to some of the IBM stuff in things -like VMware - -0:36:01.429,0:36:03.079 -which - -0:36:03.079,0:36:07.029 -essentially allow you to go back to booting -native operating systems again so it's sort of - -0:36:07.029,0:36:08.280 -interesting to watch - -0:36:08.280,0:36:09.060 -the sort of - -0:36:09.060,0:36:12.919 -pendulum going back and forth -of what's the correct layer - -0:36:12.919,0:36:14.609 -for doing - -0:36:14.609,0:36:18.890 -virtual machines - -0:36:18.890,0:36:22.499 -Okay? so far so good? 
- -0:36:22.499,0:36:24.719 -all right so I said that there were - -0:36:24.719,0:36:27.160 -two key ideas that UNIX had - -0:36:27.160,0:36:30.279 -the first of these being the uniform descriptor -space - -0:36:30.279,0:36:35.819 -the second one which was really critical was -this notion of processes as a commodity - -0:36:35.819,0:36:37.309 -item - -0:36:37.309,0:36:40.220 -so here on Page 17 I've tried to lay -it out - -0:36:40.220,0:36:41.090 -the - -0:36:41.090,0:36:44.159 -that the components that make up a process - -0:36:44.159,0:36:45.759 -and - -0:36:45.759,0:36:50.359 -what do I really mean when I say a process as -a commodity item - -0:36:50.359,0:36:53.650 -okay leading up to - -0:36:53.650,0:36:54.689 -UNIX - -0:36:54.689,0:36:56.800 -the systems that pre-dated it, - -0:36:56.800,0:36:59.200 -processes were these very large - -0:36:59.200,0:37:02.169 -heavyweight expensive things - -0:37:02.169,0:37:02.779 -and - -0:37:02.779,0:37:04.539 -if you look at - -0:37:04.539,0:37:08.629 -MVS which was the operating system -that ran on IBM for doing multiple processing - -0:37:08.629,0:37:10.509 -and - -0:37:10.509,0:37:13.799 -the system administrator would decide at boot -time - -0:37:13.799,0:37:17.019 -what degree of multiprocessing they wish -to support - -0:37:17.019,0:37:18.140 -so they'd say well - -0:37:18.140,0:37:20.739 -well, we'll let upto six things happen at once - -0:37:20.739,0:37:22.490 -and so as part of booting up - -0:37:22.490,0:37:24.419 -they would create six - -0:37:24.419,0:37:25.349 -processes - -0:37:25.349,0:37:30.059 -and now you as a user if you wanted to do -something let's say you wanted to - -0:37:30.059,0:37:32.009 -compile and run a program - -0:37:32.009,0:37:34.960 -you would be given a process - -0:37:34.960,0:37:36.019 -and it was up to you - -0:37:36.019,0:37:39.369 -to figure out how to stage what you needed -done - -0:37:39.369,0:37:39.819 -and - -0:37:39.819,0:37:43.930 -that this was often fairly complex - 
-0:37:43.930,0:37:47.880 -and so you would have to write out all the -steps that you wanted - -0:37:47.880,0:37:50.300 -in this wonderful thing called JCL - -0:37:50.300,0:37:52.259 -Job Control Language. - -0:37:52.259,0:37:56.650 -Job Control Language was the sendmail configuration -file of the sixties - -0:37:56.650,0:38:00.679 -there were people whose sole job at the company -was how to put this stuff together 'cause - -0:38:00.679,0:38:04.189 -all you had to do is get one extra space or -a missing comma - -0:38:04.189,0:38:05.000 -something in there - -0:38:05.000,0:38:08.630 -and the whole thing would just blow up. it would -just sort of spit the card deck back at - -0:38:08.630,0:38:09.799 -you and say well - -0:38:09.799,0:38:13.500 -somewhere in there is a mistake that's sort of -in the general area of this card - -0:38:13.500,0:38:15.549 -and I can't deal with it. Fix it. - -0:38:15.549,0:38:16.489 -and of course - -0:38:16.489,0:38:20.550 -in those days it wasn't just a matter of hitting -carriage return, you know, you had to - -0:38:20.550,0:38:25.239 -get your deck, pull out the card, type the -new one, put it back in and re-submit it - -0:38:25.239,0:38:28.729 -and heaven forbid you couldn't touch that -card reader you know, it had to be done by - -0:38:28.729,0:38:29.970 -an operator - -0:38:29.970,0:38:32.869 -so the card deck would be read through, it would -disappear and - -0:38:32.869,0:38:36.800 -you know if you were lucky a few minutes later -if you were not lucky a few hours later - -0:38:36.800,0:38:37.849 -you would get - -0:38:37.849,0:38:39.570 -a print out - -0:38:39.570,0:38:43.419 -which was what had happened and then you could -look at it and you know - -0:38:43.419,0:38:47.209 -I put a comma in the wrong place I guess -I get to do it all again - -0:38:47.209,0:38:49.930 -so - -0:38:49.930,0:38:54.940 -the thing you would need to do there for compiling and running a program - -0:38:54.940,0:38:59.579 -was you'd have to break 
it into these steps. well -I need to run the preprocessor - -0:38:59.579,0:39:04.670 -and so clean out whatever gunk was left -over on that process from the previous user - -0:39:04.670,0:39:06.240 -put the preprocessor in there - -0:39:06.240,0:39:10.530 -and then read from this file here let's -say I gotta put it somewhere so create a - -0:39:10.530,0:39:12.510 -scratch file over on this disk and - -0:39:12.510,0:39:17.299 -it was excruciating detail like how many cylinders -and how many tracks and this and that - -0:39:17.299,0:39:19.139 -blocks blah blah blah - -0:39:19.139,0:39:23.119 -and don't forget any of those parameters 'cause -it'll spit it out if you do - -0:39:23.119,0:39:26.890 -and so then it would run the first step and -if it's successful then you'd have sitting - -0:39:26.890,0:39:28.899 -in this scratch file that you had created - -0:39:28.899,0:39:33.100 -the output of the preprocessor and then -you'd load the first pass of the compiler - -0:39:33.100,0:39:36.930 -and you say now read from that scratch file -and create this other scratch file over here and - -0:39:36.930,0:39:39.450 -when that's successful we need to delete that -one - -0:39:39.450,0:39:43.830 -and then load the second pass, put that back -into another scratch file and then we run this - -0:39:43.830,0:39:45.950 -assembler, and the optimizer then the - -0:39:45.950,0:39:47.750 -loader this and that - -0:39:47.750,0:39:49.410 -finally run the program - -0:39:49.410,0:39:50.900 -and if all goes well - -0:39:50.900,0:39:57.029 -you know at step sixteen out comes the answer - -0:39:57.029,0:39:58.129 -forty two. 
so UNIX - -0:39:58.129,0:40:00.819 -said, look this is silly - -0:40:00.819,0:40:02.880 -a lot of this is just - -0:40:02.880,0:40:04.310 -bookkeeping - -0:40:04.310,0:40:07.249 -and computers do bookkeeping really well - -0:40:07.249,0:40:12.179 -and you'll say yeah but it's going to take -all these cycles, it's like - -0:40:12.179,0:40:16.309 -computers are supposed to be labor-saving -devices right? so - -0:40:16.309,0:40:20.150 -they came up with this notion that they would -create processes on the fly as needed - -0:40:20.150,0:40:21.159 -you had - -0:40:21.159,0:40:25.549 -you had a preprocessor and two -steps of the compiler and then - -0:40:25.549,0:40:27.109 -optimizer and then a loader - -0:40:27.109,0:40:29.410 -we just create, boom, seven processes - -0:40:29.410,0:40:31.920 -and we connect them together with pipes - -0:40:31.920,0:40:35.180 -and so we take the input and you know run -it - -0:40:35.180,0:40:38.270 -through the pipes and you know out the end -you get the - -0:40:38.270,0:40:39.629 -executable - -0:40:39.629,0:40:40.030 -and - -0:40:40.030,0:40:42.880 -we will simply create each of these processes - -0:40:42.880,0:40:44.650 -and - -0:40:44.650,0:40:46.549 -so you as a user just - -0:40:46.549,0:40:49.479 -type you know the C compiler and it just - -0:40:49.479,0:40:52.429 -forked these things, piped them together, got the result - -0:40:52.429,0:40:53.640 -and - -0:40:53.640,0:40:57.509 -then once it was done with these processes it -just threw them away so any time you'd create a - -0:40:57.509,0:41:00.479 -new process it came to you pristine clean - -0:41:00.479,0:41:04.239 -and you didn't need to -put everything in intermediate files - -0:41:04.239,0:41:07.549 -the fact of the matter is in the early days - -0:41:07.549,0:41:08.129 -those computers - -0:41:08.129,0:41:11.910 -didn't really have enough memory to support -all that stuff at once so - -0:41:11.910,0:41:15.809 -behind you those pipes were 
actually implemented -as files - -0:41:15.809,0:41:19.319 -but at least you didn't have to remember to create -them and delete them - -0:41:19.319,0:41:20.200 -and deal with them - -0:41:20.200,0:41:24.020 -as far as you were concerned it just looked like stuff -flowing through pipes and of course today it - -0:41:24.020,0:41:24.490 -just - -0:41:24.490,0:41:27.989 -does flow through pipes in memory - -0:41:27.989,0:41:29.439 -okay so - -0:41:29.439,0:41:33.689 -this notion then that we're just gonna -create processes on the fly as needed and - -0:41:33.689,0:41:35.559 -connect them together as needed - -0:41:35.559,0:41:38.039 -was a novel concept - -0:41:38.039,0:41:43.599 -and it wasn't that they had somehow mysteriously figured -out how to create processes cheaply - -0:41:43.599,0:41:44.839 -'cause they hadn't - -0:41:44.839,0:41:46.180 -they were still - -0:41:46.180,0:41:49.959 -really expensive to create - -0:41:49.959,0:41:52.210 -but that extra effort - -0:41:52.210,0:41:53.029 -was - -0:41:53.029,0:41:56.089 -worth it because it was saving a lot of programming -time - -0:41:56.089,0:41:59.809 -so my favorite example is you run ls - -0:41:59.809,0:42:01.810 -so we have to create a process - -0:42:01.810,0:42:04.259 -load the ls binary into it - -0:42:04.259,0:42:06.180 -it prints a line or two on your screen - -0:42:06.180,0:42:10.609 -and we tear the entire thing down and return -all its resources back to the system - -0:42:10.609,0:42:14.979 -more than ninety percent of the cost of running -ls is creating and destroying the process - -0:42:14.979,0:42:19.239 -a tiny fraction of it is actually running -ls - -0:42:19.239,0:42:24.259 -but it goes so fast, who cares right - -0:42:24.259,0:42:25.749 -so the point is that - -0:42:25.749,0:42:30.039 -that concept of just creating things as -needed - -0:42:30.039,0:42:31.780 -again was very powerful - -0:42:31.780,0:42:35.709 -and is one that is just pervasive today - -0:42:35.709,0:42:38.639 -okay so what is a process 
actually made up -of - -0:42:38.639,0:42:43.179 -it gets some amount of CPU time or at -least we do dearly hope that it gets some - -0:42:43.179,0:42:46.050 -amount of CPU time, the lack of getting -CPU time - -0:42:46.050,0:42:46.670 -that makes it - -0:42:46.670,0:42:47.979 -a computer so sluggish - -0:42:47.979,0:42:49.409 -of course - -0:42:49.409,0:42:51.920 -others really boils down to scheduling - -0:42:51.920,0:42:54.249 -and we're going to talk about scheduling - -0:42:54.249,0:42:56.279 -probably more than you care to - -0:42:56.279,0:42:59.219 -in a couple weeks time - -0:42:59.219,0:43:01.619 -we have the asynchronous events - -0:43:01.619,0:43:04.569 -these are the external events that - -0:43:04.569,0:43:05.659 -are coming in - -0:43:05.659,0:43:07.679 -so - -0:43:07.679,0:43:10.169 -they may be either things that - -0:43:10.169,0:43:14.339 -were coming in from the outside world like -start, stop and quit - -0:43:14.339,0:43:15.279 -oh - -0:43:15.279,0:43:18.170 -out-of-band data arrival notification that kind -of thing - -0:43:18.170,0:43:22.339 -or it may in fact be things that the program -is bringing down upon itself - -0:43:22.339,0:43:25.590 -such as a segment fault, a divide by zero - -0:43:25.590,0:43:26.910 -and some other - -0:43:26.910,0:43:31.959 -what would normally be viewed as incorrect -operation - -0:43:31.959,0:43:35.849 -and so we'll talk about that when we talk about -signals - -0:43:35.849,0:43:37.039 -every program - -0:43:37.039,0:43:38.899 -gets some amount of memory - -0:43:38.899,0:43:42.659 -it gets an initial amount when it starts -up injured generally allocates more as it - -0:43:42.659,0:43:45.229 -goes along - -0:43:45.229,0:43:49.429 -this of course we will deal with very extensively -will spend an entire week on it - -0:43:49.429,0:43:54.249 -when we talk about how virtual memory is implemented - -0:43:54.249,0:43:54.609 -and - -0:43:54.609,0:43:57.429 -then we get I/O descriptors - -0:43:57.429,0:44:02.259 -I used to 
say that every program had to have -at least one I/O descriptor since - -0:44:02.259,0:44:04.910 -if it had absolutely no input - -0:44:04.910,0:44:06.329 -absolutely no output - -0:44:06.329,0:44:09.049 -then it was sort of pointless - -0:44:09.049,0:44:12.900 -of course I had to have one of my students -come up and point out to me there is a - -0:44:12.900,0:44:13.849 -class of programs - -0:44:13.849,0:44:16.469 -which don't need I/O descriptors - -0:44:16.469,0:44:17.440 -and that is - -0:44:17.440,0:44:19.549 -these things called benchmarks - -0:44:19.549,0:44:23.249 -they just compute something all we really care -about is how long it takes them to compute - -0:44:23.249,0:44:24.959 -we don't actually care what the answer is - -0:44:24.959,0:44:26.019 -in theory we don't - -0:44:26.019,0:44:29.779 -I personally like my benchmarks to output -something so I can see they're - -0:44:29.779,0:44:31.489 -computing the right thing - -0:44:31.489,0:44:33.169 -but in theory - -0:44:33.169,0:44:35.919 -that wouldn't be necessary - -0:44:35.919,0:44:38.650 -outside of that class of programs - -0:44:38.650,0:44:42.670 -everything needs some sort of descriptors and -of course we'll talk about descriptors - -0:44:42.670,0:44:43.659 -quite extensively - -0:44:43.659,0:44:47.349 -as we go through the I/O subsystem - -0:44:47.349,0:44:50.969 -okay so the executive summary is that processes -are - -0:44:50.969,0:44:54.969 -the fundamental service that is provided by -UNIX - -0:44:54.969,0:44:58.430 -and - -0:44:58.430,0:45:02.849 -what we're going to spend essentially the -next two and a half weeks working on - -0:45:02.849,0:45:04.769 -is - -0:45:04.769,0:45:07.079 -what makes up processes - -0:45:07.079,0:45:10.180 -we'll go into much more detail about each of these -four points - -0:45:10.180,0:45:11.769 -and - -0:45:11.769,0:45:13.630 -then how do we actually go about - -0:45:13.630,0:45:14.390 -providing that - -0:45:14.390,0:45:16.639 -bit of service - 
-0:45:16.639,0:45:17.900 -the next thing that I'm - -0:45:17.900,0:45:22.210 -going to do now is just go through and lay -out some of the terminology that - -0:45:22.210,0:45:23.239 -we have when - -0:45:23.239,0:45:25.130 -we're talking about processes - -0:45:25.130,0:45:29.229 -so this is sort of the big picture here we're -on page eighteen - -0:45:29.229,0:45:30.669 -and - -0:45:30.669,0:45:33.669 -you can see we have sort of three bits that -make up - -0:45:33.669,0:45:36.640 -the system - -0:45:36.640,0:45:39.029 -we have the currently running user process - -0:45:39.029,0:45:41.180 -and then what we call the top half of the kernel - -0:45:41.180,0:45:43.699 -and the bottom half of the kernel - -0:45:43.699,0:45:47.049 -now this would be a picture for a uniprocessor - -0:45:47.049,0:45:49.299 -so one CPU - -0:45:49.299,0:45:51.209 -if we had a multiprocessor - -0:45:51.209,0:45:54.009 -then we would have - -0:45:54.009,0:45:57.130 -one instance of the kernel - -0:45:57.130,0:45:59.529 -but multiple instances of the user process - -0:45:59.529,0:46:02.879 -but for any given CPU on a multiprocessor - -0:46:02.879,0:46:05.709 -it is running exactly one process - -0:46:05.709,0:46:09.309 -so you may think that we're running four or five -processes all at once - -0:46:09.309,0:46:14.319 -but the fact of the matter is that at any instant -in time there's only one process which is - -0:46:14.319,0:46:16.299 -actually running - -0:46:16.299,0:46:18.609 -and - -0:46:18.609,0:46:21.429 -that is the one that we have loaded in the system - -0:46:21.429,0:46:25.199 -now we give the illusion that we're running -lots of things because we switch between them - -0:46:25.199,0:46:26.100 -rather quickly - -0:46:26.100,0:46:29.269 -so it looks like things are happening in all -windows at once - -0:46:29.269,0:46:31.430 -but in reality - -0:46:31.430,0:46:33.619 -that's not really happening - -0:46:33.619,0:46:36.440 -okay so there is a set of properties that I want to -look at - 
-0:46:36.440,0:46:40.899 -that have to do with each one of these parts here - -0:46:40.899,0:46:44.359 -but just to sort of look at it from the -big picture perspective - -0:46:44.359,0:46:45.970 -what you see here - -0:46:45.970,0:46:47.180 -is - -0:46:47.180,0:46:51.549 -there is a boundary between the user process -and the top half of the kernel - -0:46:51.549,0:46:54.949 -which is really just like a glorified subroutine -call - -0:46:54.949,0:46:59.539 -it's a lot like calling into a library routine -like calling strcat, strcpy or something - -0:46:59.539,0:47:00.319 -like that - -0:47:00.319,0:47:03.679 -when you do a system call - -0:47:03.679,0:47:05.650 -we take that same set of parameters - -0:47:05.650,0:47:08.009 -now there is sort of a - -0:47:08.009,0:47:09.780 -brick wall here if you will - -0:47:09.780,0:47:11.380 -that is protecting - -0:47:11.380,0:47:13.680 -the top half of the kernel - -0:47:13.680,0:47:15.299 -from the application - -0:47:15.299,0:47:18.899 -I'll go into some more detail about how that -actually gets implemented - -0:47:18.899,0:47:22.729 -but in essence you can think of it -as there's sort of this Wailing Wall with these little - -0:47:22.729,0:47:24.990 -chinks in it and you can sort of push a request -through - -0:47:24.990,0:47:28.230 -and somebody on the other side sort of pulls that in -looks at it and decides whether they're going - -0:47:28.230,0:47:28.690 -to - -0:47:28.690,0:47:30.769 -deign to provide service to you - -0:47:30.769,0:47:34.229 -and if they do then they sort of send it back - -0:47:34.229,0:47:37.649 -unlike a library where you can just sort -of reach in and walk around if you want to - -0:47:37.649,0:47:38.290 -you know - -0:47:38.290,0:47:40.950 -with good programming practice you don't do that -but - -0:47:40.950,0:47:43.049 -you could - -0:47:43.049,0:47:44.579 -all right so - -0:47:44.579,0:47:49.089 -the top half of the kernel really looks -a lot like - -0:47:49.089,0:47:50.509 -a big library - 
-0:47:50.509,0:47:53.509 -it just happens to be a library of -routines - -0:47:53.509,0:47:57.599 -that deal with things where processes need -to interact with each other - -0:47:57.599,0:48:01.399 -in fact many people don't understand -what the difference is between the C - -0:48:01.399,0:48:03.259 -library and the top half of the kernel - -0:48:03.259,0:48:08.020 -if it's something that you're doing that -no other process needs to know about - -0:48:08.020,0:48:09.799 -then it can be in the C library - -0:48:09.799,0:48:13.829 -so if you call strcat to concatenate two -strings together - -0:48:13.829,0:48:17.599 -nobody else needs to know you're doing that -you don't need to coordinate with anybody - -0:48:17.599,0:48:19.000 -else that you're doing that - -0:48:19.000,0:48:20.160 -it's just happening - -0:48:20.160,0:48:21.979 -so that goes in the C library. - -0:48:21.979,0:48:24.489 -on the other hand if you're reading or writing -a file - -0:48:24.489,0:48:28.029 -there may be other processes that are also -reading and writing that file - -0:48:28.029,0:48:29.910 -and therefore that - -0:48:29.910,0:48:31.579 -has to be done by the kernel - -0:48:31.579,0:48:33.120 -because it can coordinate - -0:48:33.120,0:48:37.189 -all the different processes that are trying to access -that file. 
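The split described above, between work that stays inside one process and work the kernel has to coordinate, can be sketched in C. This is only an illustration: the function names `build_message` and `send_message` are hypothetical and not from the lecture, while `strcat` and `write` are the standard C library routine and POSIX system call.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: purely private work. strcat only touches this
 * process's own memory, so it can live in the C library and never
 * has to enter the kernel. Returns the message length, or 0 if the
 * caller's buffer is too small. */
size_t build_message(char *buf, size_t buflen)
{
    if (buf == NULL || buflen < 15)
        return 0;
    buf[0] = '\0';
    strcat(buf, "hello, ");
    strcat(buf, "kernel\n");
    return strlen(buf);
}

/* Hypothetical helper: the descriptor may refer to a file or terminal
 * that other processes are also using, so the transfer is handed to
 * the kernel through the write() system call. */
ssize_t send_message(int fd)
{
    char buf[32];
    size_t len = build_message(buf, sizeof buf);
    return write(fd, buf, len);
}
```

Calling `send_message(STDOUT_FILENO)` from `main` would print the message; only the `write` step crosses the user/kernel boundary.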
 - -0:48:37.189,0:48:40.529 -so the top half of the kernel is pretty straightforward -code - -0:48:40.529,0:48:45.539 -it looks a lot like any other library that -you would write if you look at top half kernel - -0:48:45.539,0:48:49.640 -code you know you see oh read comes in -it's got these parameters we muck around we - -0:48:49.640,0:48:53.719 -get some data we put it in the buffer and -we return back - -0:48:53.719,0:48:57.470 -and in fact writing code for the top half of -the kernel is - -0:48:57.470,0:48:59.729 -not all that difficult to do - -0:48:59.729,0:49:00.989 -it's - -0:49:00.989,0:49:01.959 -you have - -0:49:01.959,0:49:05.939 -many of the same properties that you would -when you're writing user level application - -0:49:05.939,0:49:07.529 -code - -0:49:07.529,0:49:11.779 -the bottom half of the kernel is where things -start to get nasty - -0:49:11.779,0:49:14.820 -because the bottom half of the kernel is the part -of the system - -0:49:14.820,0:49:18.769 -that deals with all of the asynchronous events -in the system - -0:49:18.769,0:49:22.179 -it's things like device drivers, - -0:49:22.179,0:49:23.779 -timers - -0:49:23.779,0:49:25.010 -that level of thing - -0:49:25.010,0:49:28.029 -that are driven by hardware events - -0:49:28.029,0:49:28.659 -so - -0:49:28.659,0:49:31.459 -for example a packet arrives on the network - -0:49:31.459,0:49:33.670 -that causes an interrupt to come in and - -0:49:33.670,0:49:36.729 -that will be handled by the bottom half of -the kernel - -0:49:36.729,0:49:38.829 -and historically - -0:49:38.829,0:49:43.079 -when an interrupt came in it preempted whatever -else was going on - -0:49:43.079,0:49:45.400 -and it ran until it finished and then it returned - -0:49:45.400,0:49:46.539 -and it could not - -0:49:46.539,0:49:49.439 -go to sleep to wait for resources or other -things - -0:49:49.439,0:49:51.339 -in current systems - -0:49:51.339,0:49:54.549 -you can actually go to sleep in the interrupt driver -and wait 
for - -0:49:54.549,0:49:56.739 -some other activity to complete - -0:49:56.739,0:49:58.259 -it is however - -0:49:58.259,0:50:00.799 -not a good idea to do that - -0:50:00.799,0:50:01.909 -because - -0:50:01.909,0:50:06.739 -the usual case of most device drivers is they -can finish whatever they're doing in an interrupt - -0:50:06.739,0:50:08.579 -without ever blocking - -0:50:08.579,0:50:09.580 -and so - -0:50:09.580,0:50:13.649 -when an interrupt comes in we assume that you're -not going to sleep - -0:50:13.649,0:50:14.710 -and if you actually - -0:50:14.710,0:50:17.219 -then go to sleep it's oh man - -0:50:17.219,0:50:20.469 -you didn't tell us you're going to do this we -have to go off and do a whole lot of other work - -0:50:20.469,0:50:23.029 -that we hadn't originally planned on doing - -0:50:23.029,0:50:25.460 -so if you go to sleep in a device driver - -0:50:25.460,0:50:28.209 -you are taking a very serious performance hit - -0:50:28.209,0:50:31.019 -so it's highly recommended that you don't -do that - -0:50:31.019,0:50:33.130 -but if you have to you can - -0:50:33.130,0:50:35.809 -and it's because of this historic behavior - -0:50:35.809,0:50:39.899 -of not being able to sleep in the bottom half -of the kernel - -0:50:39.899,0:50:42.119 -that you have certain properties that have - -0:50:42.119,0:50:44.769 -taken over in device drivers - -0:50:44.769,0:50:45.940 -and that is - -0:50:45.940,0:50:50.369 -that a device driver should be handed all -the resources it needs to get its job done - -0:50:50.369,0:50:54.490 -you don't tell a disk device driver -go read this - -0:50:54.490,0:50:56.549 -and put it somewhere - -0:50:56.549,0:50:57.580 -you have to say - -0:50:57.580,0:50:59.410 -go read this particular block - -0:50:59.410,0:51:02.650 -here is a chunk of memory that I want that -data to be put in - -0:51:02.650,0:51:03.959 -and - -0:51:03.959,0:51:06.169 -notify me when it's done - -0:51:06.169,0:51:06.970 -because - -0:51:06.970,0:51:10.660 -things like 
allocating memory are classic -places where you end up having to go to sleep - -0:51:10.660,0:51:12.939 -to wait for stuff to happen - -0:51:12.939,0:51:14.449 -and - -0:51:14.449,0:51:16.390 -historically you couldn't do that - -0:51:16.390,0:51:18.640 -and even currently you don't want to have to do that - -0:51:18.640,0:51:23.400 -so device drivers generally have all their -resources preallocated - -0:51:23.400,0:51:25.169 -and then they can just go - -0:51:25.169,0:51:27.279 -the one place where this doesn't work - -0:51:27.279,0:51:29.029 -is the network - -0:51:29.029,0:51:30.929 -and in particular - -0:51:30.929,0:51:34.630 -you don't know when somebody's going to send -packets to you - -0:51:34.630,0:51:37.040 -you could say well you can look at the open connections - -0:51:37.040,0:51:39.360 -but if you're doing something like IP forwarding - -0:51:39.360,0:51:40.969 -there's no - -0:51:40.969,0:51:45.039 -top half state that's dealing with these packets -they're just coming in on one interface being - -0:51:45.039,0:51:46.719 -sent out on another interface - -0:51:46.719,0:51:50.630 -they never pass through any part of the top -half of the kernel - -0:51:50.630,0:51:53.529 -and so in the case of network device drivers - -0:51:53.529,0:51:56.149 -they need to allocate memory - -0:51:56.149,0:51:56.640 -and - -0:51:56.640,0:51:58.829 -if memory gets into short supply - -0:51:58.829,0:52:01.689 -and they try to allocate memory and it's not -available - -0:52:01.689,0:52:05.049 -they historically couldn't wait for memory to be -available - -0:52:05.049,0:52:08.380 -and even in practice today don't wait - - -0:52:08.380,0:52:09.580 -for memory to become available - -0:52:09.580,0:52:12.469 -they simply drop the packet on the floor - -0:52:12.469,0:52:18.109 -it's like well I didn't have any place to -put it sorry oops - -0:52:18.109,0:52:20.940 -now that doesn't cause incorrect behavior - -0:52:20.940,0:52:24.369 -because the higher level protocols will retransmit - 
-0:52:24.369,0:52:29.140 -but it does cause great performance problems -because retransmission means that connections - -0:52:29.140,0:52:29.879 -stall - -0:52:29.879,0:52:31.110 -they have to back up - -0:52:31.110,0:52:33.010 -they have to resend data - -0:52:33.010,0:52:33.739 -and so on - -0:52:33.739,0:52:38.739 -so you really want to avoid dropping packets -if you can possibly help it - -0:52:38.739,0:52:42.029 -and consequently - -0:52:42.029,0:52:43.420 -we tend to - -0:52:43.420,0:52:46.499 -preallocate a certain amount of memory for -the network drivers - -0:52:46.499,0:52:48.299 -and - -0:52:48.299,0:52:52.169 -we try very hard to make sure that we're not -going to run out of memory but - -0:52:52.169,0:52:54.869 -if packets come fast enough and we can't deal -with them - -0:52:54.869,0:52:57.940 -as quickly as they are arriving then over a short period -of time - -0:52:57.940,0:53:03.489 -we get to the point where we simply have to start -dropping packets - -0:53:03.489,0:53:07.649 -okay this is a part of the kernel that you do not wish to -write code for - -0:53:07.649,0:53:10.919 -because it is extremely difficult to -debug - -0:53:10.919,0:53:12.759 -you get these bugs where - -0:53:12.759,0:53:18.779 -the only time it happens is on the third Tuesday -when there's a full moon - -0:53:18.779,0:53:19.300 -and - -0:53:19.300,0:53:24.199 -we have a disk interrupt followed by a -terminal character coming in - -0:53:24.199,0:53:28.289 -and a network packet arriving of size fifteen -twenty two - -0:53:28.289,0:53:30.109 -and when all those things happen - -0:53:30.109,0:53:32.719 -the system panics - -0:53:32.719,0:53:37.380 -and of course it panics -because you're following some bad pointer - -0:53:37.380,0:53:40.969 -something that should have been there -but was freed some time in the distant past - -0:53:40.969,0:53:42.930 -we are not sure when - -0:53:42.930,0:53:44.049 -and - -0:53:44.049,0:53:47.400 -trying to debug things like that is 
extremely -difficult - -0:53:47.400,0:53:48.509 -and you can - -0:53:48.509,0:53:52.120 -think well I think I found the problem but -it's not reproducible - -0:53:52.120,0:53:55.530 -you know you have to wait for the next third -Tuesday with a full moon and blah blah blah - -0:53:55.530,0:53:56.950 -to happen - -0:53:56.950,0:53:57.469 -and - -0:53:57.469,0:54:01.449 -you know so you sort of statistically -guess that you fixed it you know I was getting - -0:54:01.449,0:54:03.510 -this bug once every three days - -0:54:03.510,0:54:06.099 -and now it's gone for two weeks without happening - -0:54:06.099,0:54:07.239 -did you fix it? - -0:54:07.239,0:54:08.969 -or have you just been lucky - -0:54:08.969,0:54:10.459 -and it's - -0:54:10.459,0:54:14.349 -that coupled with the fact that you're -dealing with hardware - -0:54:14.349,0:54:18.049 -and hardware rarely works the way it's documented -to work - -0:54:18.049,0:54:21.770 -and so you know you're doing everything that -it says you're supposed to do - -0:54:21.770,0:54:26.260 -it still doesn't work because you didn't set -the fiddle bit over in that other place over - -0:54:26.260,0:54:26.660 -there - -0:54:26.660,0:54:30.479 -that's not documented anywhere but if it's -not set it doesn't work - -0:54:30.479,0:54:33.769 -occasionally - -0:54:33.769,0:54:36.110 -so this is another reason that you really want -to avoid - -0:54:36.110,0:54:40.459 -dealing with this part of the system if -you can possibly help it - -0:54:40.459,0:54:44.369 -okay but let's go through and look at some -of the properties here starting up at - -0:54:44.369,0:54:45.789 -the user process - -0:54:45.789,0:54:47.980 -we're running with - -0:54:47.980,0:54:51.449 -preemptive scheduling - -0:54:51.449,0:54:53.409 -now there's several caveats here - -0:54:53.409,0:54:55.239 -preemptive scheduling is the default - -0:54:55.239,0:54:56.970 -so-called timeshare scheduler - -0:54:56.970,0:55:01.360 -that is what you normally use there are other 
-schedulers like the real-time scheduler - -0:55:01.360,0:55:02.869 -where what I'm saying isn't true - -0:55:02.869,0:55:05.709 -we'll talk about some of the schedulers -later - -0:55:05.709,0:55:09.930 -but the usual scheduler that you're running -on under UNIX is a timeshare scheduler - -0:55:09.930,0:55:13.229 -and under the timeshare scheduler user applications - -0:55:13.229,0:55:15.159 -run with preemptive scheduling - -0:55:15.159,0:55:17.449 -and preemptive scheduling means that - -0:55:17.449,0:55:20.019 -you run at the whim of the system - -0:55:20.019,0:55:21.420 -if it wants you to run - -0:55:21.420,0:55:22.140 -you run - -0:55:22.140,0:55:25.490 -once you start running you have no guarantee -of how long you're going to run - -0:55:25.490,0:55:29.370 -it might let you run for three instructions -and then decide it doesn't like you any more - -0:55:29.370,0:55:31.150 -it wants to run something else - -0:55:31.150,0:55:35.920 -or you might get to run for several seconds -in a row with no intervening - -0:55:35.920,0:55:37.469 -things interrupting you - -0:55:37.469,0:55:39.719 -you just don't know - -0:55:39.719,0:55:40.969 -and - -0:55:40.969,0:55:42.839 -really all you know is - -0:55:42.839,0:55:43.569 -that - -0:55:43.569,0:55:48.239 -they claim that they're using statistics -and that the statistics are fair - -0:55:48.239,0:55:55.059 -and so on average you're going to get a reasonable -amount of time but that's - -0:55:55.059,0:55:57.129 -up to the system you don't control that - -0:55:57.129,0:55:58.439 -the real point here - -0:55:58.439,0:56:01.940 -is that you don't have any way of creating -a critical section - -0:56:01.940,0:56:04.950 -you can't say okay I don't want to be interrupted - -0:56:04.950,0:56:07.429 -during this particular sequence of things - -0:56:07.429,0:56:09.809 -so you have to program - -0:56:09.809,0:56:13.469 -assuming that you may be interrupted at any -point - -0:56:13.469,0:56:14.979 -okay 
 - -0:56:14.979,0:56:18.909 -the next thing is that when you're running -in a user process - -0:56:18.909,0:56:20.719 -you are running - -0:56:20.719,0:56:24.150 -with the processor in what's called unprivileged -mode - -0:56:24.150,0:56:28.109 -one of the requirements for running any kind -of a UNIX system - -0:56:28.109,0:56:31.759 -is that you have to have a processor that -supports privileged and unprivileged - -0:56:31.759,0:56:33.709 -two different modes of operation - -0:56:33.709,0:56:37.049 -in privileged mode which is what the kernel -runs in - -0:56:37.049,0:56:38.950 -the entire repertoire - -0:56:38.950,0:56:40.869 -of the hardware is available - -0:56:40.869,0:56:45.339 -by this I mean you can set all the registers -you can fiddle with the memory management - -0:56:45.339,0:56:47.460 -unit you can initiate I/O - -0:56:47.460,0:56:50.519 -you can access any memory anywhere - -0:56:50.519,0:56:51.919 -etc - -0:56:51.919,0:56:56.540 -when you're running in unprivileged -mode which is what user processes run in - -0:56:56.540,0:57:00.709 -there's a large subset of the instructions which -you cannot execute - -0:57:00.709,0:57:03.480 -you cannot initiate I/O on - -0:57:03.480,0:57:04.209 -devices - -0:57:04.209,0:57:06.770 -you cannot change the memory mapping - -0:57:06.770,0:57:10.209 -you cannot access memory that's not part of -your address space - -0:57:10.209,0:57:13.299 -you cannot execute certain instructions -like halt - -0:57:13.299,0:57:15.589 -and - -0:57:15.589,0:57:19.039 -so in general you are protected - -0:57:19.039,0:57:21.789 -from manipulating anything that's outside of your -address space - -0:57:21.789,0:57:23.759 -this of course is desirable because - -0:57:23.759,0:57:27.059 -when you're running in this unprivileged -mode - -0:57:27.059,0:57:28.300 -you're protected - -0:57:28.300,0:57:31.910 -from other processes manipulating you -and they're protected from you manipulating - -0:57:31.910,0:57:33.079 -them - 
-0:57:33.079,0:57:36.430 -for those of you that have had the misfortune -to have to use - -0:57:36.430,0:57:39.339 -early versions of Windows up to about ninety -eight - -0:57:39.339,0:57:42.470 -they always ran with the processor -running in privileged mode - -0:57:42.470,0:57:44.009 -even in applications - -0:57:44.009,0:57:46.459 -and so either maliciously or accidentally - -0:57:46.459,0:57:50.000 -you could stomp on other people's address space -or you could stomp on the kernel - -0:57:50.000,0:57:53.020 -and a lot of the blue screen of death was -people just - -0:57:53.020,0:57:56.319 -following wild pointers and trashing different -parts of the system - -0:57:56.319,0:57:58.819 -taking everything down - -0:57:58.819,0:58:00.020 -it also makes it - -0:58:00.020,0:58:02.320 -far easier to - -0:58:02.320,0:58:05.459 -implement things like viruses and worms and -other things because - -0:58:05.459,0:58:09.619 -a user application can rewrite the boot -block on the disk it can just reach down - -0:58:09.619,0:58:13.109 -and manipulate the registers that allow it -to do whatever it wants - -0:58:13.109,0:58:16.730 -whereas when you're running in unprivileged -mode you can't do those kinds - -0:58:16.730,0:58:20.179 -of things - -0:58:20.179,0:58:24.119 -so modern versions of Windows anything from about -2000 on - -0:58:24.119,0:58:26.630 -now run with privileged and unprivileged mode - -0:58:26.630,0:58:28.649 -but UNIX has always required that - -0:58:28.649,0:58:30.219 -and so when you're running in a - -0:58:30.219,0:58:31.319 -user process - -0:58:31.319,0:58:33.389 -you cannot block I mean - -0:58:33.389,0:58:37.969 -you cannot execute the instructions which -cause a context switch to occur - -0:58:37.969,0:58:40.349 -you can't pick what's going to run next - -0:58:40.349,0:58:43.140 -you can't make that thing run next all you can -do - -0:58:43.140,0:58:45.189 -is go to the operating system and say - -0:58:45.189,0:58:49.269 -hey I've got 
nothing to do pick somebody else -to run - -0:58:49.269,0:58:53.449 -and the operating system is the thing that can -then execute the instructions which cause - -0:58:53.449,0:58:57.609 -a different process to be loaded - -0:58:57.609,0:58:59.049 -and run - -0:58:59.049,0:59:03.400 -all right finally while you're in a user application you're -running on a user stack - -0:59:03.400,0:59:06.410 -that's part of the user's address space - -0:59:06.410,0:59:07.889 -so - -0:59:07.889,0:59:10.819 -part of creating a process gives you a runtime -stack - -0:59:10.819,0:59:14.369 -as part of a virtual address space and so it -can be - -0:59:14.369,0:59:18.199 -more or less up to the limits of the hardware -as big as you want it to be - -0:59:18.199,0:59:19.949 -so if you are running on a 32-bit processor - -0:59:19.949,0:59:22.819 -your stack can get to 2 gigabytes - -0:59:22.819,0:59:23.319 -and - -0:59:23.319,0:59:26.839 -what this means is that anytime you -allocate local variables - -0:59:26.839,0:59:28.529 -you don't have to worry about oh - -0:59:28.529,0:59:30.609 -is that gonna overrun my stack? - -0:59:30.609,0:59:31.610 -so if you need - -0:59:31.610,0:59:35.519 -a hundred thousand double precision floating -point numbers - -0:59:35.519,0:59:37.189 -you can just as a local variable allocate - -0:59:37.189,0:59:40.269 -an array of size a hundred thousand of type -double - -0:59:40.269,0:59:44.029 -and it just decrements your stack pointer by -eight hundred thousand bytes - -0:59:44.029,0:59:45.009 -and away you go - -0:59:45.009,0:59:47.299 -it's just virtual address space - -0:59:47.299,0:59:49.020 -as you'll see when we get into the kernel - -0:59:49.020,0:59:50.210 -that ceases to be the case |
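The local-array example from the lecture can be written out directly. This is a minimal sketch with a hypothetical function name, `sum_on_stack`, and it assumes an 8-byte `double` and a process stack limit comfortably above the roughly 800,000 bytes the array needs (common defaults are several megabytes, but that is a system setting, not a guarantee).

```c
#include <stddef.h>

#define N 100000   /* a hundred thousand doubles, as in the lecture */

/* Hypothetical helper: the array is an ordinary local variable, so
 * making room for it is just a matter of moving the stack pointer
 * down by N * sizeof(double) bytes of virtual address space. */
double sum_on_stack(void)
{
    double values[N];        /* lives on the user stack */
    double total = 0.0;

    for (size_t i = 0; i < N; i++)
        values[i] = (double)i;
    for (size_t i = 0; i < N; i++)
        total += values[i];
    return total;            /* 0 + 1 + ... + (N - 1) */
}
```

Pages of the stack are only backed by real memory as they are touched, which is why a user process can be this casual about large locals.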