Diffstat (limited to 'en_US.ISO8859-1/captions/2009/asiabsdcon/rao-kernellocking-2.sbv')
1 files changed, 150 insertions, 155 deletions
diff --git a/en_US.ISO8859-1/captions/2009/asiabsdcon/rao-kernellocking-2.sbv b/en_US.ISO8859-1/captions/2009/asiabsdcon/rao-kernellocking-2.sbv
index 16f08a914a..bf785730a9 100644
@@ -1,8 +1,8 @@
-we are going to look mainly in this second part
+we are going to look, mainly in this second part,
at how to
@@ -15,32 +15,32 @@ locking problems
that categorize in the kernel.
-Here there are described two kinds of problems
-you can get with locks, that are pretty much common
+Here, there are described two kinds of problems
+you can get with locks, that are pretty much common.
-The first one is called Lock Order Reversal
+The first one is called Lock Order Reversal (LOR).
-when you have for example a thread A
+When you have for example a thread A,
-a lock code, for example, L1
+a lock code, for example L1
and another thread B
-which owns another lock, L2
+which owns the lock, L2
-Then thread A tries to
+Then thread A tries to..
-Right, it's wrong.
+Right.. it's wrong.
The slide is wrong.
@@ -96,7 +96,7 @@ knows that
locks should maintain
-an ordering regard of each other.
+an ordering in regard of each other.
That's not very simple when
@@ -110,25 +110,26 @@ that there are 3 kinds of classes of locks
is going to count because you can
-never mix two different kinds of locks
+never mix two different kinds of locks.
-a spin lock
-and a mutex.
+and a mutex
-You can mix in this way.
+can be mixed in this way.
+You can have the mutex first and the spinlock later,
+while the opposite is not actually true.
-So, you will see that this kind
+So, you will see that these kind
of deadlocks are possible
@@ -141,26 +142,26 @@ like for example 2 mutex or 2 spin mutex,
-Even if it's not very well documented,
+even if it's not very well documented,
-for example, spin locks,
+for example spinlocks
-in previous deep, as a way to
+in FreeBSD, have a way to
identify such kind of deadlocks.
-And it's pretty much implemented...
+And it's pretty much implemented.
-a very much in it would
It's a feature enabled in the code.
@@ -178,7 +179,7 @@ if it exceeds
an exaggerated result,
-it means that they are probable
+it means that they are probably
under a deadlock and the system panics.
@@ -224,7 +225,7 @@ you can end up having some threads
sleeping on this wait channel
-and nobody is going to wake up them again.
+and nobody is going to wake them up again.
This is usually called missed wakeup
@@ -240,10 +241,10 @@ The problem is that
it's very difficult to differentiate
-between missed wakeup and,
+between missed wakeup and
@@ -256,7 +257,7 @@ that is not likely to be awaken.
So these kind of deadlocks are
-very difficult to be discovered
+very very difficult to be discovered
and will require some bit of
@@ -278,7 +279,7 @@ kernel systems
and some things integrated into the debugger.
@@ -292,7 +293,7 @@ with kernel problems.
The first one (and the most important)
-is called witness.
+is called WITNESS.
It was introduced
@@ -301,7 +302,7 @@ It was introduced
in the context of SMPng
-and has been written in the recent past,
+and has been rewritten in the recent past,
mainly by a contribution of
@@ -313,7 +314,7 @@ Isilon systems.
They contributed back then
-to the writing of witness.
+to the writing of WITNESS.
This subsystem is very important
@@ -322,7 +323,7 @@ This subsystem is very important
because it tracks down exactly every order
-of the locks
+of the locks.
So that, if there is an ordering violation like a LOR,
@@ -356,7 +357,7 @@ Doing that,
we can identify
@@ -368,7 +369,7 @@ on the
-We could say that witness is pretty big,
+We could say that WITNESS is pretty big,
so activating it
@@ -381,10 +382,10 @@ It's mainly used when you are going to
develop a new feature in the kernel
-and you are going to test it heavily,
+and you are going to test it heavily.
-in particular if it has
+In particular if it has
@@ -393,7 +394,7 @@ some
relation to locking.
We could also tell that with the new code
@@ -403,10 +404,10 @@ provided by Isilon and Nokia,
-offered by witness is greatly reduced to about
+offered by WITNESS is greatly reduced to about
the 10th part of
@@ -415,7 +416,7 @@ the 10th part of
what we had before.
-Witness is very good at tracking LOR,
+WITNESS is very good at tracking LOR,
@@ -448,7 +449,7 @@ and
-and basically, it's in the 8th release,
+it's in the 8th release,
we have new features
@@ -467,18 +468,18 @@ and their orderings
-and shows some graphs of the relations.
-Even from the user space,
+it shows some graphs of the relations
+even from the user space.
-you don't have to go into the kernel
-degubber to look at it's output.
+You don't have to go into the kernel
+debugger to look at it's output.
Well, I see that sometimes when
@@ -507,36 +508,36 @@ when a deadlock
is in the kernel.
if you want to find a deadlock
-that's happening in the kernel
+that's happening in the kernel,
-your first line of analysis start from the DDB
+your first line of analysis starts from the DDB
instead of a post-mortem analysis,
-which is even more important,
+which is even more important.
-but using DDB you will get more
+But, using DDB you will get more
processes and better information.
The most important unit in order to find the deadlock
-are the LORs reported by witness in order
+are the LORs reported by WITNESS in order
to see if there is something strange
@@ -547,10 +548,10 @@ You want to know the state of all the threads
that are running on the system that is deadlocking.
-You can see that you're deadlocking if you see that
+You can see that you're deadlocking, if you see that
on the runqueue
@@ -570,21 +571,21 @@ and you have all the threads sleeping
in their own containers.
-You need to know which are exactly locks
+You need to know which are the exact locks
that are acquired
in the system
-and that's something that witness provides,
+and that's something that WITNESS provides
-and the very important things is
+and the very important thing is
to know why the threads are stopping.
-So one on the most important things is
+So one of the most important things is
retrieving what the threads were doing
@@ -594,13 +595,13 @@ when
they were put asleep.
The backtraces of all the threads involved
-are so printed out in order to identify deadlocks.
+are printed out in order to identify deadlocks.
In the case that
@@ -609,13 +610,13 @@ In the case that
buffered cache and VFS are
-probably parts of the deadlocking
+probably parts of the deadlocking,
you should also print out
-the info about vnodes
+the information about vnodes
and what we're interested in is which vnodes are called,
@@ -630,10 +631,10 @@ are actually referenced
-which way they were called.
+in which way they were called.
this is an example
@@ -648,7 +649,7 @@ thread states
in the case of a deadlock.
-This is an example of a real deadlock
+This is an real example of a deadlock
but you can see
@@ -663,10 +664,10 @@ this is not totally complete.
But you can see that all the threads are sleeping.
-And this one is the message
+This one is the message
used by the wait channel
@@ -678,7 +679,7 @@ on which they're sleeping on
or used by
-the container like the turnsile or the sleep queue.
+the container like the turnstile or the sleepqueue.
If I recall correctly, it's a forced amount
@@ -688,7 +689,7 @@ that does deadlocking at some point.
I'm not really sure
-because I have to take a look at it.
+because I should have looked at it.
You can see that the revelant command here
@@ -698,7 +699,7 @@ is -ps
that DDB supports.
Another important thing
@@ -716,43 +717,39 @@ As you can see there,
-is because you can add some data structures corrupted
+its because you can add some data structures corrupted
in the per-CPU datas.
-That's a very common situation when you get deadlocks,
+That's a very common situation where you can get deadlocks,
-because, for example,
+because, for example,
leaving a corrupted LPD will lead
-I loved you too much review shellacking double
-falls and things like that about that
to you having a bigger massive breakage like
-double-faults. In general. it's a good idea to
-look at all the CPUs involved in the system.
+double-faults and things like that. Usually it's always a
+good idea to look at all the CPUs involved in the system.
-is "show allpcpu"
+is ""-show allpcpu".
-this one is a witness specific command -show alllocks
+is a WITNESS specific command "-show alllocks"
and it's going to show all the locks,
@@ -771,7 +768,7 @@ a mount,
and the thread is this one,
-what lock is holding,
+what the lock is holding,
that's the address
@@ -783,7 +780,7 @@ and where it was acquired.
It gives you lines and file.
@@ -792,7 +789,7 @@ Actually,
that's just possible
-with witness, because otherwise,
+with WITNESS, because otherwise,
trying to keep the oldest information
@@ -806,13 +803,13 @@ Then, the most important thing is
the backtrace for any thread.
It's going to show the backtrace
-for old threads.
+for all the threads.
@@ -827,7 +824,7 @@ the thread with these addresses TID and PID
basically got sleeping
-on a pnode.
+on a vnode.
You will see that the backend in this case is FFF
@@ -839,7 +836,7 @@ and
that's the context switching function,
those are the sleepqueues of the containter
@@ -867,10 +864,11 @@ you will have a lot of these kinds of traces,
but they are very important
-so as developers to understand what is going on.
+for the developers in order to understand
+what is going on.
-This ones are the locked vnodes
+These ones are the locked vnodes
that are also very important when
@@ -891,10 +889,10 @@ they are specific
to some handling of the vnodes such as recycling,
-and completely freeing
+and completely freeing.
-that's the mount point
+That's the mount point
where the vnodes
@@ -927,7 +925,7 @@ For example, it tells you that
-the lock is in exclusive mode
+is in exclusive mode
@@ -942,13 +940,13 @@ on its queues.
-the node number
+the node number.
-there are also under information you could receive
+There is also other information you could receive
from the DDB linked to, for example,
@@ -961,18 +959,18 @@ like sleep chains,
-wait channel if you have the address
+wait channel, if you have the address
-and, for example,
+and for example,
you can also print the wall table of
-the lock relations from witness
+the lock relations from WITNESS
but it's mostly never useful
-because you already know that.
+because you should already know that.
So you will just need to know which is the one
@@ -981,14 +979,12 @@ So you will just need to know which is the one
-can give trouble.
+can give the trouble.
-if you're going to say I mean some problem
-is a problem
So if you are going to submit some problems
@@ -1020,8 +1016,8 @@ I think that
it is a very good thing to talk about it.
-Along with the witness, we have another
-mechanism that could help us,
+Along with the WITNESS, we have another
+important mechanism that could help us with deadlocks
and it's called KTR.
@@ -1030,17 +1026,17 @@ and it's called KTR.
-basically a logger of events,
+basically a logger, a kernel logger, of events.
as you can, for example,
-handle different classes of events,
+handle different classes of events.
In FreeBSD we have
@@ -1071,7 +1067,7 @@ enable several classes,
-the time class of the KTR
+the ten classes of the KTR
and then you are just interested in three of them
@@ -1114,7 +1110,7 @@ and not the information,
it doesn't make copies, you need to just pass
@@ -1139,10 +1135,10 @@ you can also look at it from the user space
through the ktrdump interface.
Why is that important for locking?
@@ -1163,8 +1159,8 @@ on which CPU branches,
and the order it happened in.
-and this is very important when you're
-going to track down for example races,
+This is very important when you're
+going to track down for example traces,
when you are not sure about the order of operations and
@@ -1185,14 +1181,14 @@ a typical trace of KTR,
where you have
-the CPU where the event happened, the index,
+the CPU where the event happened, thats the index,
that's a timestamp,
I think it's retrieved directly from the TSC,
-but i'm not actually sure.
+but i'm actually not sure.
In this case,
@@ -1202,7 +1198,7 @@ i was tracking down the scheduler class,
so I was interested mainly in scheduler
-workloads and i could see
+workloads and I could see
@@ -1238,10 +1234,10 @@ this one
and other things.
-you can enable it
+You can enable
the option KTR, but you must handle it carefully.
@@ -1266,7 +1262,7 @@ enough entries
to have a reliable tracking.
For example, if you are going to track a lot of events,
@@ -1293,23 +1289,23 @@ let you compile some classes,
or mask them,
-or mask a CPU,
+or even mask the CPU.
-if you have a big SMP environment,
+If you have a big SMP environment,
so that you can selectively enable some of them.
For example, this is very good for
-tracking down races in the sleeping queue.
+tracking down traces in the sleeping queue.
You can find referrals here.
@@ -1332,7 +1328,7 @@ I think that actually our locking system
is pretty complete,
-but it's also confusing for newcomers,
+but it's also pretty confusing for newcomers,
it's not widely documented.
@@ -1361,14 +1357,14 @@ who just need to do simple tasks.
For example, I saw a lot of guys coming from Linux World
-who wanted to actually use spinlocks for time
+who wanted to actually use spinlocks for time.
-it's obvious they are missing something from our
+It's obvious they are missing something from our
-From, uh, ...
just a technical point of view,
@@ -1384,7 +1380,7 @@ we have lockmgr and sxlog
which are both read/write locks and
-are both serverd by sleep queues.
+are both servered by sleep queues.
They have some differences, obviously,
@@ -1394,21 +1390,20 @@ but, mainly,
we could manage the missing bits and
-just one of the 2 interfaces.
+just use one of the two interfaces.
-on the scene where he hasn't told you before
In the same way, as i told you before,
-the sleeping point, true end sleep,
+the sleeping point, true-end sleep,
read/write sleep and sxsleep
-should probably be merged with cond vars
+should probably be managed with cond vars
and superdoff our kernel
@@ -1418,10 +1413,10 @@ and we should probably drop sema,
because it is obsolete, and can be
-replaced by condova and mutex.
+replaced by condvars and mutex.
@@ -1500,7 +1495,7 @@ Instead, the other one
uses spinning on a local variable
-which is not shared by the threads
+which is not shared by the threads.
and the time spent
@@ -1512,10 +1507,10 @@ on that
local variable increases
with the passing of time.
@@ -1527,10 +1522,10 @@ Another interesting thing would be benchmarking
different wake-up algorithms for blocking primitives.
-We have an algorithms that has proven to be
+We have an algorithm that has proven to be
@@ -1549,10 +1544,10 @@ a higher overhead but could give time improvements
on a big SMP environment.
-another thing that would be very interesting
+Another thing that would be very interesting
to fix is the priority inversion problem
@@ -1584,7 +1579,7 @@ is often just a single atomic operation,
-if it fails
+if it fails,
it falls down and the art pattern tries to do
@@ -1595,7 +1590,7 @@ in this case the owner of record technic
was going to make the fastpack too simple
it just considers
@@ -1611,7 +1606,7 @@ And it practically lands the priority to this
owner of record which does it's right log.
Another important thing obviously is improving locking
@@ -1647,4 +1642,4 @@ like the one we saw before with the malloc command,
that needs to sleep.