Diffstat (limited to 'documentation/content/en/articles/linux-emulation/_index.adoc')
-rw-r--r-- documentation/content/en/articles/linux-emulation/_index.adoc | 55
1 files changed, 32 insertions, 23 deletions
diff --git a/documentation/content/en/articles/linux-emulation/_index.adoc b/documentation/content/en/articles/linux-emulation/_index.adoc
index e9d86ae201..a252df5f54 100644
--- a/documentation/content/en/articles/linux-emulation/_index.adoc
+++ b/documentation/content/en/articles/linux-emulation/_index.adoc
@@ -17,17 +17,26 @@ tags: ["Emulation", "Linuxulator", "kernel", "FreeBSD"]
:sectnumlevels: 6
:source-highlighter: rouge
:experimental:
+:images-path: articles/linux-emulation/
-ifeval::["{backend}" == "html5"]
+ifdef::env-beastie[]
+ifdef::backend-html5[]
include::shared/authors.adoc[]
+include::shared/mirrors.adoc[]
+include::shared/releases.adoc[]
+include::shared/attributes/attributes-{{% lang %}}.adoc[]
+include::shared/{{% lang %}}/teams.adoc[]
+include::shared/{{% lang %}}/mailing-lists.adoc[]
+include::shared/{{% lang %}}/urls.adoc[]
+:imagesdir: ../../../images/{images-path}
+endif::[]
+ifdef::backend-pdf,backend-epub3[]
+include::../../../../shared/asciidoctor.adoc[]
endif::[]
-
-ifeval::["{backend}" == "pdf"]
-include::../../../../shared/authors.adoc[]
endif::[]
-ifeval::["{backend}" == "epub3"]
-include::../../../../shared/authors.adoc[]
+ifndef::env-beastie[]
+include::../../../../../shared/asciidoctor.adoc[]
endif::[]
[.abstract-title]
@@ -52,7 +61,7 @@ toc::[]
In the last few years the open source UNIX(R) based operating systems started to be widely deployed on server and client machines.
Among these operating systems I would like to point out two: FreeBSD, for its BSD heritage, time proven code base and many interesting features and Linux(R) for its wide user base, enthusiastic open developer community and support from large companies.
-FreeBSD tends to be used on server class machines serving heavy duty networking tasks with less usage on desktop class machines for ordinary users.
+FreeBSD tends to be used on server class machines serving heavy duty networking tasks with less usage on desktop class machines for ordinary users.
While Linux(R) has the same usage on servers, but it is used much more by home based users.
This leads to a situation where there are many binary only programs available for Linux(R) that lack support for FreeBSD.
@@ -100,7 +109,7 @@ Common UNIX(R) API defines a syscall as a way to issue commands from a user spac
The most common implementation is either by using an interrupt or specialized instruction (think of `SYSENTER`/`SYSCALL` instructions for ia32).
Syscalls are defined by a number.
For example in FreeBSD, the syscall number 85 is the man:swapon[2] syscall and the syscall number 132 is man:mkfifo[2].
-Some syscalls need parameters, which are passed from the user-space to the kernel-space in various ways (implementation dependant).
+Some syscalls need parameters, which are passed from the user-space to the kernel-space in various ways (implementation dependent).
Syscalls are synchronous.
Another possible way to communicate is by using a _trap_.
@@ -210,7 +219,7 @@ The parameters to the actual syscall handler are passed in the form of `struct t
Handling of traps in FreeBSD is similar to the handling of syscalls.
Whenever a trap occurs, an assembler handler is called.
It is chosen between alltraps, alltraps with regs pushed or calltrap depending on the type of the trap.
-This handler prepares arguments for a call to a C function `trap()` (defined in [.filename]#sys/i386/i386/trap.c#), which then processes the occurred trap.
+This handler prepares arguments for a call to a C function `trap()` (defined in [.filename]#sys/i386/i386/trap.c#), which then processes the occurred trap.
After the processing it might send a signal to the process and/or exit to userland using `userret()`.
[[freebsd-exits]]
@@ -501,14 +510,14 @@ numbers are not casual
Spin locks let waiters to spin until they cannot acquire the lock.
An important matter do deal with is when a thread contests on a spin lock if it is not descheduled.
-Since the FreeBSD kernel is preemptive, this exposes spin lock at the risk of deadlocks that can be solved just disabling interrupts while they are acquired.
+Since the FreeBSD kernel is preemptive, this exposes spin lock at the risk of deadlocks that can be solved just disabling interrupts while they are acquired.
For this and other reasons (like lack of priority propagation support, poorness in load balancing schemes between CPUs, etc.), spin locks are intended to protect very small paths of code, or ideally not to be used at all if not explicitly requested (explained later).
[[freebsd-blocking]]
===== Blocking
Block locks let waiters to be descheduled and blocked until the lock owner does not drop it and wakes up one or more contenders.
-In order to avoid starvation issues, blocking locks do priority propagation from the waiters to the owner.
+To avoid starvation issues, blocking locks do priority propagation from the waiters to the owner.
Block locks must be implemented through the turnstile interface and are intended to be the most used kind of locks in the kernel, if no particular conditions are met.
[[freebsd-sleeping]]
@@ -539,7 +548,7 @@ Among these locks only mutexes, sxlocks, rwlocks and lockmgrs are intended to ha
[[freebsd-scheduling]]
===== Scheduling barriers
-Scheduling barriers are intended to be used in order to drive scheduling of threading.
+Scheduling barriers are intended to be used to drive scheduling of threading.
They consist mainly of three different stubs:
* critical sections (and preemption)
@@ -552,11 +561,11 @@ Generally, these should be used only in a particular context and even if they ca
===== Critical sections
The FreeBSD kernel has been made preemptive basically to deal with interrupt threads.
-In fact, in order to avoid high interrupt latency, time-sharing priority threads can be preempted by interrupt threads (in this way, they do not need to wait to be scheduled as the normal path previews).
+In fact, to avoid high interrupt latency, time-sharing priority threads can be preempted by interrupt threads (in this way, they do not need to wait to be scheduled as the normal path previews).
Preemption, however, introduces new racing points that need to be handled, as well.
-Often, in order to deal with preemption, the simplest thing to do is to completely disable it.
+Often, to deal with preemption, the simplest thing to do is to completely disable it.
A critical section defines a piece of code (borderlined by the pair of functions man:critical_enter[9] and man:critical_exit[9], where preemption is guaranteed to not happen (until the protected code is fully executed).
-This can often replace a lock effectively but should be used carefully in order to not lose the whole advantage that preemption brings.
+This can often replace a lock effectively but should be used carefully to not lose the whole advantage that preemption brings.
[[freebsd-schedpin]]
===== sched_pin/sched_unpin
@@ -569,7 +578,7 @@ The latter condition will determine a critical section as a too strong condition
[[freebsd-schedbind]]
===== sched_bind/sched_unbind
-`sched_bind` is an API used in order to bind a thread to a particular CPU for all the time it executes the code, until a `sched_unbind` function call does not unbind it.
+`sched_bind` is an API used to bind a thread to a particular CPU for all the time it executes the code, until a `sched_unbind` function call does not unbind it.
This feature has a key role in situations where you cannot trust the current state of CPUs (for example, at very early stages of boot), as you want to avoid your thread to migrate on inactive CPUs.
Since `sched_bind` and `sched_unbind` manipulate internal scheduler structures, they need to be enclosed in `sched_lock` acquisition/releasing when used.
@@ -732,7 +741,7 @@ On the other hand `close` is just an alias for real FreeBSD man:close[2] so it h
The Linux(R) emulation layer is not complete, as some syscalls are not implemented properly and some are not implemented at all.
The emulation layer employs a facility to mark unimplemented syscalls with the `DUMMY` macro.
These dummy definitions reside in [.filename]#linux_dummy.c# in a form of `DUMMY(syscall);`, which is then translated to various syscall auxiliary files and the implementation consists of printing a message saying that this syscall is not implemented.
-The `UNIMPL` prototype is not used because we want to be able to identify the name of the syscall that was called in order to know what syscalls are more important to implement.
+The `UNIMPL` prototype is not used because we want to be able to identify the name of the syscall that was called to know what syscalls are more important to implement.
[[signal-handling]]
=== Signal handling
@@ -766,7 +775,7 @@ It also unmasks the signal in process signal mask.
[[ptrace]]
=== Ptrace
-Many UNIX(R) derivates implement the man:ptrace[2] syscall in order to allow various tracking and debugging features.
+Many UNIX(R) derivates implement the man:ptrace[2] syscall to allow various tracking and debugging features.
This facility enables the tracing process to obtain various information about the traced process, like register dumps, any memory from the process address space, etc. and also to trace the process like in stepping an instruction or between system entries (syscalls and traps).
man:ptrace[2] also lets you set various information in the traced process (registers etc.).
man:ptrace[2] is a UNIX(R)-wide standard implemented in most UNIX(R)es around the world.
@@ -784,7 +793,7 @@ Also `PT_SYSCALL` is not implemented.
[[traps]]
=== Traps
-Whenever a Linux(R) process running in the emulation layer traps the trap itself is handled transparently with the only exception of the trap translation.
+Whenever a Linux(R) process running in the emulation layer traps the trap itself is handled transparently with the only exception of the trap translation.
Linux(R) and FreeBSD differs in opinion on what a trap is so this is dealt with here.
The code is actually very short:
@@ -842,7 +851,7 @@ Then we talk briefly about some syscalls.
One of the major areas of progress in development of Linux(R) 2.6 was threading.
Prior to 2.6, the Linux(R) threading support was implemented in the linuxthreads library.
The library was a partial implementation of POSIX(R) threading.
-The threading was implemented using separate processes for each thread using the `clone` syscall to let them share the address space (and other things).
+The threading was implemented using separate processes for each thread using the `clone` syscall to let them share the address space (and other things).
The main weaknesses of this approach was that every thread had a different PID, signal handling was broken (from the pthreads perspective), etc.
Also the performance was not very good (use of `SIGUSR` signals for threads synchronization, kernel resource consumption, etc.) so to overcome these problems a new threading system was developed and named NPTL.
@@ -945,7 +954,7 @@ More about locking later.
[[pid-mangling]]
==== PID mangling
-As there is a difference in view as what to the idea of a process ID and thread ID is between FreeBSD and Linux(R) we have to translate the view somehow.
+As there is a difference in view as what to the idea of a process ID and thread ID is between FreeBSD and Linux(R) we have to translate the view somehow.
We do it by PID mangling.
This means that we fake what a PID (=TGID) and TID (=PID) is between kernel and userland.
The rule of thumb is that in kernel (in Linuxulator) PID = PID and TGID = shared -> group pid and to userland we present `PID = shared -> group_pid` and `TID = proc -> p_pid`.
@@ -997,7 +1006,7 @@ Newer glibc in a case of 2.6 kernel uses `clone` to implement man:fork[2] and ma
==== Locking
The locking is implemented to be per-subsystem because we do not expect a lot of contention on these.
-There are two locks: `emul_lock` used to protect manipulating of `linux_emuldata` and `emul_shared_lock` used to manipulate `linux_emuldata_shared`.
+There are two locks: `emul_lock` used to protect manipulating of `linux_emuldata` and `emul_shared_lock` used to manipulate `linux_emuldata_shared`.
The `emul_lock` is a nonsleepable blocking mutex while `emul_shared_lock` is a sleepable blocking `sx_lock`.
Due to of the per-subsystem locking we can coalesce some locks and that is why the em find offers the non-locking access.
@@ -1383,7 +1392,7 @@ We are able to run the most used applications like package:www/linux-firefox[],
Some of the programs exhibit bad behavior under 2.6 emulation but this is currently under investigation and hopefully will be fixed soon.
The only big application that is known not to work is the Linux(R) Java(TM) Development Kit and this is because of the requirement of `epoll` facility which is not directly related to the Linux(R) kernel 2.6.
-We hope to enable 2.6.16 emulation by default some time after FreeBSD 7.0 is released at least to expose the 2.6 emulation parts for some wider testing.
+We hope to enable 2.6.16 emulation by default some time after FreeBSD 7.0 is released at least to expose the 2.6 emulation parts for some wider testing.
Once this is done we can switch to Fedora Core 6 linux_base, which is the ultimate plan.
[[future-work]]