| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Neither of these routines allocate stacks.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29227
|
|
|
|
|
|
|
| |
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29208
|
|
|
|
|
|
|
|
|
|
| |
POSIX states that new threads created via pthread_create() should
inherit the "floating point environment" from the creating thread.
Discussed with: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29204
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use update_pcb_bases() when updating FS or GS base addresses to
permit use of FSBASE and GSBASE in Linux processes. This also sets
PCB_FULL_IRET. linux32 was setting PCB_32BIT which should be a
no-op (exec sets it).
- Remove write-only variables to construct unused segment descriptors
for linux32.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29026
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the pcb is copied to the new thread during cpu_fork() and
cpu_copy_thread(), the kernel re-reads the current register values in
case they are stale. This is done by setting PCB_FULL_IRET in
pcb_flags.
This works fine for user threads, but the creation of kernel processes
and kernel threads do not follow the normal synchronization rules for
pcb_flags. Specifically, new kernel processes are always forked from
thread0, not from curthread, so adjusting pcb_flags via a simple
instruction without the LOCK prefix can race with thread0 running on
another CPU. Similarly, kthread_add() clones from the first thread in
the relevant kernel process, not from curthread. In practice, Netflix
encountered a panic where the pcb_flags in the first kthread of the
KTLS process were trashed due to update_pcb_bases() in
cpu_copy_thread() running from thread0 to create one of the other KTLS
threads racing with the first KTLS kthread calling fpu_kern_thread()
on another CPU. In the panicking case, the write to update pcb_flags
in fpu_kern_thread() was lost triggering an "Unregistered use of FPU
in kernel" panic when the first KTLS kthread later tried to use the
FPU.
Reported by: gallatin
Discussed with: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change converts most of the counters in the amd64 pmap from
global atomics to scalable counter(9) counters. Per discussion
with kib@, it also removes the handrolled per-CPU PCID save count
as it isn't considered generally useful.
The bulk of these counters remain guarded by PV_STATS, as it seems
unlikely that they will be useful outside of very specific debugging
scenarios. However, this change does add two new counters that
are available without PV_STATS. pt_page_count and pv_page_count
track the number of active physical-to-virtual list pages and page
table pages, respectively. These will be useful in evaluating
the memory footprint of pmap structures under various workloads,
which will help to guide future changes in this area.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28923
|
|
|
|
|
|
|
|
|
|
|
|
| |
Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so
put these in a more generic place as a step towards importing the other
sanitizers.
No functional change intended.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29103
|
|
|
|
|
|
|
|
|
|
|
|
| |
invltlb_invpcid_pti_handler() was requesting delayed TLB invalidation
even for processes that aren't using PTI. With an out-of-tree
change to avoid PTI for non-jailed root processes, this caused an
assertion failure in pmap_activate_sw_pcid_pti() when context-switching
between PTI and non-PTI processes.
Reviewed by: bdrewery kib tychon
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29094
|
|
|
|
|
|
|
|
|
|
| |
Otherwise during attach newbus prints "nexus0", which is not very
useful.
The generic nexus device is already quiet, as is nexus_acpi on arm64.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of these checks is to ensure that the address of the
next-level page table page is valid, since nothing is synchronizing with
a concurrent update of the large map and large map PTPs are freed to the
system. However, if PG_PS is set, there is no next level.
Reported by: rpokala
Reviewed by: kib
Tested by: rpokala
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28922
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases the DELAY implementation on amd64 can recurse on a spin
mutex in the i8254 early delay code. Detect when this is going to
happen and don't call delay in this case. It is safe to not delay here
with the only issue being KCSAN may not detect data races.
Reviewed by: kib
Tested by: arichardson
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D28895
|
|
|
|
|
|
|
|
|
| |
Add it to the x86 GENERIC and MINIMAL kernels
Sponsored by: Ampere Computing LLC
Submitted by: Klara Inc.
Reviewed by: rpokala
Differential Revision: https://reviews.freebsd.org/D28738
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tested with glibc test suite.
The C variant in libkern performs excessive branching to find the zero
byte instead of using the bsfq instruction. The same code patched to use
it is still slower than the routine implemented here as the compiler
keeps neglecting to perform certain optimizations (like using leaq).
On top of that the routine can be used as a starting point for copyinstr
which operates on words intead of bytes.
The previous attempt had an instance of swapped operands to andq when
dealing with fully aligned case, which had a side effect of breaking the
code for certain corner cases. Noted by jrtc27.
Sample results:
$(perl -e "print 'A' x 3"):
stock: 211198039
patched:338626619
asm: 465609618
$(perl -e "print 'A' x 100"):
stock: 83151997
patched: 98285919
asm: 120719888
Reviewed by: jhb, kib
Differential Revision: https://reviews.freebsd.org/D28779
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.
In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions. Abstracting this check reduces the diff relative
to FreeBSD. It is perhaps slightly more readable as well.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D28710
|
|
|
|
|
|
|
|
|
|
|
| |
linux_shared_page_init() creates an object and grabs and maps a single
page to back the VDSO. When destroying the VDSO object, we failed to
destroy the mapping and free KVA. Fix this.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28696
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow setting the bootmethod variable from the Xen PVH entry point, in
order to be able to correctly set the underlying firmware mode when
booted as a dom0.
Move the bootmethod variable to be defined in x86/cpu_machdep.c
instead so it can be shared by both i386 and amd64.
Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D28619
|
|
|
|
|
|
|
| |
No functional changes.
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28533
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit af366d353b84bdc4e730f0fc563853abc338271c.
Trips over '\xa4' byte and terminates early, as found in
lib/libc/gen/setdomainname_test:setdomainname_basic testcase
However, keep moving libkern/strlen.c out of conf/files.
Reported by: lwhsu
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The C variant in libkern performs excessive branching to find the
non-zero byte instead of using the bsfq instruction. The same code
patched to use it is still slower than the routine implemented here
as the compiler keeps neglecting to perform certain optimizations
(like using leaq).
On top of that the routine can is a starting point for copyinstr
which operates on words instead of bytes.
Tested with glibc test suite.
Sample results (calls/s):
Haswell:
$(perl -e "print 'A' x 3"):
stock: 211198039
patched:338626619
asm: 465609618
$(perl -e "print 'A' x 100"):
stock: 83151997
patched: 98285919
asm: 120719888
AMD EPYC 7R32:
$(perl -e "print 'A' x 3"):
stock: 282523617
asm: 491498172
$(perl -e "print 'A' x 100"):
stock: 114857172
asm: 112082057
|
|
|
|
|
|
|
| |
Reviewed by: hselasky, manu
Sponsored by: NVidia Networking/Mellanox Technologies
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28469
|
|
|
|
|
|
|
|
|
| |
Although if RJ-45 interface is not being used the miibus is not required
but miibus is a build time dependency.
Reviewed by: imp, manu, rajesh1.kumar@amd.com
Approved by: imp, manu, rajesh1.kumar@amd.com
Differential Revision: https://reviews.freebsd.org/D28465
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One common method of EOI'ing an interrupt at the IO-APIC level is to
switch the pin to edge triggering mode and then back into level mode.
That would cause the IRR bit to be cleared and thus further interrupts
to be injected. FreeBSD does indeed use that method if the IO-APIC EOI
register is not supported.
The bhyve IO-APIC emulation code didn't clear the IRR bit when doing
that switch, and was also missing acknowledging the IRR state when
trying to inject an interrupt in vioapic_send_intr.
Reviewed by: grehan
Differential revision: https://reviews.freebsd.org/D28238
|
|
|
|
|
|
|
|
|
|
| |
After modifying a redirection entry only try to inject an interrupt if
the pin is in level mode, pins in edge mode shouldn't take into
account the line assert status as they are triggered by edge changes,
not the line status itself.
Reviewed by: grehan
Differential revision: https://reviews.freebsd.org/D28237
|
|
|
|
|
|
|
|
|
|
|
| |
vioapic_send_intr does already check whether the pin is masked before
injecting the interrupt, there's no need to do it in vioapic_write
also.
No functional change intended.
Reviewed by: grehan
Differential revision: https://reviews.freebsd.org/D28236
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a tradeoff which saves jumps for smaller sizes while making
the 8-16 range slower (roughly in line with the other cases).
Tested with glibc test suite.
For example size 3 (most common with vfs namecache) (ops/s):
before: 407086026
after: 461391995
The regressed range of 8-16 (with 8 as example):
before: 540850489
after: 461671032
|
|
|
|
|
|
|
|
| |
All page zeroing is using temporal stores with rep movs*, the routine is
unused for several years.
Should a need arise for zeroing using non-temporal stores, a more
optimized variant can be implemented with a more descriptive name.
|
| |
|
|
|
|
| |
Unbreaks LINT kernels.
|
|
|
|
|
|
|
| |
The ptrace(2) Linux man page claims the syscall returns ESRCH,
if the tracee is not stopped; the native ptrace(2) returns EBUSY.
Sponsored by: The FreeBSD Foundation
|
|
|
|
| |
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on discussions on freebsd-arch@, enable KERN_TLS in
GENERIC on amd64, but leave it disabled via the
sysctl kern.ipc.tls.enable. Users wishing to enable
ktls must set kern.ipc.tls.enable=1
While here, fix wording in NOTES to mention that KERN_TLS
also does receive now.
Sponsored by: Netflix
Reviewed by: allanjude
Differential Revision: https://reviews.freebsd.org/D28163
|
|
|
|
|
|
|
|
|
|
|
|
| |
usbhid(4) is disabled by default to avoid conflicts with existing USB HID
drivers. To enable it place following lines to /boot/loader.conf:
hw.usb.usbhid.enable=1
usbhid_load="YES"
Suggested by: jhb
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D28124
|
|
|
|
|
|
|
|
| |
This is the superset of the nooptions found in the -DEBUG kernels.
Reviewed by: emaste, manu
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D28152
|
|
|
|
| |
While here even up whitespace.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, using GELI on a root filesystem will only use
accelerated software crypto drivers if they are available before the
root filesystem is mounted. While these modules can be loaded from
the loader, including them in GENERIC provides a better out-of-the-box
experience for users.
Both aesni(4) and armv8crypto(4) provide accelerated implementations
of the default cipher used by GELI (AES-XTS) in addition to other
ciphers.
Reviewed by: mhorne, allanjude, markj
Differential Revision: https://reviews.freebsd.org/D28100
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
semantic patch:
@@
expression rights, r;
@@
- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now the routine leaves the current CPU in the map, later tripping
on an assert when filling in the scoreboard: panic: IPI scoreboard is
zero, initiator 1 target 1
Instead pre-check if all CPUs are present in the map and remember that
outcome for later.
Fixes: 7eaea04a5bb1dc86 ("amd64: compare TLB shootdown target to all_cpus")
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28111
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On amd64, the pmap code passes all_cpus to
smp_targeted_tlb_shootdown() when unmapping from the
kernel pmap. This function has an optimized path to send IPIs
to all but itself, which it intends to do when the target
is all cpus. However, we need to compare the target cpu mask
with all_cpus, rather than using CPU_ISFULLSET(). Comparing with
CPU_ISFULLSET() will only work when we have MAXCPU cpus active in
the system, otherwise, we'll be sending repeated IPIs, rather than
a single IPI to all CPUs but ourself.
Fixing this should reduce the time spent in native_lapic_ipi_wait()
as we will be sending ipis in parallel, rather than one-by-one.
This is confirmed by dtrace.
Reviewed by: alc, jhb, kib, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D28102
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
table page.
Otherwise parallel pmap_allocpte_alloc() for nearby va might also fail
allocating page table page and free the page under us. The end result is
that we could dereference unmapped pte when doing cleanup after sleep.
Instead, on allocation failure, first free everything, only then we can
drop pmap mutex and sleep safely, right before returning to caller.
Split inner non-sleepable part of the pmap_allocpte_alloc() into a new
helper pmap_allocpte_nosleep().
Reviewed by: markj
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27956
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function performs actual allocation of pte, as opposed to
pmap_allocpte() that uses existing free pte if pt page is already
there. This also moves function out of namespace similar to a language
reserved.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27956
|
|
|
|
|
|
|
|
| |
Noted by: alc
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27956
|
|
|
|
|
|
|
|
|
| |
pmap_pdpe() might return NULL, check for it.
Reviewed by: markj
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27956
|
|
|
|
|
|
|
|
|
|
|
| |
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.
This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.
Sponsored by: Citrix Systems R&D
|
|
|
|
|
|
|
|
|
| |
When we already have the vm page in hand, use vm_page_domain() instead
of vm_phys_domain(). The former has a trivial constant-time
implementation whereas the latter iterates over the mem_affinity array.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D28005
|
|
|
|
|
| |
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D28060
|
|
|
|
|
|
|
| |
Suggested by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27986
|
|
|
|
|
|
|
| |
Requested and reviewed by: alc
Discussed with: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25815
|
|
|
|
|
|
|
|
|
| |
Remove wi(4). pccard is going away, and wi only supports PC Card
devices, though it has a minor amount of glue to also support
PCI cards. However, removing the one without removing the other
is hard, so the whole driver is being removed.
Relnotes: Yes
|