path: root/sys/amd64
Commit message | Author | Age | Files | Lines
* x86: Update some stale comments in cpu_fork() and cpu_copy_thread(). | John Baldwin | 2021-03-12 | 1 | -2/+4
  Neither of these routines allocates stacks.
  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D29227
* x86: Always use clean FPU and segment base state for new kthreads. | John Baldwin | 2021-03-12 | 1 | -11/+35
  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D29208
* x86: Copy the FPU/XSAVE state from the creating thread to new threads. | John Baldwin | 2021-03-12 | 1 | -4/+6
  POSIX states that new threads created via pthread_create() should inherit the "floating point environment" from the creating thread.
  Discussed with: kib
  MFC after: 1 week
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D29204
* amd64: Cleanups to setting TLS registers for Linux binaries. | John Baldwin | 2021-03-12 | 2 | -21/+5
  - Use update_pcb_bases() when updating FS or GS base addresses to permit use of FSBASE and GSBASE in Linux processes. This also sets PCB_FULL_IRET. linux32 was setting PCB_32BIT which should be a no-op (exec sets it).
  - Remove write-only variables to construct unused segment descriptors for linux32.
  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D29026
* amd64: Only update fsbase/gsbase in pcb for curthread. | John Baldwin | 2021-03-12 | 1 | -2/+4
  Before the pcb is copied to the new thread during cpu_fork() and cpu_copy_thread(), the kernel re-reads the current register values in case they are stale. This is done by setting PCB_FULL_IRET in pcb_flags.
  This works fine for user threads, but the creation of kernel processes and kernel threads does not follow the normal synchronization rules for pcb_flags. Specifically, new kernel processes are always forked from thread0, not from curthread, so adjusting pcb_flags via a simple instruction without the LOCK prefix can race with thread0 running on another CPU. Similarly, kthread_add() clones from the first thread in the relevant kernel process, not from curthread.
  In practice, Netflix encountered a panic where the pcb_flags in the first kthread of the KTLS process were trashed due to update_pcb_bases() in cpu_copy_thread() running from thread0 to create one of the other KTLS threads racing with the first KTLS kthread calling fpu_kern_thread() on another CPU. In the panicking case, the write to update pcb_flags in fpu_kern_thread() was lost, triggering an "Unregistered use of FPU in kernel" panic when the first KTLS kthread later tried to use the FPU.
  Reported by: gallatin
  Discussed with: kib
  MFC after: 1 week
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D29023
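The lost write described above is the classic hazard of a non-atomic read-modify-write on a shared word. The following is only an illustrative user-space sketch, not kernel code: the flag names and values are hypothetical stand-ins for bits in pcb_flags, and the racing interleaving is replayed on a single thread so the outcome is deterministic. Note that, per the commit title, the actual fix restricts the update to curthread rather than switching to atomic operations; the atomic variant is shown only for contrast.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits, standing in for bits kept in pcb_flags. */
#define SKETCH_FLAG_FULL_IRET 0x01u
#define SKETCH_FLAG_KERNFPU   0x02u

/*
 * Replay of a racing interleaving of two plain (non-LOCKed)
 * read-modify-write sequences on the same word. Both "CPUs" load the
 * old value before either stores, so the second store overwrites the
 * first and one flag is lost.
 */
static uint32_t
sketch_lost_update(void)
{
	uint32_t flags = 0;
	uint32_t cpu0_tmp = flags;		/* CPU 0 loads flags */
	uint32_t cpu1_tmp = flags;		/* CPU 1 loads flags */

	cpu1_tmp |= SKETCH_FLAG_KERNFPU;
	flags = cpu1_tmp;			/* CPU 1 stores its result */
	cpu0_tmp |= SKETCH_FLAG_FULL_IRET;
	flags = cpu0_tmp;			/* CPU 0 stores, dropping KERNFPU */
	return (flags);
}

/*
 * The same two updates done as atomic read-modify-write operations.
 * Each OR is indivisible, so neither update can be lost regardless of
 * the order in which they execute.
 */
static uint32_t
sketch_atomic_update(void)
{
	_Atomic uint32_t flags = 0;

	atomic_fetch_or(&flags, SKETCH_FLAG_KERNFPU);
	atomic_fetch_or(&flags, SKETCH_FLAG_FULL_IRET);
	return (atomic_load(&flags));
}
```

In the replayed interleaving only the last writer's bit survives, which mirrors how the flag set by fpu_kern_thread() could disappear.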
* amd64 pmap: convert to counter(9), add PV and pagetable page counts | Jason A. Harmening | 2021-03-09 | 3 | -123/+155
  This change converts most of the counters in the amd64 pmap from global atomics to scalable counter(9) counters. Per discussion with kib@, it also removes the handrolled per-CPU PCID save count as it isn't considered generally useful.
  The bulk of these counters remain guarded by PV_STATS, as it seems unlikely that they will be useful outside of very specific debugging scenarios. However, this change does add two new counters that are available without PV_STATS. pt_page_count and pv_page_count track the number of active page table pages and physical-to-virtual list pages, respectively. These will be useful in evaluating the memory footprint of pmap structures under various workloads, which will help to guide future changes in this area.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D28923
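The scalability gain of per-CPU counters over a single global atomic can be sketched in user space as follows. This is not the counter(9) API: the type and function names are hypothetical, the CPU count is fixed at an arbitrary value, and a real implementation would also keep each slot on its own cache line and prevent migration while updating.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NCPU 4	/* assumed CPU count, for illustration only */

/* One slot per CPU; each CPU only ever writes its own slot. */
struct sketch_counter {
	uint64_t slot[SKETCH_NCPU];
};

/* Update path: touches only the calling CPU's slot, so no shared
 * cache line bounces between CPUs and no atomic is required. */
static void
sketch_counter_add(struct sketch_counter *c, unsigned cpu, int64_t inc)
{
	assert(cpu < SKETCH_NCPU);
	c->slot[cpu] += (uint64_t)inc;
}

/* Read path: sums all slots. Slower than reading one word, but reads
 * of statistics are rare compared with updates. */
static uint64_t
sketch_counter_fetch(const struct sketch_counter *c)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < SKETCH_NCPU; i++)
		sum += c->slot[i];
	return (sum);
}
```

The trade-off is that updates become cheap and contention-free while a fetch costs one pass over all slots.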
* Rename _cscan_atomic.h and _cscan_bus.h to atomic_san.h and bus_san.h | Mark Johnston | 2021-03-08 | 1 | -1/+1
  Other kernel sanitizers (KMSAN, KASAN) require interceptors as well, so put these in a more generic place as a step towards importing the other sanitizers.
  No functional change intended.
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D29103
* Only set delayed inval for procs using PTI | Eric van Gyzen | 2021-03-05 | 1 | -1/+2
  invltlb_invpcid_pti_handler() was requesting delayed TLB invalidation even for processes that aren't using PTI. With an out-of-tree change to avoid PTI for non-jailed root processes, this caused an assertion failure in pmap_activate_sw_pcid_pti() when context-switching between PTI and non-PTI processes.
  Reviewed by: bdrewery kib tychon
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D29094
* acpi: Make nexus_acpi quiet on amd64 and i386 | Mark Johnston | 2021-03-05 | 1 | -1/+1
  Otherwise during attach newbus prints "nexus0", which is not very useful. The generic nexus device is already quiet, as is nexus_acpi on arm64.
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
* pmap: Fix largemap restart checks in the kernel_maps sysctl handler | Mark Johnston | 2021-02-25 | 1 | -6/+18
  The purpose of these checks is to ensure that the address of the next-level page table page is valid, since nothing is synchronizing with a concurrent update of the large map and large map PTPs are freed to the system. However, if PG_PS is set, there is no next level.
  Reported by: rpokala
  Reviewed by: kib
  Tested by: rpokala
  MFC after: 3 days
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D28922
* Limit when we call DELAY from KCSAN on amd64 | Andrew Turner | 2021-02-25 | 1 | -1/+11
  In some cases the DELAY implementation on amd64 can recurse on a spin mutex in the i8254 early delay code. Detect when this is going to happen and don't call DELAY in this case. It is safe not to delay here; the only consequence is that KCSAN may not detect some data races.
  Reviewed by: kib
  Tested by: arichardson
  Sponsored by: Innovate UK
  Differential Revision: https://reviews.freebsd.org/D28895
* smbios: Move smbios driver out from x86 machdep code | Allan Jude | 2021-02-23 | 3 | -32/+2
  Add it to the x86 GENERIC and MINIMAL kernels.
  Sponsored by: Ampere Computing LLC
  Submitted by: Klara Inc.
  Reviewed by: rpokala
  Differential Revision: https://reviews.freebsd.org/D28738
* amd64: implement strlen in assembly, take 2 | Mateusz Guzik | 2021-02-21 | 1 | -0/+66
  Tested with glibc test suite.
  The C variant in libkern performs excessive branching to find the zero byte instead of using the bsfq instruction. The same code patched to use it is still slower than the routine implemented here as the compiler keeps neglecting to perform certain optimizations (like using leaq).
  On top of that the routine can be used as a starting point for copyinstr which operates on words instead of bytes.
  The previous attempt had an instance of swapped operands to andq when dealing with the fully aligned case, which had a side effect of breaking the code for certain corner cases. Noted by jrtc27.
  Sample results:
    $(perl -e "print 'A' x 3"): stock: 211198039, patched: 338626619, asm: 465609618
    $(perl -e "print 'A' x 100"): stock: 83151997, patched: 98285919, asm: 120719888
  Reviewed by: jhb, kib
  Differential Revision: https://reviews.freebsd.org/D28779
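The idea of scanning a word at a time and using a bit-scan to locate the terminator can be sketched in C as below. This is not the committed assembly routine: it is a minimal illustration that assumes a little-endian machine (as amd64 is), a GCC/Clang-style compiler providing `__builtin_ctzll` (the counterpart of bsfq), and a caller-supplied buffer that is readable up to a multiple of 8 bytes past the terminator. All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ONES  0x0101010101010101ULL
#define SKETCH_HIGHS 0x8080808080808080ULL

/*
 * Return the index (0..7) of the first zero byte in w, counting from
 * the least significant byte, or 8 if w contains no zero byte.
 * (w - ONES) & ~w & HIGHS sets the high bit of every zero byte; bits
 * set spuriously by borrow propagation can only appear above the first
 * real zero byte, so the lowest set bit is always correct.
 */
static unsigned
sketch_first_zero_byte(uint64_t w)
{
	uint64_t mask = (w - SKETCH_ONES) & ~w & SKETCH_HIGHS;

	if (mask == 0)
		return (8);
	return ((unsigned)__builtin_ctzll(mask) / 8);
}

/*
 * Word-at-a-time strlen over a padded buffer. The caller must
 * guarantee that buf is readable in whole 8-byte chunks up to and
 * including the chunk that holds the terminator.
 */
static size_t
sketch_strlen_padded(const char *buf)
{
	size_t off = 0;

	for (;;) {
		uint64_t w;
		unsigned idx;

		memcpy(&w, buf + off, sizeof(w));	/* alignment-safe load */
		idx = sketch_first_zero_byte(w);
		if (idx < 8)
			return (off + idx);
		off += sizeof(w);
	}
}
```

Bytes with the high bit set, such as 0xa4, are a useful corner case to check in any such routine, since the mask arithmetic operates on exactly that bit.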
* Add a VA_IS_CLEANMAP() macro. | John Baldwin | 2021-02-18 | 1 | -3/+2
  This macro returns true if a provided virtual address is contained in the kernel's clean submap.
  In CHERI kernels, the buffer cache and transient I/O map are allocated as separate regions. Abstracting this check reduces the diff relative to FreeBSD. It is perhaps slightly more readable as well.
  Reviewed by: kib
  Obtained from: CheriBSD
  Sponsored by: DARPA
  Differential Revision: https://reviews.freebsd.org/D28710
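A containment check of this kind typically reduces to a half-open range test. The sketch below only illustrates that shape; the structure, its bounds, and the macro name are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of a submap: [start, end). */
struct sketch_map_bounds {
	uintptr_t start;
	uintptr_t end;
};

/* Arbitrary example values, chosen only for the demonstration. */
static struct sketch_map_bounds sketch_clean = { 0x1000, 0x5000 };

/* True if va lies inside the half-open range [start, end). */
#define SKETCH_VA_IS_CLEANMAP(va)					\
	((uintptr_t)(va) >= sketch_clean.start &&			\
	 (uintptr_t)(va) < sketch_clean.end)
```

Hiding the comparison behind one macro means a kernel that lays the regions out differently only has to change the macro body, not every call site.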
* linux: Unmap the VDSO page when unloading | Mark Johnston | 2021-02-16 | 2 | -2/+4
  linux_shared_page_init() creates an object and grabs and maps a single page to back the VDSO. When destroying the VDSO object, we failed to destroy the mapping and free KVA. Fix this.
  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D28696
* xen/boot: allow specifying boot method when booted from Xen | Roger Pau Monné | 2021-02-16 | 1 | -4/+0
  Allow setting the bootmethod variable from the Xen PVH entry point, in order to be able to correctly set the underlying firmware mode when booted as a dom0.
  Move the bootmethod variable to be defined in x86/cpu_machdep.c instead so it can be shared by both i386 and amd64.
  Sponsored by: Citrix Systems R&D
  Reviewed by: kib
  Differential revision: https://reviews.freebsd.org/D28619
* linux: drop unneeded casts | Edward Tomasz Napierala | 2021-02-15 | 1 | -3/+3
  No functional changes.
  Sponsored By: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D28533
* Revert "amd64: implement strlen in assembly" | Mateusz Guzik | 2021-02-09 | 1 | -66/+0
  This reverts commit af366d353b84bdc4e730f0fc563853abc338271c.
  Trips over '\xa4' byte and terminates early, as found in the lib/libc/gen/setdomainname_test:setdomainname_basic testcase.
  However, keep moving libkern/strlen.c out of conf/files.
  Reported by: lwhsu
* amd64: fix up a braino in strlen comment | Mateusz Guzik | 2021-02-08 | 1 | -1/+1
* amd64: implement strlen in assembly | Mateusz Guzik | 2021-02-08 | 1 | -0/+66
  The C variant in libkern performs excessive branching to find the zero byte instead of using the bsfq instruction. The same code patched to use it is still slower than the routine implemented here as the compiler keeps neglecting to perform certain optimizations (like using leaq).
  On top of that the routine can be used as a starting point for copyinstr which operates on words instead of bytes.
  Tested with glibc test suite.
  Sample results (calls/s):
    Haswell:
      $(perl -e "print 'A' x 3"): stock: 211198039, patched: 338626619, asm: 465609618
      $(perl -e "print 'A' x 100"): stock: 83151997, patched: 98285919, asm: 120719888
    AMD EPYC 7R32:
      $(perl -e "print 'A' x 3"): stock: 282523617, asm: 491498172
      $(perl -e "print 'A' x 100"): stock: 114857172, asm: 112082057
* amd64 GENERIC: compile in mlx5en(4) | Konstantin Belousov | 2021-02-05 | 1 | -0/+9
  Reviewed by: hselasky, manu
  Sponsored by: NVidia Networking/Mellanox Technologies
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D28469
* Add a comment notifying that "device axp" requires miibus for build. | Muhammad Moinur Rahman | 2021-02-04 | 2 | -1/+3
  miibus is not required at run time if the RJ-45 interface is not being used, but it is still a build-time dependency.
  Reviewed by: imp, manu, rajesh1.kumar@amd.com
  Approved by: imp, manu, rajesh1.kumar@amd.com
  Differential Revision: https://reviews.freebsd.org/D28465
* bhyve/ioapic: improve the tracking of IRR bit | Roger Pau Monné | 2021-02-02 | 1 | -4/+18
  One common method of EOI'ing an interrupt at the IO-APIC level is to switch the pin to edge triggering mode and then back into level mode. That would cause the IRR bit to be cleared and thus further interrupts to be injected. FreeBSD does indeed use that method if the IO-APIC EOI register is not supported.
  The bhyve IO-APIC emulation code didn't clear the IRR bit when doing that switch, and was also missing acknowledging the IRR state when trying to inject an interrupt in vioapic_send_intr.
  Reviewed by: grehan
  Differential revision: https://reviews.freebsd.org/D28238
* bhyve/ioapic: only account for asserted line in level mode | Roger Pau Monné | 2021-02-02 | 1 | -0/+2
  After modifying a redirection entry only try to inject an interrupt if the pin is in level mode. Pins in edge mode shouldn't take into account the line assert status as they are triggered by edge changes, not the line status itself.
  Reviewed by: grehan
  Differential revision: https://reviews.freebsd.org/D28237
* bhyve/vioapic: remove an extra pin masked check | Roger Pau Monné | 2021-02-02 | 1 | -3/+1
  vioapic_send_intr does already check whether the pin is masked before injecting the interrupt, there's no need to do it in vioapic_write also.
  No functional change intended.
  Reviewed by: grehan
  Differential revision: https://reviews.freebsd.org/D28236
* amd64: use compiler intrinsics for bsf* and bsr* | Mateusz Guzik | 2021-02-01 | 1 | -32/+4
* amd64: move memcmp checks upfront | Mateusz Guzik | 2021-01-31 | 1 | -23/+29
  This is a tradeoff which saves jumps for smaller sizes while making the 8-16 range slower (roughly in line with the other cases).
  Tested with glibc test suite.
  For example size 3 (most common with vfs namecache) (ops/s): before: 407086026, after: 461391995
  The regressed range of 8-16 (with 8 as example): before: 540850489, after: 461671032
* amd64: retire sse2_pagezero | Mateusz Guzik | 2021-01-30 | 2 | -25/+0
  All page zeroing is using temporal stores with rep stos*; the routine has been unused for several years.
  Should a need arise for zeroing using non-temporal stores, a more optimized variant can be implemented with a more descriptive name.
* amd64: add missing ALIGN_TEXT to loops in memset and memmove | Mateusz Guzik | 2021-01-30 | 1 | -0/+3
* Remove ndis(4) remnants from kernel configs | Mateusz Guzik | 2021-01-26 | 1 | -5/+0
  Unbreaks LINT kernels.
* linux: map EBUSY returned by ptrace into ESRCH | Edward Tomasz Napierala | 2021-01-19 | 1 | -2/+6
  The ptrace(2) Linux man page claims the syscall returns ESRCH if the tracee is not stopped; the native ptrace(2) returns EBUSY.
  Sponsored by: The FreeBSD Foundation
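The translation described above amounts to rewriting one error code and passing every other value through. The helper below is a hypothetical user-space sketch of that rule only; it does not reproduce the Linuxulator code, and it ignores the separate step of converting native errno values to Linux numbering.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: a Linux caller expects ESRCH when the tracee is
 * not stopped, while the native call reports EBUSY, so rewrite that
 * one value and leave everything else (including 0) unchanged.
 */
static int
sketch_map_ptrace_error(int native_error)
{
	return (native_error == EBUSY ? ESRCH : native_error);
}
```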
* linux: fix PTRACE_POKEDATA and PTRACE_POKETEXT. | Edward Tomasz Napierala | 2021-01-19 | 1 | -2/+7
  Sponsored by: The FreeBSD Foundation
* KTLS: Enable KERN_TLS in GENERIC on amd64 | Andrew Gallatin | 2021-01-18 | 1 | -0/+1
  Based on discussions on freebsd-arch@, enable KERN_TLS in GENERIC on amd64, but leave it disabled via the sysctl kern.ipc.tls.enable. Users wishing to enable ktls must set kern.ipc.tls.enable=1.
  While here, fix wording in NOTES to mention that KERN_TLS now also does receive.
  Sponsored by: Netflix
  Reviewed by: allanjude
  Differential Revision: https://reviews.freebsd.org/D28163
* hid: Replace USBHID_ENABLED kernel config option with loader tunable | Vladimir Kondratyev | 2021-01-14 | 1 | -3/+0
  usbhid(4) is disabled by default to avoid conflicts with existing USB HID drivers. To enable it, place the following lines in /boot/loader.conf:
    hw.usb.usbhid.enable=1
    usbhid_load="YES"
  Suggested by: jhb
  Reviewed by: hselasky
  Differential revision: https://reviews.freebsd.org/D28124
* Split out the NODEBUG options to a common file | Andrew Turner | 2021-01-14 | 1 | -12/+1
  This is the superset of the nooptions found in the -DEBUG kernels.
  Reviewed by: emaste, manu
  Sponsored by: Innovate UK
  Differential Revision: https://reviews.freebsd.org/D28152
* amd64: use builtins for all ffs* variants | Mateusz Guzik | 2021-01-14 | 1 | -14/+3
  While here even up whitespace.
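For readers unfamiliar with the compiler builtins involved, the wrappers below show how the ffs family can be expressed with the GCC/Clang `__builtin_ffs*` functions. This is an illustrative sketch with hypothetical wrapper names, not the contents of the changed header. Each function returns one plus the index of the least significant set bit, or 0 when no bit is set.

```c
#include <assert.h>

/* Find first set bit in an int: 1-based index, 0 if mask is 0. */
static inline int
sketch_ffs(int mask)
{
	return (__builtin_ffs(mask));
}

/* Same for long. */
static inline int
sketch_ffsl(long mask)
{
	return (__builtin_ffsl(mask));
}

/* Same for long long. */
static inline int
sketch_ffsll(long long mask)
{
	return (__builtin_ffsll(mask));
}
```

Letting the compiler see the operation, rather than an opaque inline assembly statement, allows it to constant-fold and schedule the bit scan like any other expression.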
* Enable accelerated AES-XTS software crypto in GENERIC. | John Baldwin | 2021-01-13 | 1 | -0/+1
  In particular, using GELI on a root filesystem will only use accelerated software crypto drivers if they are available before the root filesystem is mounted. While these modules can be loaded from the loader, including them in GENERIC provides a better out-of-the-box experience for users.
  Both aesni(4) and armv8crypto(4) provide accelerated implementations of the default cipher used by GELI (AES-XTS) in addition to other ciphers.
  Reviewed by: mhorne, allanjude, markj
  Differential Revision: https://reviews.freebsd.org/D28100
* Convert remaining cap_rights_init users to cap_rights_init_one | Mateusz Guzik | 2021-01-12 | 1 | -1/+2
  semantic patch:
    @@
    expression rights, r;
    @@
    - cap_rights_init(&rights, r)
    + cap_rights_init_one(&rights, r)
* amd64: fix tlb shootdown when all cpus are passed in the bitmap | Mateusz Guzik | 2021-01-12 | 1 | -9/+6
  Right now the routine leaves the current CPU in the map, later tripping on an assert when filling in the scoreboard:
    panic: IPI scoreboard is zero, initiator 1 target 1
  Instead pre-check if all CPUs are present in the map and remember that outcome for later.
  Fixes: 7eaea04a5bb1dc86 ("amd64: compare TLB shootdown target to all_cpus")
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D28111
* amd64: compare TLB shootdown target to all_cpus | Andrew Gallatin | 2021-01-12 | 1 | -2/+2
  On amd64, the pmap code passes all_cpus to smp_targeted_tlb_shootdown() when unmapping from the kernel pmap. This function has an optimized path to send IPIs to all but itself, which it intends to do when the target is all CPUs. However, we need to compare the target CPU mask with all_cpus, rather than using CPU_ISFULLSET(). Comparing with CPU_ISFULLSET() will only work when we have MAXCPU CPUs active in the system; otherwise, we'll be sending repeated IPIs, rather than a single IPI to all CPUs but ourself.
  Fixing this should reduce the time spent in native_lapic_ipi_wait() as we will be sending IPIs in parallel, rather than one-by-one. This is confirmed by dtrace.
  Reviewed by: alc, jhb, kib, markj
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D28102
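The difference between "every bit in the set is on" and "the set equals the CPUs that exist" can be shown with a toy bitmask. In this sketch a 64-bit word stands in for the kernel's CPU set type, the capacity of 64 plays the role of MAXCPU, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU set: bit i set means CPU i is a member. Capacity is 64. */
typedef uint64_t sketch_cpuset_t;

/* Analogue of a "full set" test: true only if all 64 bits are set,
 * i.e. only on a machine that really has as many CPUs as the set
 * can hold. */
static bool
sketch_isfullset(sketch_cpuset_t mask)
{
	return (mask == UINT64_MAX);
}

/* The check the commit describes: does the target equal the set of
 * CPUs actually present? */
static bool
sketch_targets_all(sketch_cpuset_t mask, sketch_cpuset_t all_cpus)
{
	return (mask == all_cpus);
}
```

On a hypothetical 4-CPU machine, `all_cpus` is `0xf`: the full-set test rejects it, so the optimized broadcast path would never be taken, whereas the comparison against `all_cpus` accepts it.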
* amd64 pmap: do not sleep in pmap_allocpte_alloc() with zero referenced page table page. | Konstantin Belousov | 2021-01-11 | 1 | -31/+35
  Otherwise parallel pmap_allocpte_alloc() for nearby va might also fail allocating page table page and free the page under us. The end result is that we could dereference unmapped pte when doing cleanup after sleep.
  Instead, on allocation failure, first free everything, only then we can drop pmap mutex and sleep safely, right before returning to caller. Split inner non-sleepable part of the pmap_allocpte_alloc() into a new helper pmap_allocpte_nosleep().
  Reviewed by: markj
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27956
* amd64 pmap: rename _pmap_allocpte() to pmap_allocpte_alloc(). | Konstantin Belousov | 2021-01-11 | 1 | -16/+16
  The function performs actual allocation of pte, as opposed to pmap_allocpte() that uses existing free pte if pt page is already there.
  This also moves the function out of a namespace that resembles the one reserved by the language.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27956
* amd64 pmap: Remove wrong __unused annotation from the va argument. | Konstantin Belousov | 2021-01-11 | 1 | -1/+1
  Noted by: alc
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27956
* amd64 pmap: fix NULL deref in pmap_mincore(). | Konstantin Belousov | 2021-01-11 | 1 | -0/+3
  pmap_pdpe() might return NULL, check for it.
  Reviewed by: markj
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27956
* xen/privcmd: implement the dm op ioctl | Roger Pau Monne | 2021-01-11 | 1 | -0/+7
  Use an interface compatible with the Linux one so that the user-space libraries already using the Linux interface can be used without much modification.
  This allows user-space to make use of the dm_op family of hypercalls, which are used by device models.
  Sponsored by: Citrix Systems R&D
* Prefer the use of vm_page_domain() to vm_phys_domain(). | Alan Cox | 2021-01-10 | 1 | -2/+2
  When we already have the vm page in hand, use vm_page_domain() instead of vm_phys_domain(). The former has a trivial constant-time implementation whereas the latter iterates over the mem_affinity array.
  Reviewed by: kib, markj
  Differential Revision: https://reviews.freebsd.org/D28005
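The cost difference described above can be illustrated with a toy model: one lookup scans a table of physical address ranges, the other reads information already associated with the page. The structures, table contents, and function names below are hypothetical, and storing the domain directly in the page structure is an assumption made for the sketch rather than a description of the kernel's layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical affinity table: physical range [start, end) -> domain. */
struct sketch_affinity {
	uint64_t start;
	uint64_t end;
	int domain;
};

static const struct sketch_affinity sketch_affinity[] = {
	{ 0x0000, 0x1000, 0 },
	{ 0x1000, 0x3000, 1 },
};

/* Hypothetical page descriptor that already records its domain. */
struct sketch_page {
	uint64_t phys_addr;
	int domain;
};

/* Lookup by physical address: linear scan over the table.
 * Returns -1 if the address is not covered by any range. */
static int
sketch_phys_domain(uint64_t pa)
{
	size_t n = sizeof(sketch_affinity) / sizeof(sketch_affinity[0]);

	for (size_t i = 0; i < n; i++) {
		if (pa >= sketch_affinity[i].start &&
		    pa < sketch_affinity[i].end)
			return (sketch_affinity[i].domain);
	}
	return (-1);
}

/* Lookup with the page in hand: constant time, no scan. */
static int
sketch_page_domain(const struct sketch_page *m)
{
	return (m->domain);
}
```

Both functions give the same answer for a valid page, so preferring the second one when the page is already available changes cost only, not behavior.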
* hid: Add recently imported drivers to NOTES | Vladimir Kondratyev | 2021-01-10 | 1 | -1/+7
  Reviewed by: hselasky
  Differential revision: https://reviews.freebsd.org/D28060
* x86: Add rdtscp32() into cpufunc.h. | Konstantin Belousov | 2021-01-10 | 1 | -0/+9
  Suggested by: markj
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27986
* amd64 pmap: add comment explaining TLB invalidation modes. | Konstantin Belousov | 2021-01-10 | 1 | -17/+144
  Requested and reviewed by: alc
  Discussed with: markj
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D25815
* pccard: Remove wi(4) driver | Warner Losh | 2021-01-08 | 1 | -1/+0
  Remove wi(4). pccard is going away, and wi only supports PC Card devices, though it has a minor amount of glue to also support PCI cards. However, removing the one without removing the other is hard, so the whole driver is being removed.
  Relnotes: Yes