path: root/sys/amd64
Commit message (Author, Age, Files, Lines)
* amd64/vmm.c: Fix an incorrect memory segment check in vm_iommu_{un}map (Bojan Novković, 5 days, 1 file, -4/+4)
| | | | | | | | | | | | | This change fixes two checks that conflated memory mapping and memory segment identifiers. In both cases the code iterates over all memory mappings but passes the index to `vm_memseg_sysmem`, which is wrong. Fix this by passing the memory mapping's segment identifier instead. Differential Revision: https://reviews.freebsd.org/D54210 Reviewed by: markj Fixes: c76c2a19ae37 PR: 290920
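The mapping-index versus segment-identifier distinction can be illustrated with a small standalone sketch. The structures and helper names below (`mem_seg`, `mem_map`, `seg_is_sysmem`, `count_sysmem_maps`) are hypothetical simplifications for illustration and are not the actual vmm definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_NSEGS 4 /* hypothetical number of memory segments */
#define EX_NMAPS 4 /* hypothetical number of memory mappings */

/* Hypothetical, simplified stand-in for a guest memory segment. */
struct mem_seg {
    bool sysmem; /* true if the segment backs guest system memory */
};

/* Hypothetical, simplified stand-in for a guest memory mapping. */
struct mem_map {
    size_t len; /* 0 means the mapping slot is unused */
    int segid;  /* identifier of the segment backing this mapping */
};

/* Look up a segment by its identifier, not by a mapping index. */
static bool
seg_is_sysmem(const struct mem_seg *segs, int segid)
{
    if (segid < 0 || segid >= EX_NSEGS)
        return (false);
    return (segs[segid].sysmem);
}

/* Count the in-use mappings that are backed by system memory. */
static int
count_sysmem_maps(const struct mem_seg *segs, const struct mem_map *maps)
{
    int i, n;

    n = 0;
    for (i = 0; i < EX_NMAPS; i++) {
        if (maps[i].len == 0)
            continue;
        /* Pass the mapping's segment id; passing i would be the bug. */
        if (seg_is_sysmem(segs, maps[i].segid))
            n++;
    }
    return (n);
}
```

When several mappings share one segment, indexing the segment table by the loop counter gives a different answer than indexing by `segid`, which is the class of error the commit describes.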
* Add sys/_align.h replacing machine/_align.h (Brooks Davis, 10 days, 2 files, -6/+1)
| | | | | | | | | | | | | | | | Define _ALIGNBYTES using sizeof(void *) (no functional change on any existing architecture) which will allow it to work with CHERI, where we must align things up to capability alignment. In _ALIGN, replace integer manipulation which does not preserve pointer provenance with a type- and provenance-preserving builtin. This requires modest changes in code which assumes _ALIGN returns an integer, but those are relatively rare. Reviewed by: kib, markj Effort: CHERI upstreaming Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D53947
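A rough sketch of the idea follows. The commit message does not name the builtin, so this assumes something that behaves like Clang's `__builtin_align_up`, with an integer-based fallback for compilers that lack it. The `EX_` macros and `align_ptr()` are illustrative names, not the contents of sys/_align.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Alignment mask derived from the pointer size, as the commit describes. */
#define EX_ALIGNBYTES (sizeof(void *) - 1)

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_align_up)
/* Assumed builtin: keeps the argument's type (and pointer provenance). */
#define EX_ALIGN(p) __builtin_align_up((p), EX_ALIGNBYTES + 1)
#else
/* Fallback: classic integer arithmetic, which yields an integer. */
#define EX_ALIGN(p) \
    ((uintptr_t)(((uintptr_t)(p) + EX_ALIGNBYTES) & ~(uintptr_t)EX_ALIGNBYTES))
#endif

/* Round a pointer up to the next pointer-size boundary. */
static char *
align_ptr(char *p)
{
    /* The cast is a no-op with the builtin and required with the fallback. */
    return ((char *)EX_ALIGN(p));
}
```

Callers that treated the result as an integer are the ones that would need the "modest changes" the message mentions, since a type-preserving builtin returns a pointer when given a pointer.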
* sys: RealTek -> Realtek (ykla, 2025-11-27, 1 file, -2/+2)
| | | | | | | | | | | Realtek changed how it styled its name 25 or so years ago, but the old style persisted in many places. These products use the new styling in their datasheets. Signed-off-by: ykla yklaxds@gmail.com Sponsored by: Chinese FreeBSD Community Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1901
* NOTES: Remove duplicate options KCSAN entries (ykla, 2025-11-25, 1 file, -1/+0)
| | | | | | | Signed-off-by: ykla yklaxds@gmail.com Sponsored by: Chinese FreeBSD Community Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1900
* vmm: Initialize AMD IOMMU command buffers (Chuck Tuffli, 2025-11-12, 1 file, -9/+1)
| | | | | | | | | | | | | | | | | The driver communicates with the AMD IOMMU by writing to the tail of a fixed length command ring buffer. After issuing cmd_max commands, the tail pointer wraps back to the beginning of the ring buffer. Now, each command buffer entry will contain content from previous commands which may set bits in fields marked as Reserved for the current command. In some cases, the hardware will return an ILLEGAL_COMMAND_ERROR event when this occurs. Fix is to memset the command buffer prior to use. PR: 270966 Reviewed by: corvink, kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D53692
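The stale-bits problem and its fix can be sketched with a toy ring buffer. The entry layout, ring size, and function name here are hypothetical and far simpler than the real AMD IOMMU command format.

```c
#include <stdint.h>
#include <string.h>

#define EX_RING_ENTRIES 4 /* hypothetical ring size (cmd_max in the message) */

/* Hypothetical command entry; the real hardware format differs. */
struct ring_cmd {
    uint32_t opcode;
    uint32_t reserved; /* must be zero for some commands */
    uint64_t operand;
};

struct cmd_ring {
    struct ring_cmd slots[EX_RING_ENTRIES];
    unsigned int tail;
};

/* Return the next slot, zeroed, and advance the tail with wraparound. */
static struct ring_cmd *
ring_next(struct cmd_ring *r)
{
    struct ring_cmd *c;

    c = &r->slots[r->tail];
    memset(c, 0, sizeof(*c)); /* drop bits left over from the previous lap */
    r->tail = (r->tail + 1) % EX_RING_ENTRIES;
    return (c);
}
```

Without the `memset()`, a slot reused after the tail wraps would still carry whatever an earlier command wrote into fields that the new command treats as reserved.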
* random: allow disabling of entropy harvesting from keyboard & mice (David E. O'Brien, 2025-11-11, 2 files, -0/+4)
| | | | | | Reviewed by: jmg Sponsored by: Juniper Networks Differential Revision: https://reviews.freebsd.org/D53390
* random: TPM_HARVEST should have been named RANDOM_ENABLE_TPM (David E. O'Brien, 2025-11-10, 2 files, -2/+10)
| | | | | | | | | | | * Enable RANDOM_ENABLE_TPM by default * The commit of TPM_HARVEST failed to add it to NOTES so that the LINT kernel would build the code. Fixes: 4ee7d3b0118c82e651712bb65da53d08e78cd7b1 Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D53460
* vmm: Move vm_maxcpu handling into MI code (Mark Johnston, 2025-11-04, 3 files, -25/+1)
| | | | | | | | | | No functional change intended. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53477
* amd64/vmm: Remove an unused function (Mark Johnston, 2025-11-04, 2 files, -13/+0)
| | | | | | | | Reviewed by: corvink, emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53423
* vmm: Consolidate VM name length checking (Mark Johnston, 2025-11-04, 3 files, -30/+2)
| | | | | | | | | | | | | | | | | | | vm_create() is only called from one place. Rather than having similar checks everywhere, move them to vmmdev_create(). We can safely assume that the name is nul-terminated, since the vmmctl ioctl handler and the legacy sysctl handler ensure this. So, don't bother with strnlen(). Finally, make sure that the name buffers are the same size on all platforms. VM_MAX_NAMELEN is supposed to be the maximum, not including the nul terminator. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53422
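A minimal sketch of a single, centralized name check is shown below, assuming the caller guarantees nul termination as the message states. The limit value, the rejection of empty names, and the function name `check_vm_name()` are assumptions for illustration and are not taken from vmmdev_create().

```c
#include <errno.h>
#include <string.h>

/* Hypothetical limit; like VM_MAX_NAMELEN, it excludes the nul terminator. */
#define EX_MAX_NAMELEN 8

/* Buffers that hold a name need one extra byte for the terminator. */
#define EX_NAME_BUFSIZE (EX_MAX_NAMELEN + 1)

/* Validate a nul-terminated VM name; returns 0 or an errno value. */
static int
check_vm_name(const char *name)
{
    size_t len;

    /* strlen() is enough because termination is guaranteed upstream. */
    len = strlen(name);
    if (len == 0 || len > EX_MAX_NAMELEN)
        return (EINVAL);
    return (0);
}
```

Defining the limit without the terminator means a name of exactly `EX_MAX_NAMELEN` characters is valid and fits in a buffer of `EX_NAME_BUFSIZE` bytes.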
* vmm: Move the module load handler to vmm_dev.c (Mark Johnston, 2025-11-04, 1 file, -73/+10)
| | | | | | | | | | | | | | | | | Move the vmm_initialized check out of vm_create() and into the legacy sysctl handler. If vmm_initialized is false, /dev/vmmctl will not be available and so cannot be used to create VMs. Introduce new MD vmm_modinit() and vmm_modcleanup() routines which handle MD (de)initialization. No functional change intended. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53421
* amd64/vmm: Remove useless global variables (Mark Johnston, 2025-11-04, 1 file, -8/+2)
| | | | | | | | | | No functional change intended. Reviewed by: corvink, jhb, emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53420
* amd64/vmm: Factor vcpu_notify_event() into two functions (Mark Johnston, 2025-11-04, 4 files, -21/+28)
| | | | | | | | | | | | | | | | | | | | vcpu_notify_event() previously took a boolean parameter which determines whether the implementation should try to use a posted interrupt. On arm64 and riscv, the implementation of vcpu_notify_event() is otherwise identical to that of amd64. With the aim of deduplicating vcpu state management code, introduce a separate amd64-only function which tries to use posted interrupts. This requires some duplication with vcpu_notify_event_locked(), but only a little bit. Then, fix up callers. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53419
* vmm: Fix routines which create maps of the guest physical address space (Mark Johnston, 2025-10-28, 3 files, -35/+38)
| | | | | | | | | | | | | | | | | | | In vm_mmap_memseg(), use vm_map_insert() instead of vm_map_find(). Existing callers expect to map the GPA that they passed, whereas vm_map_find() merely treats the GPA as a hint. Also check for overflow and remove a test for first < 0 since "first" is unsigned. In vmm_mmio_alloc(), return an error number instead of an object pointer, since the sole caller doesn't need the pointer. As in vm_mmap_memseg(), use vm_map_insert() instead of vm_map_find() and validate parameters. This function is not directly reachable via ioctl(), but we ought to be careful anyway. Reviewed by: corvink, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53246
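The overflow check on an unsigned range can be sketched as follows. The function name `gpa_range_ok()` and its `limit` parameter are hypothetical; the real validation in vm_mmap_memseg() may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check that [first, first + len) is a non-empty range that neither wraps
 * around nor extends past limit.  All values are unsigned, so a test such
 * as "first < 0" would always be false and is omitted.
 */
static bool
gpa_range_ok(uint64_t first, uint64_t len, uint64_t limit)
{
    uint64_t last;

    if (len == 0)
        return (false);
    last = first + len; /* unsigned addition wraps on overflow */
    if (last < first)
        return (false);
    return (last <= limit);
}
```

Comparing the sum against one of its operands is a common way to detect unsigned wraparound without relying on wider integer types.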
* amd64: Add kexec support (Justin Hibbits, 2025-10-27, 4 files, -0/+444)
| | | | | | | | | | | The biggest difference between this and arm64 kexec is that we can't disable the MMU for amd64, we have to instead create a new "safe" page table that the trampoline and "child" kernel can use. This requires a lot more work to create identity mappings, etc. Reviewed by: kib Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D51623
* amd64: Add cpu_stop() support to go UP after SMP (Justin Hibbits, 2025-10-27, 3 files, -0/+25)
| | | | | | Reviewed by: kib Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D51622
* amd64: print 'EFI RT fault' line before fault CPU state (Konstantin Belousov, 2025-10-24, 1 file, -2/+2)
| | | | | | Suggested by: arrowd Sponsored by: The FreeBSD Foundation MFC after: 3 days
* padlock(4)/nehemiah: move i386-only entropy source to MD files (David E. O'Brien, 2025-10-23, 2 files, -2/+0)
| | | | | Reviewed by: khng Differential Revision: https://reviews.freebsd.org/D53309
* vmm: Move local variables into ioctl handlers (Mark Johnston, 2025-10-21, 1 file, -86/+151)
| | | | | | | | | | | Make the ioctl handlers easy to read by moving local variables into per-ioctl blocks. No functional change intended. Reviewed by: corvink, emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53145
* vmm: Add PRIV_DRIVER checks for passthru ioctls (Mark Johnston, 2025-10-21, 1 file, -7/+11)
| | | | | | | | | | | In preparation for allowing non-root users to create and access bhyve VMs, add privilege checks for ioctls which operate on passthru devices. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53144
* vmm: Improve register get/set handling a bit (Mark Johnston, 2025-10-21, 1 file, -1/+2)
| | | | | | | | | | | | | | | | | On non-amd64 platforms, check for negative register indices. This isn't required today since we match against individual register indices, but we might as well check it. On amd64, add a comment explaining why we permit negative register indices. Use mallocarray() for allocating register arrays in the ioctl layer. No functional change intended. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53143
* sgx: Migrate to use macro LINUX_IOCTL_SET to register linux ioctl handler (Zhenlei Huang, 2025-10-20, 1 file, -10/+1)
| | | | | | Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D53158
* pt: Switch to swi(9) (Bojan Novković, 2025-10-17, 1 file, -98/+123)
| | | | | | | | | | | | | | | The pt hwt(4) backend uses NMIs to receive updates about the latest tracing buffer offsets from the tracing hardware. However, it uses taskqueue(9) to schedule the bottom-half handler. This can lead to a panic since the taskqueue(9) code isn't aware it's being called from an NMI context and uses the regular scheduling interfaces. Fix this by scheduling the bottom-half handler using swi(9) and the SWI_FROMNMI flag. Fixes: 310162ea218a MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D52491
* vmm: Fix a deadlock between vm_smp_rendezvous() and vcpu_lock_all() (Mark Johnston, 2025-10-17, 2 files, -30/+150)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vm_smp_rendezvous() invokes a callback on all vCPUs, blocking the initiator until all vCPUs have responded. vcpu_lock_all() blocks each vCPU by waiting for it to go idle and setting the vCPU state to frozen. These two operations can deadlock on each other, particularly when booting a Windows guest, when vcpu_lock_all() blocks waiting for a rendezvous initiator, and the initiator is blocked waiting for the vCPU thread which called vcpu_lock_all() to invoke the rendezvous callback. Implement vcpu_lock_all() in a way that avoids deadlocks with vm_smp_rendezvous(). In particular, when traversing vCPUs, invoke the rendezvous callback on the vCPU's behalf to help the initiator finish. We can only safely do so when the vCPU is IDLE or we have already locked it, otherwise we may be racing with the target vCPU thread. Thus: - Use an exclusive lock to serialize vcpu_lock_all() callers, which lets us lock vCPUs out of order without fear of deadlock with parallel vcpu_lock_all() callers. - If a rendezvous is pending, lock all idle vCPUs and invoke the callback on their behalf. If the vcpu_lock_all() caller is itself a vCPU thread, this will handle that thread. - Block waiting for all non-idle vCPUs to idle, or until one of them initiates a rendezvous, in which case we go back and invoke callbacks on behalf of already-locked vCPUs. Note that on !amd64 no changes are needed since there is no rendezvous mechanism, so there is a separate vcpu_set_state_all() for them based on the previous vcpu_lock_all(). These will be merged together once vcpu state handling is consolidated into sys/dev/vmm. Reviewed by: corvink (previous version) MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D52968
* imgact: Mark brandinfo and note structures as const (Mark Johnston, 2025-10-14, 3 files, -20/+18)
| | | | | | | | No functional change intended. Reviewed by: olce, kib, emaste MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D53062
* vmm: Move the guest vmspace into the generic vm_mem structure (Mark Johnston, 2025-10-10, 3 files, -28/+19)
| | | | | | | | | | | | | This further consolidates handling of guest memory into MI code in sys/dev/vmm. No functional change intended. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53012
* vmm: Make vmmops declarations more consistent (Mark Johnston, 2025-10-10, 2 files, -46/+54)
| | | | | | | | | | | | | | | | | | | - On amd64, make vmmops_* functions globally visible, as some will be called from machine-independent code in the future. - On arm64 and riscv, move declarations to vmm.h, since they're supposed to be generic across different VMM backends (only amd64 has more than one backend). - Make the declaration macros consistent with each other. - On amd64, make the function typedef names consistent with the corresponding ifunc names. No functional change intended. Reviewed by: corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D53011
* power: Add stype parameter in power_suspend/resume eventhandlers (Aymeric Wibo, 2025-10-06, 1 file, -2/+2)
| | | | | | | | | | Add enum sleep_type stype parameter in power_suspend/resume event handlers, since with the introduction of s2idle there is more than one type of suspend. Reviewed by: bz Approved by: bz Sponsored by: The FreeBSD Foundation
* amd64: bump sleepq hash size to 2048 (Mateusz Guzik, 2025-09-30, 1 file, -1/+10)
| | | | | | | | | This is the most contended lock type during the first hour of -j 104 poudriere. Drops significantly with the change. Note there are suspicious acquires which most likely don't need to happen, artificially exacerbating the problem.
* u2f(4): Invert U2F_MAKE_UHID_ALIAS kernel build option (Vladimir Kondratyev, 2025-09-25, 1 file, -1/+0)
| | | | | | | This makes non-GENERIC kernel configs easier to maintain. Requested by: glebius MFC after: 2 days
* amd64: add wrmsr_early_safe(9) (Konstantin Belousov, 2025-09-24, 3 files, -0/+53)
| | | | | | | | | | | | | | The variant of wrmsr_safe(9) that might work before IDT and curpcb are initialized. Assumes BSP, and that all APs are parked. Before calling wrmsr_early_safe(), wrmsr_early_safe_start() should be called; afterward, wrmsr_early_safe_end() restores the bootenv IDT. Reviewed by: markj Tested by: glebius Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D52607
* amd64 cpufunc.h: add rcs(), to read code selector (Konstantin Belousov, 2025-09-24, 1 file, -0/+9)
| | | | | | | Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D52607
* x86: directly use clflushopt mnemonic in cpufunc.h (Konstantin Belousov, 2025-09-21, 1 file, -1/+1)
| | | | | | | | | We already use clflushopt in support.S, there is no reason to manually construct the encoding. Initially it was done because toolchains did not support the (then) new instruction. Sponsored by: The FreeBSD Foundation MFC after: 1 week
* vmm: Suspend the VM before destroying it (Mark Johnston, 2025-09-10, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we don't do anything to kick vcpu threads out of a sleep state when destroying a VM. For instance, suppose a guest executes hlt on amd64 or wfi on arm64 with interrupts disabled. Then, bhyvectl --destroy will hang until the vcpu thread somehow comes out of vm_handle_hlt()/vm_handle_wfi() since destroy_dev() is waiting for vCPU threads to drain. Note that on amd64, if hw.vmm.halt_detection is set to 1 (the default), the guest will automatically exit in this case since it's treated as a shutdown. But, the above should not hang if halt_detection is set to 0. Here, vm_suspend() wakes up vcpu threads, and a subsequent attempt to run the vCPU will result in an error which gets propagated to userspace, allowing destroy_dev() to proceed. Add a new suspend code for this purpose. Modify bhyve to exit with status 4 ("exited due to an error") when it's received, since that's what'll happen generally when the VM is destroyed asynchronously. Reported by: def MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D51761
* sys: NOTES: Fix comment for wlan_* devices; GENERIC*: Re-order 'wlan_tkip' (Olivier Certner, 2025-09-09, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | Fix the comment introducing the 'wlan_*' devices (AES-CCMP is missing) after introducing AES-GCMP. While here, re-order the devices in order of appearance of the related technologies. No functional change (intended). Reviewed by: adrian, emaste Fixes: 7bf82ea4fdda ("sys: add wlan_gcmp to GENERIC kernels as appropriate") MFC after: 3 days MFC to: stable/15 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52444
* sys: Rename BLOAT_KERNEL_WITH_EXTERR to EXTERR_STRINGS (Ed Maste, 2025-09-03, 1 file, -1/+1)
| | | | | | | | | There's no need for an implied value judgement. Suggested by: jhb Reviewed by: kib, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52351
* amd64 vmx: micro-optimize vmlaunch failure path (Konstantin Belousov, 2025-08-24, 1 file, -5/+3)
| | | | | | | | | | | Eliminate two unneeded jumps. One is the jmp to the next instruction, where there is no requirement that vmlaunch is followed by jmp. Another one conditionally sets %r11d value, and can be replaced by cmovcc. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D52136
* u2f(4): a HID driver for FIDO/U2F security keys (Vladimir Kondratyev, 2025-08-17, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | While FIDO/U2F keys were already supported by the generic uhid(4) and hidraw(4) drivers, this driver adds some additional features and takes steps to tighten the security of FIDO/U2F access. - It automatically loads through devd. - Automatically enables HQ_NO_READAHEAD for FIDO/U2F devices. - Implements only a minimum set of features. - Does not require external devfs configuration to set character device permissions. - Names the character device u2f/# to make Capsicum or any other pledge()-style sandboxing possible. PR: 265528 Differential Revision: https://reviews.freebsd.org/D51612
* Revert "amd64: re-enable la57" (Konstantin Belousov, 2025-08-17, 1 file, -1/+1)
| | | | | | | This reverts commit 2abf24b3698c08c9fc906580fd5be67be65c9feb. la57 should not be force-enabled. Sponsored by: The FreeBSD Foundation
* amd64: re-enable la57 (Konstantin Belousov, 2025-08-17, 1 file, -1/+1)
| | | | | | | | | It benefits KVA. For userspace, la57 has been disabled by default for quite some time, to avoid compat issues. Reviewed by: alc, imp, olce Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D51929
* amd64 GENERIC: Add ufshci (Jaeyoon Choi, 2025-08-16, 1 file, -0/+3)
| | | | | | Sponsored by: Samsung Electronics Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D51507
* bhyve: Support and advertise 15-bit MSI Extended Destination ID (David Woodhouse, 2025-08-13, 3 files, -1/+15)
| | | | | | | | | | | To support guests with more than 255 vCPUs, allow bits 5-11 of the MSI address to be used as additional destination ID bits. This is compatible with Hyper-V, KVM and Xen's implementation of the same enlightenment, as documented at http://david.woodhou.se/ExtDestId.pdf Reviewed by: kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1797 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
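The address decoding can be sketched as below, assuming the conventional x86 MSI layout in which address bits 12-19 hold the low 8 bits of the destination ID, and assuming bits 5-11 supply destination ID bits 8-14. The function name `msi_dest_id()` is hypothetical and this is not the bhyve implementation.

```c
#include <stdint.h>

/*
 * Extract the destination APIC ID from an MSI address.
 * Bits 12-19 carry the low 8 bits.  When the extended destination ID
 * feature is enabled, bits 5-11 are assumed to carry bits 8-14, giving
 * 15 bits in total.
 */
static uint32_t
msi_dest_id(uint64_t addr, int ext_dest_id)
{
    uint32_t dest;

    dest = (uint32_t)((addr >> 12) & 0xff);
    if (ext_dest_id)
        dest |= (uint32_t)((addr >> 5) & 0x7f) << 8;
    return (dest);
}
```

With the feature disabled the function ignores bits 5-11, so a guest that never sets them sees the same 8-bit behavior as before.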
* bhyve: Add CPUID_BHYVE_FEATURES leaf (David Woodhouse, 2025-08-13, 1 file, -3/+15)
| | | | | | | | | | | | | | This allows the hypervisor to advertise features to the guest. The first such feature is CPUID_BHYVE_EXT_DEST_ID which advertises that 15 bits of target APIC ID are available in MSI (and I/O APIC) interrupts, as documented in http://david.woodhou.se/ExtDestId.pdf This defines the guest ABI. The actual implementation will come in a subsequent commit. Reviewed by: kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1797 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
* amd64/pmap: include opt_kstack_pages.h (Ka Ho Ng, 2025-08-03, 1 file, -0/+1)
| | | | | | | | | | | This fixes an early KASAN initialization panic in pmap_san_enter_early_alloc_4k, when a non-default value is specified for KSTACK_PAGES in the build config file. Sponsored by: Juniper Networks, Inc. MFC after: 7 days Reviewed by: des, markj Differential Revision: https://reviews.freebsd.org/D51709
* Revert "amd64: include opt_kstack_pages.h" (Ka Ho Ng, 2025-08-02, 2 files, -4/+2)
| | | | | | | | This reverts commit d5ec97156d3314f979629968f76151c2d35a1e62. The commit broke the build. Reported by: des
* amd64 pmap: Use INVPCID_CTXGLOB on Ryzen processors (Alan Cox, 2025-08-02, 2 files, -38/+45)
| | | | | | | | | | | Recent AMD Ryzen processors support a limited form of the invpcid instruction, even when they do not support PCID functionality. In particular, they support the type 2 form of the instruction, what we call INVPCID_CTXGLOB. This is faster than toggling PGE in cr4. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D51565
* amd64: include opt_kstack_pages.h (Ka Ho Ng, 2025-08-01, 2 files, -2/+4)
| | | | | | | | | | | | | | | This fixes an early KASAN initialization panic in pmap_san_enter_early_alloc_4k, when a non-default value is specified for KSTACK_PAGES in the build config file. Also, rearrange amd64/locore.S's #include order to match the counterparts of other architectures. amd64/locore.S now also explicitly includes opt_kstack_pages.h. Sponsored by: Juniper Networks, Inc. MFC after: 7 days Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D51676
* amd64: Remove a stray syzkaller config (Mark Johnston, 2025-07-28, 1 file, -5/+0)
| | | | | Reported by: mjg Fixes: 6efe8e6be413 ("pf: Fix a lock leak in pf_ioctl_addrule()")
* pf: Fix a lock leak in pf_ioctl_addrule() (Mark Johnston, 2025-07-28, 1 file, -0/+5)
| | | | | | | | | | | The ERROUT macro assumes that the rules lock is held, but some error paths arise before that lock is acquired. Introduce ERROUT_UNLOCKED for that case. Reviewed by: kp Reported by: syzkaller Fixes: cc68decda316 ("pf: Reject rules with invalid port ranges") Differential Revision: https://reviews.freebsd.org/D51571
* vmm: Add support for guest NUMA emulation (Bojan Novković, 2025-07-27, 1 file, -1/+6)
| | | | | | | | | | | | | This change adds the necessary kernelspace bits required for supporting NUMA domains in bhyve VMs. The layout of system memory segments and how they're created has been reworked. Each guest NUMA domain will now have its own memory segment. Furthermore, this change allows users to tweak the domain's backing vm_object domainset(9) policy. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D44565