path: root/sys/sys
Commit message (Author, Age, Files, Lines)
* make maximum interrupt number tunable on ARM, ARM64, MIPS, and RISC-V (Oleksandr Tymoshenko, 2021-01-19, 2 files, -5/+3)
| | | | | | | | | | | | Use a machdep.nirq tunable instead of the compile-time constant NIRQ as the value for the maximum number of interrupts. This keeps the system footprint small by default, with an option to increase the limit for large systems like server-grade ARM64. Reviewed by: mhorne Differential Revision: https://reviews.freebsd.org/D27844 Submitted by: Klara, Inc. Sponsored by: Ampere Computing
* jail: Add prison_isvalid() and prison_isalive() (Jamie Gritton, 2021-01-18, 1 file, -0/+2)
| | | | | | | | | | | | | | | | | | | | prison_isvalid() checks if a prison record can be used at all, i.e. pr_ref > 0. This filters out prisons that aren't fully created, and those that are either in the process of being dismantled, or will be at the next opportunity. While the check for pr_ref > 0 is simple enough to make without a convenience function, this prepares the way for other measures of prison validity. prison_isalive() checks not only validity as far as the usability of the prison structure, but also whether the prison is visible to user space. It replaces a test for pr_uref > 0, which is currently only used within kern_jail.c, and not often there. Both of these functions also assert that either the prison mutex or allprison_lock is held, since it's generally the case that unlocked prisons aren't guaranteed to remain useable for any length of time. This isn't entirely true, for example a thread can assume its own prison is good, but most exceptions will exist inside of kern_jail.c.
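The two checks described in this commit can be modelled in userland roughly as follows. This is a sketch, not the kernel code: `struct model_prison`, the `model_` function names, and the `locked` field (a stand-in for "prison mutex or allprison_lock held") are assumptions made for illustration. Only `pr_ref` and `pr_uref` come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct prison. */
struct model_prison {
	int	pr_ref;		/* all references; > 0 means the record is usable */
	int	pr_uref;	/* user references; > 0 means visible to user space */
	bool	locked;		/* models "prison mutex or allprison_lock is held" */
};

/* Valid: the record can be used at all. */
static bool
model_prison_isvalid(const struct model_prison *pr)
{
	assert(pr->locked);	/* models the lock assertion in the message */
	return (pr->pr_ref > 0);
}

/* Alive: valid, and also visible to user space. */
static bool
model_prison_isalive(const struct model_prison *pr)
{
	return (model_prison_isvalid(pr) && pr->pr_uref > 0);
}
```

In this model, a prison that still has kernel references but no user references (for example, one being dismantled) is valid but not alive.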
* Implement malloc_domainset_aligned(9). (Konstantin Belousov, 2021-01-17, 1 file, -0/+3)
| | | | | | | | | | | | | | Change the power-of-two malloc zones to require alignment equal to the size [*]. Current uma allocator already provides such alignment, so in fact this change does not change anything except providing future-proof setup. Suggested by: markj [*] Reviewed by: andrew, jah, markj Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28147
* Bump __FreeBSD_version after linuxkpi changes (Emmanuel Vadot, 2021-01-17, 1 file, -1/+1)
|
* fd: add refcount argument to falloc_noinstall (Mateusz Guzik, 2021-01-13, 1 file, -1/+3)
| | | | | | | | This lets callers avoid atomic ops by initializing the count to required value from the get go. While here add falloc_abort to backpedal from this without having to fdrop.
* fd: add finstall_refed (Mateusz Guzik, 2021-01-13, 1 file, -0/+2)
| | | | | Can be used to consume an already existing reference and consequently avoid atomic ops.
* fd: provide a dedicated closef variant for unix socket code (Mateusz Guzik, 2021-01-13, 1 file, -0/+1)
| | | | This avoids testing for td != NULL.
* vfs: add NDFREE_NOTHING and convert several NDFREE_PNBUF callers (Mateusz Guzik, 2021-01-12, 1 file, -0/+2)
| | | | Check the comment above the routine for reasoning.
* Bump __FreeBSD_version after linuxkpi changes (Emmanuel Vadot, 2021-01-12, 1 file, -1/+1)
|
* lio_listio: validate aio_lio_opcode (Alan Somers, 2021-01-12, 1 file, -1/+1)
| | | | | | | | | | | | | | | Previously, we would accept any kind of LIO_* opcode, including ones that were intended for in-kernel use only like LIO_SYNC (which is not defined in userland). The situation became more serious with 022ca2fc7fe08d51f33a1d23a9be49e6d132914e. After that revision, setting aio_lio_opcode to LIO_WRITEV or LIO_READV would trigger an assertion. Note that POSIX does not specify what should happen if aio_lio_opcode is invalid. MFC-with: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e Reviewed by: jhb, tmunro, 0mp Differential Revision: https://reviews.freebsd.org/D28078
* jobc: rework detection of orphaned groups. (Konstantin Belousov, 2021-01-10, 1 file, -1/+3)
| | | | | | | | | | | | | | | | | | | | | | | | Instead of trying to maintain pg_jobc counter on each process group update (and sometimes before), just calculate the counter when needed. Still, for the benefit of the signal delivery code, explicitly mark orphaned groups as such with the new process group flag. This way we prevent bugs in the corner cases where updates to the counter were missed due to complicated configuration of p_pptr/p_opptr/real_parent (debugger). Since we need to iterate over all children of the process on exit, this change mostly affects the process group entry and leave, where we need to iterate all process group members to detect orphaned status. (For MFC, keep pg_jobc around but unused). Reported by: jhb Reviewed by: jilles Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27871
* pgrp: Prevent use after free. (Konstantin Belousov, 2021-01-10, 1 file, -1/+1)
| | | | | | | | | | | | | | | | Often, we have a process locked and need to get a locked process group. In this case, because the process group lock is before the process lock, unlocking the process allows the group to be freed. See for instance tty_wait_background(). Make pgrp structures allocated from nofree zone, and ensure type stability of the pgrp mutex. Reviewed by: jilles Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27871
* efidev: remove EFIIOC_GET_TABLE ioctl (Kyle Evans, 2021-01-08, 1 file, -7/+0)
| | | | | | | | | | | | | | | | | This ioctl would instantly induce a panic, likely since near inception, up until 0861c7d3e048. Lack of previous interest in fixing it combined with the problematic interface (exports a pointer, really a physical address) brings us to the natural conclusion: remove it until a useful consumer comes forward. If it eventually gets resurrected, the interface should definitely not return in this exact form and likely needs to be reimagined. The associated KPI, efi_get_table, is left intact for the time being. Reviewed by: imp, jrtc27 Also discussed with: brooks, jhb Differential Revision: https://reviews.freebsd.org/D28030
* Move the PMC overflow count to make it per-CPU (Andrew Turner, 2021-01-08, 1 file, -1/+1)
| | | | | | | | | | Virtual PMCs could be running on multiple CPUs so this needs to be a per-CPU value. Submitted by: rwatson (earlier version) Reviewed by: gnn Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D27973
* Fix conflicting value of O_DSYNC. (Thomas Munro, 2021-01-08, 1 file, -1/+1)
| | | | | | | O_RESOLVE_BENEATH recently took value 0x00800000, but I failed to spot that while rebasing. Let's use 0x01000000 for the new O_DSYNC flag. Reported by: kevans
* Regenerate syscall files after reallocation of aio_writev/aio_readv (Alan Somers, 2021-01-08, 3 files, -16/+16)
|
* aio_fsync(2): Support O_DSYNC. (Thomas Munro, 2021-01-08, 1 file, -0/+1)
| | | | | | | aio_fsync(O_DSYNC, ...) is the asynchronous version of fdatasync(2). Reviewed by: kib, asomers, jhb Differential Review: https://reviews.freebsd.org/D25071
* open(2): Add O_DSYNC flag. (Thomas Munro, 2021-01-08, 2 files, -6/+10)
| | | | | | | | | | | | | | | POSIX O_DSYNC means that writes include an implicit fdatasync(2), just as O_SYNC implies fsync(2). VOP_WRITE() functions that understand the new IO_DATASYNC flag can act accordingly, but we'll still pass down IO_SYNC so that file systems that don't understand it will continue to provide the stronger O_SYNC behaviour. Flag also applies to fcntl(2). Reviewed by: kib, delphij Differential Revision: https://reviews.freebsd.org/D25090
* hid: Add UPDATING entry and bump __FreeBSD_version (Vladimir Kondratyev, 2021-01-07, 1 file, -1/+1)
| | | | | Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D28019
* libkern/strcasestr.c: Drop xlocale support and connect to build. (Vladimir Kondratyev, 2021-01-07, 1 file, -0/+1)
| | | | | Reviewed by: markj, hselasky Differential revision: https://reviews.freebsd.org/D27866
* vfs: add vn_seqc_read_notmodify (Mateusz Guzik, 2021-01-06, 1 file, -0/+1)
|
* seqc: add seqc_read_notmodify (Mateusz Guzik, 2021-01-06, 1 file, -0/+7)
| | | | | | The routine can be used when the caller does not expect to ever have to wait for anything. Checking later with seqc_consistent retains all the guarantees.
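A userland model may help show why a non-waiting read is still safe. It assumes the usual sequence-counter scheme, which the commit message does not spell out: the counter is odd while a writer is active. All `model_` names are hypothetical and C11 atomics stand in for the kernel primitives. The read masks off the "modify in progress" bit instead of spinning, so a snapshot taken mid-write can never match the counter afterwards, and the later consistency check fails as it should.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t model_seqc_t;
#define	MODEL_SEQC_MOD	1u	/* low bit: a modification is in progress */

static void
model_seqc_write_begin(_Atomic model_seqc_t *s)
{
	atomic_fetch_add(s, 1);	/* counter becomes odd */
}

static void
model_seqc_write_end(_Atomic model_seqc_t *s)
{
	atomic_fetch_add(s, 1);	/* counter becomes even again */
}

/* Never waits: returns the counter with the in-progress bit cleared. */
static model_seqc_t
model_seqc_read_notmodify(_Atomic model_seqc_t *s)
{
	return (atomic_load(s) & ~MODEL_SEQC_MOD);
}

/* True only if no writer started or finished since `old` was read. */
static bool
model_seqc_consistent(_Atomic model_seqc_t *s, model_seqc_t old)
{
	return (atomic_load(s) == old);
}
```

If the read lands in the middle of a write, the masked value is the even number just below the live counter. The counter then goes to that value plus two, so the snapshot can never compare equal.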
* cache: combine fast path enabled status into one flag (Mateusz Guzik, 2021-01-06, 1 file, -0/+1)
| | | | Tested by: pho
* Add missing structs to pmclog_entry (Andrew Turner, 2021-01-05, 1 file, -0/+3)
| | | | | | | This is used to size allocations so needs to include all log structs to ensure its size is correct. Sponsored by: Innovate UK
* vfs: denote vnode being a mount point with VIRF_MOUNTPOINT (Mateusz Guzik, 2021-01-03, 1 file, -0/+1)
| | | | | Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27794
* vfs: add v_irflag accessors (Mateusz Guzik, 2021-01-03, 1 file, -1/+9)
| | | | | Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27793
* Regenerate syscall files after addition of aio_writev/aio_readv (Alan Somers, 2021-01-03, 3 files, -0/+14)
|
* Add aio_writev and aio_readv (Alan Somers, 2021-01-03, 1 file, -1/+21)
| | | | | | | | | | | | | | POSIX AIO is great, but it lacks vectored I/O functions. This commit fixes that shortcoming by adding aio_writev and aio_readv. They aren't part of the standard, but they're an obvious extension. They work just like their synchronous equivalents pwritev and preadv. It isn't yet possible to use vectored aiocbs with lio_listio, but that could be added in the future. Reviewed by: jhb, kib, bcr Relnotes: yes Differential Revision: https://reviews.freebsd.org/D27743
* loader: implement framebuffer console (Toomas Soome, 2021-01-02, 1 file, -3/+4)
| | | | | | | | | | | | | | | | | | | | | | | Draw console on efi. Add vbe framebuffer for BIOS loader (vbe off, vbe on, vbe list, vbe set xxx). autoload font (/boot/fonts) based on resolution and font size. Add command loadfont (set font by file) and variable screen.font (set font by size). Pass loaded font to kernel. Export variables: screen.height screen.width screen.depth Add gfx primitives to draw the screen and put png image on the screen. Rework menu draw to iterate list of consoles to enable device specific output. Probably something else I forgot... Relnotes: yes Differential Revision: https://reviews.freebsd.org/D27420
* fd: inline pwd_get_smr (Mateusz Guzik, 2021-01-01, 1 file, -1/+1)
| | | | Tested by: pho
* bitset: implement BIT_TEST_CLR_ATOMIC & BIT_TEST_SET_ATOMIC (Ryan Libby, 2020-12-31, 1 file, -0/+14)
| | | | | | | | | | That is, provide wrappers around the atomic_testandclear and atomic_testandset primitives. Submitted by: jeff Reviewed by: cem, kib, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22702
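The idea of an atomic test-and-set over a multi-word bitset can be sketched in portable C11. This is an illustration only. The real macros wrap the kernel's atomic_testandset/atomic_testandclear primitives, whereas the hypothetical `model_` functions below use `atomic_fetch_or` and `atomic_fetch_and` as stand-ins, and they assume each operation reports whether the bit was set beforehand.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define	MODEL_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit n; return true if it was already set. */
static bool
model_bit_test_set_atomic(_Atomic unsigned long *words, size_t n)
{
	unsigned long mask = 1UL << (n % MODEL_BITS_PER_WORD);
	unsigned long old;

	old = atomic_fetch_or(&words[n / MODEL_BITS_PER_WORD], mask);
	return ((old & mask) != 0);
}

/* Atomically clear bit n; return true if it was previously set. */
static bool
model_bit_test_clr_atomic(_Atomic unsigned long *words, size_t n)
{
	unsigned long mask = 1UL << (n % MODEL_BITS_PER_WORD);
	unsigned long old;

	old = atomic_fetch_and(&words[n / MODEL_BITS_PER_WORD], ~mask);
	return ((old & mask) != 0);
}
```

Because the test and the update happen in a single atomic operation, two threads racing to claim the same bit cannot both see "was clear".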
* copyrights: Happy New Year 2021 (Glen Barber, 2020-12-31, 1 file, -4/+2)
| | | | | | Good riddance 2020. Sponsored by: Rubicon Communications, LLC (netgate.com)
* kern: efirt: correct configuration table entry size (Kyle Evans, 2020-12-29, 1 file, -1/+1)
| | | | | | | | | | | | | Each entry actually stores a native pointer, not a uint64_t quantity. While we're here, go ahead and export the pointer as-is rather than converting it to KVA. This may be more useful as consumers can map /dev/mem and observe the entry. For reference, see: sys/contrib/edk2/Include/Uefi/UefiSpec.h Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27669
* Correct font.h comment describing vfnt font maps (Ed Maste, 2020-12-28, 1 file, -5/+7)
| | | | | | | | | Commit 41fb06651122 doubled the number of glyph maps in the vfnt format from 2 to 4 to support double-width characters, but a comment describing the maps was not updated to match. MFC after: 1 week Sponsored by: The FreeBSD Foundation
* vfs: add FAILIFEXISTS flag (Mateusz Guzik, 2020-12-28, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | Both FreeBSD and Linux mkdir -p walk the tree up ignoring any EEXIST on the way and both are used a lot when building respective kernels. This poses a problem as spurious locking avoidably interferes with concurrent operations like getdirentries on affected directories. Work around the problem by adding FAILIFEXISTS flag. In case of lockless lookup this manages to avoid any work to begin with, there is no speed up for the locked case but perhaps this can be augmented later on. For simplicity the only supported semantics are as used by mkdir. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D27789
* Regen. (Konstantin Belousov, 2020-12-27, 3 files, -2/+11)
|
* Expose eventfd in the native API/ABI using a new __specialfd syscall (Konstantin Belousov, 2020-12-27, 5 files, -1/+103)
| | | | | | | | | | | | | | | | | | | | eventfd is a Linux system call that produces special file descriptors for event notification. When porting Linux software, it is currently usually emulated by epoll-shim on top of kqueues. Unfortunately, kqueues are not passable between processes. And, as noted by the author of epoll-shim, even if they were, the library state would also have to be passed somehow. This came up when debugging strange HW video decode failures in Firefox. A native implementation would avoid these problems and help with porting Linux software. Since we now already have an eventfd implementation in the kernel (for the Linuxulator), it's pretty easy to expose it natively, which is what this patch does. Submitted by: greg@unrelenting.technology Reviewed by: markj (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26668
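For readers unfamiliar with the interface, a minimal round trip through an eventfd looks roughly like this. The sketch assumes the libc exposes the Linux-style `eventfd(initval, flags)` wrapper in <sys/eventfd.h>. The commit message names only the __specialfd syscall, so the wrapper is an assumption here, and `eventfd_roundtrip()` is a hypothetical helper.

```c
#include <sys/eventfd.h>

#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: post `count` to a fresh eventfd and read it back.
 * Returns the value read, or -1 on any error.
 */
static int64_t
eventfd_roundtrip(uint64_t count)
{
	uint64_t got = 0;
	int fd;

	fd = eventfd(0, 0);		/* counter starts at 0, no flags */
	if (fd == -1)
		return (-1);
	if (write(fd, &count, sizeof(count)) != (ssize_t)sizeof(count) ||
	    read(fd, &got, sizeof(got)) != (ssize_t)sizeof(got)) {
		close(fd);
		return (-1);
	}
	close(fd);
	return ((int64_t)got);		/* read returns and resets the counter */
}
```

Unlike a kqueue, the resulting descriptor is an ordinary file descriptor that can be inherited or passed to another process, which is the property the message above is after.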
* jail: Consistently handle the pr_allow bitmask (Jamie Gritton, 2020-12-27, 1 file, -0/+1)
| | | | | | | | | | | | | | Return a boolean (i.e. 0 or 1) from prison_allow, instead of the flag value itself, which is what sysctl expects. Add prison_set_allow(), which can set or clear a permission bit, and propagates cleared bits down to child jails. Use prison_allow() and prison_set_allow() in the various jail.allow.* sysctls, and others that depend on those permissions. Add locking around checking both pr_allow and pr_enforce_statfs in prison_priv_check().
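The behaviour described here, a boolean query plus a setter that pushes cleared bits down the jail tree, can be illustrated with a small userland model. `struct model_jail` and the `model_` functions are hypothetical. The real code operates on struct prison under the appropriate locks, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a jail in the prison hierarchy. */
struct model_jail {
	unsigned int		 pr_allow;	/* permission bitmask */
	struct model_jail	*child;		/* first child jail */
	struct model_jail	*sibling;	/* next jail with same parent */
};

/* Return 0 or 1, not the raw flag value. */
static bool
model_prison_allow(const struct model_jail *pr, unsigned int flag)
{
	return ((pr->pr_allow & flag) != 0);
}

/* Set or clear a bit; a cleared bit is also cleared in all descendants. */
static void
model_prison_set_allow(struct model_jail *pr, unsigned int flag, bool enable)
{
	struct model_jail *c;

	if (enable) {
		pr->pr_allow |= flag;	/* setting does not propagate */
		return;
	}
	pr->pr_allow &= ~flag;
	for (c = pr->child; c != NULL; c = c->sibling)
		model_prison_set_allow(c, flag, false);
}
```

Only clearing propagates: a child can never hold a permission its parent lacks, while granting a permission to the parent leaves each child's setting alone.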
* jail: Make comments on struct prison locking more precise (Jamie Gritton, 2020-12-27, 1 file, -3/+5)
|
* Add tcgetwinsize(3) and tcsetwinsize(3) to termios (Konstantin Belousov, 2020-12-25, 2 files, -11/+50)
| | | | | | | | | | | | | | These functions get/set tty winsize respectively, and are trivial wrappers around corresponding termio ioctls. The functions are expected to be a part of POSIX.1 issue 8: https://www.austingroupbugs.net/view.php?id=1151#c3856. They are currently available in NetBSD and in musl libc. PR: 251868 Submitted by: Soumendra Ganguly <soumendraganguly@gmail.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27650
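Since the message describes these functions as trivial wrappers around the window-size ioctls, they can be approximated as below. The `sketch_` names are hypothetical, chosen so they do not collide with a libc that already ships the real functions. The use of TIOCGWINSZ and TIOCSWINSZ is an assumption about which ioctls are meant.

```c
#include <sys/ioctl.h>

/* Sketch of tcgetwinsize(3): fetch the tty window size for fd. */
static int
sketch_tcgetwinsize(int fd, struct winsize *w)
{
	return (ioctl(fd, TIOCGWINSZ, w));
}

/* Sketch of tcsetwinsize(3): set the tty window size for fd. */
static int
sketch_tcsetwinsize(int fd, const struct winsize *w)
{
	return (ioctl(fd, TIOCSWINSZ, w));
}
```

A caller would typically pass STDOUT_FILENO and then read `ws_row` and `ws_col` from the returned structure. Both wrappers return -1 and set errno if the descriptor is not a terminal.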
* version bump for commit 665b1365fe8e24d618d63b0d57b0b4ad39e97824 (Rick Macklem, 2020-12-23, 1 file, -1/+1)
| | | | | | The commit changed the internal API between the NFS and kernel RPC modules. Bump the version so that all modules get rebuilt from sources.
* AIO: remove the kaiocb->bio linkage (Alan Somers, 2020-12-23, 1 file, -8/+1)
| | | | | | | | | | | | Vectored aio will require each aiocb to be associated with multiple bios, so we can't store a link to the latter from the former. But we don't really need to. aio_biowakeup already knows the bio it's using, and the other fields can be stored within the bio and/or buf itself. Also, remove the unused kaiocb.backend2 field. Reviewed By: kib Differential Revision: https://reviews.freebsd.org/D27682
* Add ELF flag to disable ASLR stack gap. (Konstantin Belousov, 2020-12-18, 2 files, -0/+2)
| | | | | | | | | | | | Also centralize and unify checks to enable ASLR stack gap in a new helper exec_stackgap(). PR: 239873 Sponsored by: The FreeBSD Foundation MFC after: 1 week Notes: svn path=/head/; revision=368772
* proc.h: Reformat P_ and P2_ definitions. (Konstantin Belousov, 2020-12-18, 1 file, -45/+66)
| | | | | | | | | | | | Use traditional explicit leading zero format for hex numbers. Align P2_ hex values. Wrap long lines by splitting comments. Sponsored by: The FreeBSD Foundation MFC after: 1 week Notes: svn path=/head/; revision=368771
* Improve handling of alternate settings in the USB stack. (Hans Petter Selasky, 2020-12-15, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | Allow setting the alternate interface number to fail when there is only one alternate setting present, to comply with the USB specification. Refactor how iface->num_altsetting is computed. Bump the __FreeBSD_version due to change of core USB structure. PR: 251856 MFC after: 1 week Submitted by: Ma, Horse <Shichun.Ma@dell.com> Sponsored by: Mellanox Technologies // NVIDIA Networking Notes: svn path=/head/; revision=368659
* Patch annotation in sigdeferstop (Mateusz Guzik, 2020-12-13, 1 file, -1/+1)
| | | | | | | | Probability flipped since sigdefer handling was moved away from regular VOP calls. Notes: svn path=/head/; revision=368616
* fd: fix fdrop prediction when closing a fd (Mateusz Guzik, 2020-12-13, 1 file, -0/+11)
| | | | | | | Most of the time this is the last reference, contrary to typical fdrop use. Notes: svn path=/head/; revision=368609
* Provide userland notification of gpio pin changes ("userland gpio interrupts"). (Ian Lepore, 2020-12-12, 1 file, -4/+60)
| This is an import of the Google Summer of Code 2018 project completed by Christian Kramer (and, sadly, ignored by us for two years now). The goals stated for that project were:
|
|     FreeBSD already has support for interrupts implemented in the GPIO controller drivers of several SoCs, but there are no interfaces to take advantage of them out of user space yet. The goal of this work is to implement such an interface by providing descriptors which integrate with the common I/O system calls and multiplexing mechanisms.
|
| The initial imported code supports the following functionality:
|
| - A kernel driver that provides an interface to the user space; the existing gpioc(4) driver was enhanced with this functionality.
| - Implement support for the most common I/O system calls / multiplexing mechanisms:
|   - read() Places the pin number on which the interrupt occurred in the buffer. Blocking and non-blocking behaviour supported.
|   - poll()/select()
|   - kqueue()
|   - signal driven I/O. Posting SIGIO when the O_ASYNC was set.
| - Many-to-many relationship between pins and file descriptors.
|   - A file descriptor can monitor several GPIO pins.
|   - A GPIO pin can be monitored by multiple file descriptors.
| - Integration with gpioctl and libgpio.
|
| I added some fixes (mostly to locking) and feature enhancements on top of the original gsoc code. The feature enhancements allow the user to choose between detailed and summary event reporting. Detailed reporting provides a record describing each pin change event. Summary reporting provides the time of the first and last change of each pin, and a count of how many times it changed state since the last read(2) call. Another enhancement allows the recording of multiple state change events on multiple pins between each call to read(2) (the original code would track only a single event at a time).
|
| The phabricator review for these changes timed out without approval, but I cite it below anyway, because the review contains a series of diffs that show how I evolved the code from its original state in Christian's github repo for the gsoc project to what is being committed here. (In effect, the phab review extends the VC history back to the original code.)
|
| Submitted by: Christian Kramer
| Obtained from: https://github.com/ckraemer/freebsd/tree/gsoc2018
| Differential Revision: https://reviews.freebsd.org/D27398
| Notes: svn path=/head/; revision=368585
* Bump __FreeBSD_version for removal of crypto fd's in r368005. (John Baldwin, 2020-12-07, 1 file, -1/+1)
| | | | | | | | Requested by: swills Sponsored by: Chelsio Communications Notes: svn path=/head/; revision=368417
* Allow sys/refcount.h to be used by standalone builds. (Hans Petter Selasky, 2020-12-07, 1 file, -1/+1)
| | | | | | | | | | No functional change. MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking Notes: svn path=/head/; revision=368405