path: root/lib/libc/sys
Commit message | Author | Age | Files | Lines
* ptrace(2): document policies affecting access to the facilityKonstantin Belousov93 min.1-1/+50
| | | | | | | Reviewed by: emaste Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33986
* kqueue(2): Add note about format of the data for NOTE_EXITKonstantin Belousov3 days1-2/+4
| | | | | | | Noted by: Dave Baukus <daveb@spectralogic.com> PR: 261346 MFC after: 3 days Sponsored by: The FreeBSD Foundation
* Clarify the description of the EINTEGRITY error in intro(2).Kirk McKusick2021-12-291-1/+1
| | | | | | Requested by: pauamma_gundo.com Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D18765
* getfh: clarify that it is a privileged operationEd Maste2021-12-231-1/+4
| | | | | | | Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33629
* Add idle priority scheduling privilege group to MAC/priorityFlorian Walpen2021-12-101-9/+9
| | | | | | | | | | | | | | | Add an idletime user group that allows non-root users to run processes with idle scheduling priority. Privileges are granted by a MAC policy in the mac_priority module. For this purpose, the kernel privilege PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change). Deprecate the system wide sysctl(8) knob security.bsd.unprivileged_idprio which lets any user run idle priority processes, regardless of context. While the knob is still working, it is marked as deprecated in the description and in the man pages. MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D33338
* Document new variant of swapoff(2)Konstantin Belousov2021-12-091-23/+3
| | | | | | | Reviewed by: brooks Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33343
* swapoff: add one more variant of the syscallKonstantin Belousov2021-12-091-1/+1
| | | | | | | Requested and reviewed by: brooks Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33343
* libc: get rid of NO_P1003_1B make variableBrooks Davis2021-12-071-8/+4
| | | | | | | | | | | There's no point in a knob to avoid installing a half dozen manpages. It's undocumented and unused in the tree. Online, the only mentions I've found are the FreeBSD source tree, a commit in DragonFly BSD removing it, and some lists of build options for small systems where it's inevitably redundant due to an accompanying NO_MAN. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D33310
* libc: Add pdfork to the list of interposed system callsMark Johnston2021-12-061-0/+1
| | | | | | | | | Otherwise the asm stub is used and libthr interposition does not work. Reviewed by: kib Fixes: 21f749da82e7 ("libthr: wrap pdfork(2), same as fork(2).") MFC after: 1 week Sponsored by: The FreeBSD Foundation
* fcntl(2): be more precise about third arg typeKonstantin Belousov2021-12-061-2/+10
| | | | | | | | | Also use the term operation consistently, over the command. Reviewed by: emaste, jhb, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33277
* fcntl(2): add F_KINFO operationKonstantin Belousov2021-12-061-2/+15
| | | | | | | | | | | that returns struct kinfo_file for the given file descriptor. Among other data, it also returns kf_path, if file op was able to restore file path. Reviewed by: jhb, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33277
* swapoff(2): document extended syscall argumentsKonstantin Belousov2021-12-041-1/+37
| | | | | | | | Reviewed by: markj Discussed with: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33165
* MAC/priority module for realtime privilege groupFlorian Walpen2021-12-041-2/+7
| | | | | | | | | | | | This is a MAC policy module that grants scheduling privileges based on group membership. Users or processes in the group realtime (gid 47) are allowed to run threads and processes with realtime scheduling priority. For timing-sensitive, low-latency software like audio/jack, running with realtime priority helps to avoid stutter and gaps. PR: 239125 MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D33191
* Add sched_getcpu()Konstantin Belousov2021-11-101-0/+3
| | | | | | | | | for compatibility with Linux. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32901
* fexecve(2): allow O_PATH file descriptors opened without O_EXECKonstantin Belousov2021-11-031-3/+0
| | | | | | | | | | This improves compatibility with Linux. Noted by: Drew DeVault <sir@cmpwn.com> Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32821
* bpf: Fix the write filter for detached descriptorsMark Johnston2021-10-261-2/+2
| | | | | | | | | | | | A BPF descriptor only has an associated interface descriptor once it is attached to an interface, e.g., with BIOCSETIF. Avoid dereferencing a NULL pointer in filt_bpfwrite() if the BPF descriptor is not attached. Reviewed by: ae Reported by: syzbot+ae45d5166afe15a5a21d@syzkaller.appspotmail.com Fixes: ded77e0237a8 ("Allow the BPF to be select for write.") Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32561
* procctl: actually require debug privileges over targetKonstantin Belousov2021-10-191-0/+8
| | | | | | | | | | | for state control over TRACE, TRAPCAP, ASLR, PROTMAX, STACKGAP, NO_NEWPRIVS, and WXMAP. Reported by: emaste Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32513
* procctl(2): add consistent shortcut P_ID:0 as curprocKonstantin Belousov2021-10-191-0/+2
| | | | | | | | Reported by: bdrewery, emaste Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32513
* Allow the BPF to be select for write. This is needed for boost:asioHartmut Brandt2021-10-101-2/+7
| | | | | | which otherwise fails to handle BPFs. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D31967
* O_PATH: allow vfs_extattr syscallsGreg V2021-10-111-1/+6
| | | | | | | | | These calls do operate on vnodes only, not file contents. This is useful for e.g. the xdg-document-portal fuse filesystem. Reviewed by: kib, markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D32438
* nanosleep.2: use appropriate macrosPiotr Pawel Stefaniak2021-10-111-1/+4
| | | | | Reported by: kib Fixes: bf8f6ffcb66a
* readlinkat(2): allow O_PATH fdKonstantin Belousov2021-10-091-2/+3
| | | | | | | | | PR: 258856 Reported by: ashish Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32390
* Mention kern.timecounter.alloweddeviation in nanosleep.1Piotr Pawel Stefaniak2021-10-081-1/+3
| | | | | PR: 224837 Reported by: Aleksander Derevianko
* Fix mistakes in link(2) and shm_open(2)Konstantin Belousov2021-10-062-3/+3
| | | | | | PR: 258957 Submitted by: sigsys@gmail.com MFC after: 1 week
* kqueue: clean up some igor and mandoc -Tlint warningsKyle Evans2021-10-011-4/+5
|
* kqueue: document how timers with low/past timeouts are handledKyle Evans2021-10-011-1/+7
| | | | | Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D32237
* kqueue: Add EV_KEEPUDATA flagNathaniel Wesley Filardo2021-09-241-1/+16
| | | | | | | | | | | When this flag is set, operations that update an existing kevent will not change the udata field. This can be used to NOTE_TRIGGER or EV_{EN,DIS}ABLE events without overwriting the stashed pointer. Reviewed by: Domagoj Stolfa <domagoj.stolfa@gmail.com> Obtained from: CheriBSD Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D30286
* procctl(2): Add PROC_WXMAP_CTL/STATUSKonstantin Belousov2021-09-171-1/+63
| | | | | | | | | | | It allows overriding kern.elf{32,64}.allow_wx on a per-process basis. In particular, it makes it possible to run binaries without PT_GNU_STACK and without an elfctl note while allow_wx = 0. Reviewed by: brooks, emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31779
* libc: Fix build on case-insensitive file systemsJessica Clarke2021-09-102-1/+1
| | | | | | | | | | | | | | | | | | | | | | On case-insensitive file systems (most likely to be seen on macOS, where it is the default), _Fork.o for the new POSIX _Fork function conflicts with _fork.o for the PSEUDO file. This results in non-deterministic behaviour in terms of which ends up being present; if _Fork.o wins then the build fails to link libc.so due to missing __sys_fork, and if _fork.o wins then libc silently fails to include the implementation of _Fork. A similar issue occurred in the past for C99's _Exit conflicting with exit(2) and was fixed in cb1cb6a2a83f, so this adds a fix based on that. As a longer-term solution it might be better to instead make the generated files use a different prefix that's less likely to conflict with other things (such as __sys_foo.o given they always contain that) but that's a rather more invasive change. Fixes: 49ad342cc10c ("Add _Fork()") Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D31895
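The collision described in this commit can be illustrated with a small sketch. It is not part of the FreeBSD build system: `find_case_collisions` is a hypothetical helper, and case folding is only an approximation of how a case-insensitive file system compares names.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group file names that a case-insensitive file system would conflate.

    Hypothetical helper for illustration only; str.casefold() stands in
    for the file system's own name comparison.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only groups in which two or more distinct spellings collide.
    return [sorted(set(group)) for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Object names taken from the commit message, plus two unrelated ones.
    objects = ["_Fork.o", "_fork.o", "_Exit.o", "mmap.o"]
    print(find_case_collisions(objects))
```

A check of this kind over the list of generated object names would flag the `_Fork.o` / `_fork.o` pair before the build silently picks one of them.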
* Export _mmap and __sys_mmap from libc.soAlex Richardson2021-09-091-0/+2
| | | | | | | | | | Unlike the other syscalls these two symbols were missing from the version script. I noticed this while looking into the compiler-rt runtime libraries for CHERI. Reviewed by: brooks Obtained from: https://github.com/CTSRD-CHERI/cheribsd/pull/1063 MFC after: 3 days
* mprotect.2: Improve the description of protBrooks Davis2021-09-071-8/+15
| | | | | | | | | | | The new wording for standard flags is loosely based on the POSIX description. Make it clearer that PROT_MAX() is a local extension. Reviewed by: alc, mckusick, imp, kib, markj Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D31777
* kqueue.2: Document the fact that EVFILT_READ can be used on kqueuesMark Johnston2021-09-071-1/+5
| | | | | | | Reviewed by: bcr, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31864
* mprotect.2: Remove legacy BSD textBrooks Davis2021-09-031-6/+1
| | | | | | | | | | | | | | | | This text dates to the BSD 4.4 import and is misleading. The mprotect syscall acts on page granularity and breaks up mappings as required to do so. Note that with the addition of non-transparent superpages (aka largepages) the size of a page at a given address may vary. This commit does not attempt to address the lack of documentation of this feature. Sponsored by: DARPA Reviewed by: alc, mckusick, imp, kib, markj Differential Revision: https://reviews.freebsd.org/D31776
* Symbol.map: Remove an extra space before _ForkKa Ho Ng2021-09-021-1/+1
| | | | | | Make it consistent with all other entries. Sponsored by: The FreeBSD Foundation
* fspacectl(2): Changes on rmsr.r_offset's minimum value returnedKa Ho Ng2021-08-251-5/+4
| | | | | | | | | | | | | rmsr.r_offset now is set to rqsr.r_offset plus the number of bytes zeroed before hitting the end-of-file. After this change rmsr.r_offset no longer contains the EOF when the requested operation range is completely beyond the end-of-file. Instead in such case rmsr.r_offset is equal to rqsr.r_offset. Callers can obtain the number of bytes zeroed by subtracting rqsr.r_offset from rmsr.r_offset. Sponsored by: The FreeBSD Foundation Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D31677
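The returned-offset rule in this commit can be written down as a toy model. This is a sketch of the semantics as the commit message states them, not kernel code; the function names are hypothetical, and all inputs are assumed to be non-negative with the range already validated.

```python
def fspacectl_result_offset(file_size, rq_offset, rq_len):
    """Model rmsr.r_offset after a successful SPACECTL_DEALLOC request.

    Toy model only: rq_offset and rq_len stand for rqsr.r_offset and
    rqsr.r_len, and are assumed to be valid, non-negative values.
    """
    if rq_offset >= file_size:
        # The whole range lies beyond end-of-file: nothing is zeroed,
        # so the returned offset equals the requested offset.
        return rq_offset
    # Otherwise zeroing stops at the end of the range or at end-of-file.
    return min(rq_offset + rq_len, file_size)


def bytes_zeroed(file_size, rq_offset, rq_len):
    """Number of bytes zeroed, computed the way the commit suggests."""
    return fspacectl_result_offset(file_size, rq_offset, rq_len) - rq_offset


if __name__ == "__main__":
    print(bytes_zeroed(100, 40, 20))   # range entirely inside the file
    print(bytes_zeroed(100, 80, 50))   # range crosses end-of-file
    print(bytes_zeroed(100, 150, 10))  # range entirely beyond end-of-file
```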
* fspacectl(2): Clarifies the return valuesKa Ho Ng2021-08-241-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | rmacklem@ spotted two things in the system call: - Upon returning from a successful operation, vop_stddeallocate can update rmsr.r_offset to a value greater than file size. This behavior, although being harmless, can be confusing. - The EINVAL return value for rqsr.r_offset + rqsr.r_len > OFF_MAX is undocumented. This commit has the following changes: - vop_stddeallocate and shm_deallocate to bound the affected area further by the file size. - The EINVAL case for rqsr.r_offset + rqsr.r_len > OFF_MAX is documented. - The fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9)'s return len is explicitly documented to be the value 0, and the return offset is restricted to be the smallest of off + len and current file size suggested by kib@. This semantic allows callers to interact better with potential file size growth after the call. Sponsored by: The FreeBSD Foundation Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D31604
* Fix aio_readv(2), aio_writev(2) with SIGEV_THREAD.Thomas Munro2021-08-221-0/+2
| | | | | | | | | | Add missing wrapper code to librt for these new functions so that SIGEV_THREAD works. Without machinery to convert it to SIGEV_THREAD_ID, you got EINVAL. Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31618
* lio_listio(2): Allow LIO_READV and LIO_WRITEV.Thomas Munro2021-08-221-1/+15
| | | | | | | | | | | | | Allow multiple vector IOs to be started with one system call. aio_readv() and aio_writev() already used these opcodes under the covers. This commit makes them available to user space. Being non-standard extensions, they're only visible if __BSD_VISIBLE is defined, like the functions. Reviewed by: asomers, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31627
* fork(2): comment about doubtful use of stdio and exit(3) in exampleKonstantin Belousov2021-08-081-1/+18
| | | | | | | | | | Add fflush(stdout) as the common idiom. Explain the need to use exit() but advise against it. Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D31425
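Why flushing before fork matters can be shown with a short sketch. It is written in Python rather than C and is only an analogy for stdio: a buffered file object over a pipe plays the role of stdout, and the child's explicit flush stands in for what exit(3) does to stdio buffers. `fork_with_buffered_output` is a hypothetical name.

```python
import os


def fork_with_buffered_output(flush_first):
    """Return the bytes that reach a pipe when a buffer is shared across fork."""
    read_fd, write_fd = os.pipe()
    out = os.fdopen(write_fd, "wb", buffering=4096)
    out.write(b"hello\n")  # stays in the user-space buffer for now
    if flush_first:
        out.flush()  # the idiom: empty the buffer before forking

    pid = os.fork()
    if pid == 0:
        # Child: flush its inherited copy of the buffer, as exit(3) would
        # for stdio, then leave without running any parent cleanup code.
        try:
            out.flush()
        finally:
            os._exit(0)

    os.waitpid(pid, 0)
    out.close()  # the parent flushes its own copy and closes the write end

    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    return b"".join(chunks)


if __name__ == "__main__":
    print(fork_with_buffered_output(flush_first=False))  # line appears twice
    print(fork_with_buffered_output(flush_first=True))   # line appears once
```

Without the flush, both processes hold a copy of the pending data and each writes it out; with the flush, the buffer is already empty when the child is created.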
* Fix pathconf.2 documentation errorKa Ho Ng2021-08-061-4/+5
| | | | | | | | | _PC_MIN_HOLE_SIZE and _PC_DEALLOC_PRESENT were mixed somehow before this fix. Sponsored by: The FreeBSD Foundation Reviewed by: delphij Differential Revision: https://reviews.freebsd.org/D31436
* fork.2: correct minor typo in manpage.Ceri Davies2021-08-051-1/+1
|
* Add fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9).Ka Ho Ng2021-08-054-0/+194
| | | | | | | | | | | | | | | | | | | | | | fspacectl(2) is a system call to provide space management support to userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the deallocation. vn_deallocate(9) is a public KPI for kmods' use. The purpose of proposing a new system call, a KPI and a VOP call is to allow bhyve or other hypervisor monitors to emulate the behavior of SCSI UNMAP/NVMe DEALLOCATE on a plain file. fspacectl(2) takes cmd and flags parameters that specify the space management operation to be performed. Currently cmd has to be SPACECTL_DEALLOC, and flags has to be 0. fo_fspacectl is added to fileops. VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation of VOP_DEALLOCATE(9) is provided. Sponsored by: The FreeBSD Foundation Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D28347
* Add _Fork()Konstantin Belousov2021-08-034-4/+124
| | | | | | | | | | | | | | | | | | | | | Current POSIX standard requires fork() to be async-signal safe. Neither our implementation, nor implementations in other operating systems are, and practically it is impossible to make fork() async-signal safe without too much effort. Also, that would put an undue requirement that all atfork handlers should be async-signal safe as well, which contradicts its main use. As a result, Austin Group dropped the requirement, and added a new function _Fork() that should be async-signal safe, but it does not call atfork handlers. Basically, _Fork() can be implemented as a raw syscall. Release of glibc 2.34 added _Fork(), do the same for FreeBSD. Clarify threading behavior for fork() in the manpage. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D31378
* clock_gettime: Add Linux aliases for CLOCK_*Warner Losh2021-07-301-1/+14
| | | | | | | | | | | | | Linux standardized what we call CLOCK_{REALTIME,MONOTONIC}_FAST as CLOCK_{REALTIME,MONOTONIC}_COARSE. In addition, Linux spells CLOCK_UPTIME as CLOCK_BOOTTIME. Add aliases to time.h and document these new aliases in clock_gettime(2). Reviewed by: vangyzen, kib (prior), dchagin (prior) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30988
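The alias pairs named in this commit can be captured in a small lookup table. The sketch below is illustrative only: the mapping is taken from the commit message, while `resolve_clock_name` and the `available` parameter are hypothetical and simply model "use the Linux spelling if the platform defines it, otherwise the FreeBSD one".

```python
import time

# Linux spelling -> FreeBSD spelling, as listed in the commit message.
LINUX_TO_FREEBSD_CLOCK = {
    "CLOCK_REALTIME_COARSE": "CLOCK_REALTIME_FAST",
    "CLOCK_MONOTONIC_COARSE": "CLOCK_MONOTONIC_FAST",
    "CLOCK_BOOTTIME": "CLOCK_UPTIME",
}


def resolve_clock_name(name, available):
    """Pick a clock name that is defined in `available` (a set of names).

    Hypothetical helper: tries the given name first, then its FreeBSD
    counterpart from the table above.
    """
    if name in available:
        return name
    alias = LINUX_TO_FREEBSD_CLOCK.get(name)
    if alias in available:
        return alias
    raise LookupError(f"no known equivalent for {name}")


if __name__ == "__main__":
    # Clock names that this Python build exposes on the current platform.
    defined = {n for n in dir(time) if n.startswith("CLOCK_")}
    try:
        print(resolve_clock_name("CLOCK_BOOTTIME", defined))
    except LookupError as exc:
        print(exc)
```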
* socket: Implement SO_RERRORRoy Marples2021-07-281-1/+9
| | | | | | | | | | | | | | | | | | SO_RERROR indicates that receive buffer overflows should be handled as errors. Historically receive buffer overflows have been ignored and programs could not tell if they missed messages or messages had been truncated because of overflows. Since programs historically do not expect to get receive overflow errors, this behavior is not the default. This is really really important for programs that use route(4) to keep in sync with the system. If we lose a message then we need to reload the full system state, otherwise the behaviour from that point is undefined and can lead to chasing bogus bug reports. Reviewed by: philip (network), kbowling (transport), gbe (manpages) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26652
* kenv: allow listing of static kernel environmentsKyle Evans2021-07-191-3/+21
| | | | | | | | | | The early environment is typically cleared, so these new options need the PRESERVE_EARLY_KENV kernel config(8) option. These environments are reported as missing by kenv(1) if the option is not present in the running kernel. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D30835
* Pass the syscall number to capsicum permission-denied signalsDavid Chisnall2021-07-161-0/+10
| | | | | | | | | | | | | | | | | | The syscall number is stored in the same register as the syscall return on amd64 (and possibly other architectures) and so it is impossible to recover in the signal handler after the call has returned. This small tweak delivers it in the `si_value` field of the signal, which is sufficient to catch capability violations and emulate them with a call to a more-privileged process in the signal handler. This reapplies 3a522ba1bc852c3d4660a4fa32e4a94999d09a47 with a fix for the static assertion failure on i386. Approved by: markj (mentor) Reviewed by: kib, bcr (manpages) Differential Revision: https://reviews.freebsd.org/D29185
* Revert "Pass the syscall number to capsicum permission-denied signals"David Chisnall2021-07-101-10/+0
| | | | | | This broke the i386 build. This reverts commit 3a522ba1bc852c3d4660a4fa32e4a94999d09a47.
* Pass the syscall number to capsicum permission-denied signalsDavid Chisnall2021-07-101-0/+10
| | | | | | | | | | | | | | | The syscall number is stored in the same register as the syscall return on amd64 (and possibly other architectures) and so it is impossible to recover in the signal handler after the call has returned. This small tweak delivers it in the `si_value` field of the signal, which is sufficient to catch capability violations and emulate them with a call to a more-privileged process in the signal handler. Approved by: markj (mentor) Reviewed by: kib, bcr (manpages) Differential Revision: https://reviews.freebsd.org/D29185
* procctl(2): add PROC_NO_NEW_PRIVS_CTL, PROC_NO_NEW_PRIVS_STATUSEdward Tomasz Napierala2021-07-011-1/+26
| | | | | | | | | | | | | | | This introduces a new, per-process flag, "NO_NEW_PRIVS", which is inherited, preserved on exec, and cannot be cleared. The flag, when set, makes subsequent execs ignore any SUID and SGID bits, instead executing those binaries as if they were not set. The main purpose of the flag is implementation of Linux PR_SET_NO_NEW_PRIVS prctl(2), and possibly also unprivileged chroot. Reviewed By: kib Sponsored By: EPSRC Differential Revision: https://reviews.freebsd.org/D30939
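The effect of the flag on exec can be sketched as a toy model. This is not how the kernel computes credentials; `credentials_after_exec` and its parameters are hypothetical and only model the rule stated in the commit message, namely that SUID and SGID bits are ignored once the flag is set.

```python
import stat


def credentials_after_exec(euid, egid, file_uid, file_gid, file_mode, no_new_privs):
    """Toy model of the effective uid/gid after exec of a file.

    Illustration only: real credential handling involves much more
    than the two mode bits considered here.
    """
    if no_new_privs:
        # With NO_NEW_PRIVS set, the SUID and SGID bits are ignored.
        return euid, egid
    new_euid = file_uid if file_mode & stat.S_ISUID else euid
    new_egid = file_gid if file_mode & stat.S_ISGID else egid
    return new_euid, new_egid


if __name__ == "__main__":
    # A root-owned setuid binary (mode 4755) run by uid/gid 1001.
    print(credentials_after_exec(1001, 1001, 0, 0, 0o4755, no_new_privs=False))
    print(credentials_after_exec(1001, 1001, 0, 0, 0o4755, no_new_privs=True))
```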