Commit message | Author | Age | Files | Lines
* madvise(MADV_FREE): Quick fix to time rewind. | Konstantin Belousov | 2019-09-04 | 1 | -0/+12

  Don't free pages in a shadowing object. While this degrades MADV_FREE to a no-op (and we could, instead, choose to fall back to MADV_DONTNEED, at the cost of changing pmap_madvise), this is presently considered a temporary fix.

  We may prefer to risk a little fragmentation of the map by creating a zero/OBJT_DEFAULT entry over top of the existing object and, simultaneously, revert to the existing marking of any pages in the former shadowing object in the advised region as reclaimable. At least one consumer of MADV_FREE (snmalloc) may use mmap() to construct zeroed pages "eventually" here anyway, so the fragmentation may be coming anyway.

  Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
  PR: 240061
  Reviewed by: markj
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D21517
  Notes: svn path=/head/; revision=351830
* Support doorbell strides != 0. | Warner Losh | 2019-09-04 | 3 | -25/+33

  The NVMe standard (1.4) states

  >>> 8.6 Doorbell Stride for Software Emulation
  >>> The doorbell stride,...is useful in software emulation of an NVM
  >>> Express controller. ... For hardware implementations of the NVM
  >>> Express interface, the expected doorbell stride value is 0h.

  However, hardware in the wild exists with a doorbell stride of 1 (meaning 8 byte separation). This change supports that hardware, as well as software emulators as envisioned in Section 8.6. Since this is the fast path, care has been taken to make this computation efficient. The bit of math to compute an offset for each is replaced by a memory load from cache of a pre-computed value.

  MFC After: 3 days
  Reviewed by: scottl@
  Differential Revision: https://reviews.freebsd.org/D21514
  Notes: svn path=/head/; revision=351828
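The pre-computation described in the doorbell stride commit can be sketched as below. This is only an illustration, not the driver's code: `struct qpair_doorbells` and `qpair_doorbells_init()` are hypothetical names, and the offset formula is the one from the NVMe register definition (doorbells start at 0x1000 and are spaced `4 << CAP.DSTRD` bytes apart).

```c
#include <assert.h>
#include <stdint.h>

/* Doorbell registers start at offset 0x1000 in the controller registers. */
#define DOORBELL_BASE	0x1000u

/* Hypothetical per-queue state: both offsets are computed once at setup. */
struct qpair_doorbells {
	uint32_t sq_tdbl_off;	/* submission queue tail doorbell offset */
	uint32_t cq_hdbl_off;	/* completion queue head doorbell offset */
};

/*
 * dstrd is the CAP.DSTRD field. Doorbells are spaced 4 << dstrd bytes
 * apart: dstrd 0 gives 4 bytes, dstrd 1 gives 8 bytes.
 */
static void
qpair_doorbells_init(struct qpair_doorbells *db, uint32_t qid, uint32_t dstrd)
{
	uint32_t stride = 4u << dstrd;

	assert(dstrd <= 15);	/* DSTRD is a 4-bit field */
	db->sq_tdbl_off = DOORBELL_BASE + (2 * qid) * stride;
	db->cq_hdbl_off = DOORBELL_BASE + (2 * qid + 1) * stride;
}
```

With the offsets stored per queue, the I/O path only loads `sq_tdbl_off` or `cq_hdbl_off` instead of redoing the shift and multiply on every doorbell write.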
* vfs: fully hold vnodes in vnlru_free_locked | Mateusz Guzik | 2019-09-04 | 1 | -6/+1

  Currently the code only bumps holdcnt and clears the VI_FREE flag, not performing actual vhold. Since the vnode is still visible elsewhere, a potential new user can find it and incorrectly assume it is properly held.

  Use vholdl instead to correctly hold the vnode. Another place recycling (vlrureclaim) does this already.

  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21522
  Notes: svn path=/head/; revision=351825
* Report the Host Buffer Memory minimum and preferred sizes. | Warner Losh | 2019-09-04 | 1 | -0/+2

  The Host Buffer feature (NVMe 1.4 section 8.9) allows the NVMe card to request that the host provide it with a buffer for lookaside tables and maybe other things. Report the card's minimum and preferred sizes with nvmecontrol/camcontrol identify.

  Notes: svn path=/head/; revision=351824
* PROGS: Build common sources before recursed PROGS_TARGETS as well when building. | Bryan Drewery | 2019-09-04 | 1 | -7/+4

  MFC after: 2 weeks
  Sponsored by: DellEMC
  Notes: svn path=/head/; revision=351823
* Fix /proc/mounts for autofs(5) mounts. | Edward Tomasz Napierala | 2019-09-04 | 1 | -0/+9

  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351822
* Improve debugging output. | Edward Tomasz Napierala | 2019-09-04 | 1 | -0/+22

  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351821
* - correct HISTORY section | Jason Helfman | 2019-09-04 | 1 | -2/+2

  - while here clarify wording

  PR: 240260 (based on)
  Submitted by: gbergling@gmail.com
  MFC after: 1 week
  Notes: svn path=/head/; revision=351820
* procstat/tests: Fix flakiness by waiting for program to start | Jilles Tjoelker | 2019-09-04 | 2 | -15/+8

  Some of the procstat tests start a program "while1" and examine the process using procstat, but did not wait properly for it to start (kill -0 will succeed immediately after the child process has been created).

  Instead, have "while1" write something when it starts, and use a fifo to wait for that.

  PR: 233587, 233588
  Reviewed by: ngie
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D21519
  Notes: svn path=/head/; revision=351819
* Include dwgpio to the build. | Ruslan Bukin | 2019-09-04 | 3 | -0/+5

  Sponsored by: DARPA, AFRL
  Notes: svn path=/head/; revision=351818
* o Add support for multi-port instances of Synopsys DesignWare APB GPIO | Ruslan Bukin | 2019-09-04 | 4 | -113/+302

  Controller.
  o Rename the driver to dwgpio.

  Sponsored by: DARPA, AFRL
  Notes: svn path=/head/; revision=351817
* Back out r351799 | Kyle Evans | 2019-09-04 | 1 | -2/+0

  empty does not appear to work like I thought it did and it actively breaks real LOCAL_MODULES usage, of which I have none at the moment...

  Notes: svn path=/head/; revision=351816
* pseudofs: make readdir work without a pid again | Kyle Evans | 2019-09-04 | 1 | -10/+15

  Specifically, the following was broken:

  $ mount -t procfs procfs /proc
  $ ls -l /proc

  r351741 reworked readdir slightly to avoid pfs_node/pidhash LOR, but inadvertently regressed pid == NO_PID; new pfs_lookup_proc() fails for the obvious reasons, and later pfs_visible_proc doesn't capture the pid == NO_PID -> return 1 aspect of pfs_visible.

  We can in fact skip this whole block if we're operating on a directory w/ NO_PID, as it's always visible.

  Reported by: trasz
  Reviewed by: mjg
  Differential Revision: https://reviews.freebsd.org/D21518
  Notes: svn path=/head/; revision=351815
* bectl(8): implement sorting for 'bectl list' output | Kyle Evans | 2019-09-04 | 3 | -40/+148

  Allow 'bectl list' to sort output by a given property name. The property name is passed in using a command-line flag, '-c' for ascending order and '-C' for descending order. The properties allowed to sort by are:

  - name (the default output, even if '-c' or '-C' are not used)
  - creation
  - origin
  - used
  - usedds
  - usedsnap
  - usedrefreserv

  The default output for 'bectl list' is now ascending alphabetical order of BE name. To sort by creation time from earliest to latest, the command would be 'bectl list -c creation'

  Submitted by: Rob Fairbanks <rob.fx907 gmail com>
  Reviewed by: ler
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D20818
  Notes: svn path=/head/; revision=351813
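The ascending and descending ordering by property described for 'bectl list' could be approached roughly as follows. This is a simplified sketch rather than bectl's implementation: `struct be_entry`, `be_sort()` and the two comparators are hypothetical, and only the `name` and `creation` properties are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified boot environment record. */
struct be_entry {
	const char *name;
	uint64_t creation;	/* creation time, seconds since the epoch */
};

static int
cmp_name(const void *a, const void *b)
{
	const struct be_entry *x = a, *y = b;

	return (strcmp(x->name, y->name));
}

static int
cmp_creation(const void *a, const void *b)
{
	const struct be_entry *x = a, *y = b;

	/* Compare instead of subtracting, which could truncate. */
	return ((x->creation > y->creation) - (x->creation < y->creation));
}

/* Sort by the named property; returns -1 for an unknown property. */
static int
be_sort(struct be_entry *v, size_t n, const char *prop, int descending)
{
	int (*cmp)(const void *, const void *);

	assert(v != NULL && prop != NULL);
	if (strcmp(prop, "name") == 0)
		cmp = cmp_name;
	else if (strcmp(prop, "creation") == 0)
		cmp = cmp_creation;
	else
		return (-1);

	qsort(v, n, sizeof(*v), cmp);
	if (descending) {
		/* Reverse the ascending result in place. */
		for (size_t i = 0; i < n / 2; i++) {
			struct be_entry tmp = v[i];

			v[i] = v[n - 1 - i];
			v[n - 1 - i] = tmp;
		}
	}
	return (0);
}
```

In this sketch, `be_sort(v, n, "creation", 0)` plays the role of 'bectl list -c creation', and passing a nonzero last argument plays the role of '-C'.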
* mpsutil slot set status | Andriy Gapon | 2019-09-04 | 4 | -1/+139

  This code has been written as a proof of concept, but I think that it can be useful in general. It allows setting the status of an enclosure slot. Practically, this means controlling whatever slot status LEDs the enclosure provides.

  At present, the new command does not have sanity checks or any conveniences. That means that it is possible to issue the command for an invalid slot and an enclosure. But the worst I have seen happening is either the command failing or simply being ignored. Also, at the moment, the status has to be specified as a numeric bit mask. The bit definitions can be found in sys/dev/mps/mpi/mpi2_init.h; they are prefixed with MPI2_SEP_REQ_SLOTSTATUS_. The only way to address a slot is by the enclosure handle and the slot number. Both are readily available from mpsutil show commands.

  So, future enhancements could include alternative ways to address a slot (e.g., by a disk handle or a disk device name) and human friendly names for slot statuses.

  The new command is a useful alternative to the 'sas2ircu locate' command. First, sas2ircu is a proprietary blob. Second, it supports setting only the locate / identify status bit.

  Tested on HP H220 running LSI IT firmware 20.x.

  Reviewed by: bapt
  MFC after: 3 weeks
  Differential Revision: https://reviews.freebsd.org/D20535
  Notes: svn path=/head/; revision=351812
* Adjust history, info source from v1's manuals | Sevan Janiyan | 2019-09-04 | 3 | -6/+6

  https://www.bell-labs.com/usr/dmr/www/1stEdman.html

  MFC after: 5 days
  Notes: svn path=/head/; revision=351811
* shutdown_halt: make sure that watchdog timer is stopped | Andriy Gapon | 2019-09-04 | 1 | -0/+3

  The point of halt is to keep the machine in limbo.

  Reviewed by: kib
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D21222
  Notes: svn path=/head/; revision=351810
* ZFS: Always refuse receiving non-resume stream when resume state exists | Andriy Gapon | 2019-09-04 | 2 | -8/+19

  This fixes a hole in the situation where the resume state is left from receiving a new dataset and, so, the state is set on the dataset itself (as opposed to %recv child).

  Additionally, distinguish incremental and resume streams in error messages.

  This was also committed to ZoL: zfsonlinux/zfs@ebeb6f23bf7e8fe6732a05267ed1cab4c38d3b23

  MFC after: 2 weeks
  Sponsored by: CyberSecure
  Notes: svn path=/head/; revision=351803
* Correct overflow logic in fullpath(). | Xin LI | 2019-09-04 | 1 | -9/+13

  Obtained from: OpenBSD
  MFC after: 3 days
  Notes: svn path=/head/; revision=351802
* Fix the SACK block generation in the base TCP stack by bringing it in | Michael Tuexen | 2019-09-04 | 1 | -11/+20

  sync with the RACK stack.

  Reviewed by: rrs@
  MFC after: 5 days
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21513
  Notes: svn path=/head/; revision=351801
* Fix some nits in pmap_page_array_startup(). | Mark Johnston | 2019-09-03 | 2 | -6/+3

  - Use ptoa() instead of the archaic ctob().
  - Use pagezero() to zero a PDP page.
  - Remove PA_MIN_ADDRESS, orphaned by r351742.
  - Remove unneeded parens and an unnecessary control flow statement.

  Reported by: alc
  Reviewed by: alc, kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21495
  Notes: svn path=/head/; revision=351800
* LOCAL_MODULES: Allow LOCAL_MODULES="" in src.conf to work | Kyle Evans | 2019-09-03 | 1 | -0/+2

  Currently LOCAL_MODULES= works, but LOCAL_MODULES="" causes build errors as .for still has the empty string to loop over. An .if empty prior to the loop was considered, but LOCAL_MODULES has empty quotes at that point and thus, isn't empty.

  A better solution likely exists, but this floats us by for now...

  Notes: svn path=/head/; revision=351799
* Allow more nesting of GEOM partitioning schemes | Kyle Evans | 2019-09-03 | 3 | -7/+34

  GEOM is supposed to be topology-agnostic, but the GPT and BSD partition code has arbitrary restrictions on nesting that are annoying in cases such as running VMs on raw partitions (since the VM's partitioning scheme is not visible to the host).

  This patch adds sysctls to disable the restrictions except in the case of BSD label (and similar) partitions with offset 0 (where we need to avoid recursively recognizing the label).

  Submitted by: Andrew Gierth
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D21350
  Notes: svn path=/head/; revision=351797
* posixshm: start counting writeable mappings | Kyle Evans | 2019-09-03 | 1 | -5/+12

  r351650 switched posixshm to using OBJT_SWAP for shm_object
  r351795 added support to the swap_pager for tracking writeable mappings

  Take advantage of this and start tracking writeable mappings; fd sealing will use this to reject a seal on writing with EBUSY if any such mapping exists.

  Reviewed by: kib, markj
  Differential Revision: https://reviews.freebsd.org/D21456
  Notes: svn path=/head/; revision=351796
* vm pager: writemapping accounting for OBJT_SWAP | Kyle Evans | 2019-09-03 | 9 | -31/+90

  Currently writemapping accounting is only done for vnode_pager which does some accounting on the underlying vnode.

  Extend this to allow accounting to be possible for any of the pager types. New pageops are added to update/release writecount that need to be implemented for any pager wishing to do said accounting, and we implement these methods now for both vnode_pager (unchanged) and swap_pager.

  The primary motivation for this is to allow other systems with OBJT_SWAP objects to check if their objects have any write mappings and reject operations with EBUSY if so. posixshm will be the first to do so in order to reject adding write seals to the shmfd if any writable mappings exist.

  Reviewed by: kib, markj
  Differential Revision: https://reviews.freebsd.org/D21456
  Notes: svn path=/head/; revision=351795
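The idea of counting writable mappings and refusing a write seal with EBUSY can be illustrated with a toy model. Nothing below is the kernel's code: `struct shm_obj`, `SEAL_WRITE` and the three functions are hypothetical stand-ins for the pager's update/release hooks and the sealing check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical seal bit; the real seal flags are not shown in the log. */
#define SEAL_WRITE	0x1u

/* Toy object: a count of writable mappings plus the seals applied so far. */
struct shm_obj {
	unsigned writemappings;
	unsigned seals;
};

/* Called when a writable mapping of the object is created. */
static void
obj_writecount_update(struct shm_obj *o)
{
	o->writemappings++;
}

/* Called when a writable mapping of the object goes away. */
static void
obj_writecount_release(struct shm_obj *o)
{
	assert(o->writemappings > 0);
	o->writemappings--;
}

/* Adding a write seal fails with EBUSY while any writable mapping exists. */
static int
obj_add_seal(struct shm_obj *o, unsigned seals)
{
	if ((seals & SEAL_WRITE) != 0 && o->writemappings > 0)
		return (EBUSY);
	o->seals |= seals;
	return (0);
}
```

The point of the sketch is only the ordering: the count has to be maintained by whoever creates and destroys mappings, so that the sealing code can make its decision from a single field.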
* Unbreak Linux binaries linked against new glibc, such as the ones | Edward Tomasz Napierala | 2019-09-03 | 1 | -0/+6

  from recent Ubuntu versions. Without it they segfault on startup.

  Reviewed by: emaste
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D20687
  Notes: svn path=/head/; revision=351783
* Fix two TCP RACK issues: | Michael Tuexen | 2019-09-03 | 1 | -1/+6

  * Convert the TCP delayed ACK timer from ms to ticks as required. This fixes the timer on platforms with hz != 1000.
  * Don't delay acknowledgements which report duplicate data using DSACKs.

  Reviewed by: rrs@
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21512
  Notes: svn path=/head/; revision=351782
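Why the delayed ACK timer only misbehaved with hz != 1000 is easy to see from the unit conversion. The helper below is a hypothetical sketch, not the RACK stack's code; rounding up is an assumption made here so that a nonzero delay never becomes zero ticks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a duration in milliseconds to scheduler ticks for a clock that
 * runs at hz ticks per second, rounding up.
 */
static uint32_t
ms_to_ticks(uint32_t ms, uint32_t hz)
{
	assert(hz > 0);
	/* Widen to 64 bits so ms * hz cannot overflow. */
	return ((uint32_t)(((uint64_t)ms * hz + 999) / 1000));
}
```

With hz = 1000 one tick is one millisecond, so passing a millisecond value where ticks are expected happens to work. With hz = 100, a 40 ms delay is 4 ticks, and treating the value 40 as ticks would give a 400 ms timer instead.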
* - Retire pc-sysinstall(8) | Kris Moore | 2019-09-03 | 94 | -11211/+81

  https://reviews.freebsd.org/D21094

  Submitted by: kmoore@FreeBSD.org
  Approved by: imp@FreeBSD.org
  Notes: svn path=/head/; revision=351781
* Add stackgap control mode to proccontrol(1). | Konstantin Belousov | 2019-09-03 | 1 | -2/+34

  PR: 239894
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D21352
  Notes: svn path=/head/; revision=351774
* Add procctl(PROC_STACKGAP_CTL) | Konstantin Belousov | 2019-09-03 | 8 | -4/+144

  It allows a process to request that the stack gap not be applied to its stacks, retroactively. It is also possible to control the gaps in the process after exec.

  PR: 239894
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D21352
  Notes: svn path=/head/; revision=351773
* Use makefs -t msdos in make_esp_file | Matt Macy | 2019-09-03 | 2 | -12/+12

  With this last piece in place, make -C /usr/src/release release.iso is finally able to run in a jail. This was not possible before because msdosfs cannot be mounted inside a jail.

  Submitted by: ryan@ixsystems.com
  Reviewed by: emaste@, imp@, gjb@
  MFC after: 1 week
  Sponsored by: iXsystems, Inc.
  Differential Revision: https://reviews.freebsd.org/D21385
  Notes: svn path=/head/; revision=351771
* Add conv=fsync flag to dd | Matt Macy | 2019-09-03 | 4 | -0/+13

  The fsync flag performs an fsync(2) on the output file before closing it. This will be useful for the ZFS test suite.

  Submitted by: ryan@ixsystems.com
  Reviewed by: jilles@, imp@
  MFC after: 1 week
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=351770
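The behaviour that conv=fsync describes, an fsync(2) on the output file before it is closed, looks roughly like this in a standalone program. This is not dd's source; `write_file_synced()` is a hypothetical helper that uses only POSIX calls.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path, then flush the file to stable storage
 * with fsync(2) before closing it. Returns 0 on success, -1 on error.
 */
static int
write_file_synced(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(path != NULL && buf != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			(void)close(fd);
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}

	/* The step conv=fsync adds: sync before close. */
	if (fsync(fd) != 0) {
		(void)close(fd);
		return (-1);
	}
	return (close(fd));
}
```

Without the fsync call, a successful close only means the data reached the kernel's buffers, which is why a test suite that inspects on-disk state would want the flag.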
* bsdgrep(1): add some basic tests for some GNU Extension support | Kyle Evans | 2019-09-03 | 1 | -0/+26

  These will be expanded later as I come up with good test cases; for now, these seem to be enough to trigger bugs in base gnugrep and expose missing features in bsdgrep.

  Notes: svn path=/head/; revision=351769
* Make linprocfs(4) report Tgid, Linux ltrace(1) needs it. | Edward Tomasz Napierala | 2019-09-03 | 1 | -0/+1

  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351758
* vfs: implement usecount implying holdcnt | Mateusz Guzik | 2019-09-03 | 8 | -110/+164

  vnodes have 2 reference counts - holdcnt to keep the vnode itself from getting freed and usecount to denote it is actively used.

  Previously all operations bumping usecount would also bump holdcnt, which is not necessary. We can detect if usecount is already > 1 (in which case holdcnt is also > 1) and utilize it to avoid bumping holdcnt on our own. This saves on atomic ops.

  Reviewed by: kib
  Tested by: pho (previous version)
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21471
  Notes: svn path=/head/; revision=351748
* Implement nvme suspend / resume for pci attachment | Warner Losh | 2019-09-03 | 4 | -18/+136

  When we suspend, we need to properly shut down the NVME controller. The controller may go into D3 state (or may have the power removed), and to properly flush the metadata to non-volatile RAM, we must complete a normal shutdown. This consists of deleting the I/O queues and setting the shutdown bit. We have to do some extra stuff to make sure we reset the software state of the queues as well.

  On resume, we have to reset the card twice, for reasons described in the attach function. Once we've done that, we can restart the card. If any of this fails, we'll fail the NVMe card, just like we do when a reset fails.

  Set is_resetting for the duration of the suspend / resume. This keeps the reset taskqueue from running a concurrent reset, and also is needed to prevent any hw completions from queueing more I/O to the card. Pass resetting flag to nvme_ctrlr_start. It doesn't need to get that from the global state of the ctrlr. Wait for any pending reset to finish. All queued I/O will get sent to the hardware as part of nvme_ctrlr_start(), though the upper layers shouldn't send any down. Disabling the qpairs is the other failsafe to ensure all I/O is queued.

  Rename nvme_ctrlr_destory_qpairs to nvme_ctrlr_delete_qpairs to avoid confusion with all the other destroy functions. It just removes the queues in hardware, while the other _destroy_ functions tear down driver data structures.

  Split parts of the hardware reset function up so that I can do part of the reset in suspend. Split out the software disabling of the qpairs into nvme_ctrlr_disable_qpairs.

  Finally, fix a couple of spelling errors in comments related to this.

  Relnotes: Yes
  MFC After: 1 week
  Reviewed by: scottl@ (prior version)
  Differential Revision: https://reviews.freebsd.org/D21493
  Notes: svn path=/head/; revision=351747
* Revert a portion of r351628 that I did not mean to commit. | Mark Johnston | 2019-09-03 | 1 | -3/+1

  Reported by: mjg
  MFC with: r351628
  Notes: svn path=/head/; revision=351744
* Add preliminary support for atomic updates of per-page queue state. | Mark Johnston | 2019-09-03 | 3 | -46/+154

  Queue operations on a page use the page lock when updating the page to reflect the desired queue state, and the page queue lock when physically enqueuing or dequeuing a page. Multiple pages share a given page lock, but queue state is per-page; this false sharing results in heavy lock contention.

  Take a small step towards the use of atomic_cmpset to synchronize updates to per-page queue state by introducing vm_page_pqstate_cmpset() and using it in the page daemon. In the longer term the plan is to stop using the page lock to protect page identity and rely only on the object and page busy locks. However, since the page daemon avoids acquiring the object lock except when necessary, some synchronization with a concurrent free of the page is required. vm_page_pqstate_cmpset() can be used to ensure that queue state updates are successful only if the page is not scheduled for a dequeue, which is sufficient for the page daemon.

  Add vm_page_swapqueue(), which moves a page from one queue to another using vm_page_pqstate_cmpset(). Use it in the active queue scan, which does not use the object lock. Modify vm_page_dequeue_deferred() to use vm_page_pqstate_cmpset() as well.

  Reviewed by: kib
  Discussed with: jeff
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D21257
  Notes: svn path=/head/; revision=351743
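The "update succeeds only if the page is not scheduled for a dequeue" rule can be sketched with a C11 compare-and-swap loop. This is an illustration under invented assumptions, not vm_page_pqstate_cmpset() itself: the state encoding (queue index in the low 8 bits, a hypothetical `PQF_DEQUEUE` flag above it) and the function name `pqstate_swapqueue()` are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented encoding: queue index in the low 8 bits, flags above them. */
#define PQ_MASK		0xffu
#define PQF_DEQUEUE	0x100u	/* hypothetical "dequeue pending" flag */

/*
 * Move a page's queue state from oldq to newq. The update fails if a
 * dequeue is pending or the page is no longer on oldq; it is retried
 * only when another thread changed the word in between.
 */
static bool
pqstate_swapqueue(_Atomic uint32_t *state, uint32_t oldq, uint32_t newq)
{
	uint32_t old = atomic_load(state);

	assert(oldq <= PQ_MASK && newq <= PQ_MASK);
	for (;;) {
		if ((old & PQF_DEQUEUE) != 0 || (old & PQ_MASK) != oldq)
			return (false);
		uint32_t next = (old & ~PQ_MASK) | newq;

		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(state, &old, next))
			return (true);
	}
}
```

Because the check and the update are one atomic step on a per-page word, two pages that would have shared a page lock no longer contend with each other.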
* Map the vm_page array into KVA on amd64. | Mark Johnston | 2019-09-03 | 4 | -40/+27

  r351198 allows the kernel to use domain-local memory to back the vm_page array (up to 2MB boundaries) and reserves a separate PML4 entry for that purpose. One consequence of that change is that the vm_page array is no longer present in minidumps, which only adds pages mapped above VM_MIN_KERNEL_ADDRESS.

  To avoid the friction caused by having kernel data structures mapped below VM_MIN_KERNEL_ADDRESS, map the vm_page array starting at VM_MIN_KERNEL_ADDRESS instead of using a dedicated PML4 entry.

  Reviewed by: kib
  Discussed with: jeff
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21491
  Notes: svn path=/head/; revision=351742
* pseudofs: fix a LOR pfs_node vs pidhash (sleepable after non-sleepable) | Mateusz Guzik | 2019-09-03 | 1 | -3/+31

  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351741
* superio: fix the copyright block and update the year | Andriy Gapon | 2019-09-03 | 2 | -7/+12

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=351740
* Temporarily skip sys.sys.qmath_test.qdivq_s64q in CI because it is unstable | Li-Wen Hsu | 2019-09-03 | 1 | -0/+4

  PR: 240219
  Discussed with: trasz
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351739
* Add sysctlbyname system call | Mateusz Guzik | 2019-09-03 | 20 | -17/+264

  Previously userspace would issue one syscall to resolve the sysctl and then another one to actually use it. Do it all in one trip.

  Fallback is provided in case newer libc happens to be running on an older kernel.

  Submitted by: Pawel Biernacki
  Reported by: kib, brooks
  Differential Revision: https://reviews.freebsd.org/D17282
  Notes: svn path=/head/; revision=351729
* Add a sysctl to dump kernel mappings and their properties on amd64. | Mark Johnston | 2019-09-02 | 1 | -0/+298

  The sysctl is called vm.pmap.kernel_maps. It dumps address ranges and their corresponding protection and mapping mode, as well as counts of 2MB and 1GB pages in the range.

  Reviewed by: kib
  MFC after: 2 weeks
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D21380
  Notes: svn path=/head/; revision=351728
* Replace PMAP_LARGEMAP_MAX_ADDRESS() with a more general predicate. | Mark Johnston | 2019-09-02 | 1 | -7/+6

  No functional change intended.

  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=351727
* This patch improves the DSACK handling to conform with RFC 2883. | Michael Tuexen | 2019-09-02 | 5 | -15/+154

  The lowest SACK block is used when multiple blocks would be eligible as DSACK blocks. ACK blocks get reordered, while the ordering of SACK blocks not relevant in the DSACK context is maintained.

  Reviewed by: rrs@, tuexen@
  Obtained from: Richard Scheffenegger
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D21038
  Notes: svn path=/head/; revision=351725
* Fix the name of the devicetree bindings document file cited in the manpage. | Ian Lepore | 2019-09-02 | 1 | -3/+3

  Reported by: thj@
  Notes: svn path=/head/; revision=351724
* Bump Linux version to 3.2.0. Without it, binaries linked against | Edward Tomasz Napierala | 2019-09-02 | 1 | -3/+3

  glibc 2.24 and up (e.g. Ubuntu 19.04) fail with "FATAL: kernel too old".

  This alone is not enough to make newer binaries actually work; fix/hack/workaround is pending review at https://reviews.freebsd.org/D20687.

  Reviewed by: emaste
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D20757
  Notes: svn path=/head/; revision=351723
* In nvme_completion_poll, add a sanity check to make sure that we complete the | Warner Losh | 2019-09-02 | 1 | -1/+13

  polling within a second. Panic if we don't. All the commands that use this interface should typically complete within a few tens to hundreds of microseconds. Panic rather than return ETIMEDOUT because if the command somehow does later complete, it will randomly corrupt memory. Also, it helps to get a traceback from where the unexpected failure happens, rather than an infinite loop.

  Notes: svn path=/head/; revision=351706
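A bounded polling loop of the kind this commit describes might look like the following userland sketch. It is not the driver's nvme_completion_poll(): `struct poll_status` and `completion_poll()` are hypothetical, `abort()` stands in for a kernel panic, and the clock source is an assumption.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical status word, set by the completion path when a command finishes. */
struct poll_status {
	atomic_bool done;
};

static double
now_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((double)ts.tv_sec + (double)ts.tv_nsec / 1e9);
}

/*
 * Spin until the command completes. If it takes longer than one second,
 * stop immediately instead of returning an error: a late completion could
 * otherwise write into memory the caller has already reused.
 */
static void
completion_poll(struct poll_status *st)
{
	double deadline;

	assert(st != NULL);
	deadline = now_seconds() + 1.0;
	while (!atomic_load(&st->done)) {
		if (now_seconds() > deadline) {
			fprintf(stderr, "polled command did not complete within 1s\n");
			abort();	/* stand-in for panic() */
		}
	}
}
```

Aborting at the point of the timeout also leaves a backtrace that points at the stuck caller, which an endless loop would not.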
* In all the places that we use the polled for completion interface, except crash | Warner Losh | 2019-09-02 | 3 | -16/+16

  dump support code, move the while loop into an inline function. These aren't done in the fast path, so if the compiler chooses to not inline, any performance hit is tiny.

  Notes: svn path=/head/; revision=351705