path: root/sys/kern/vfs_subr.c
Commit message | Author | Age | Files | Lines
* Revert r334708 | Justin Hibbits | 2018-06-06 | 1 | -3/+0
  This is the wrong place to put the barrier.
  Requested by: kib,mjg
  Notes: svn path=/head/; revision=334716
* Add a memory barrier after taking a reference on the vnode holdcnt in _vhold | Justin Hibbits | 2018-06-06 | 1 | -0/+3
  This is needed to avoid a race between the VNASSERT() below, and another thread updating the VI_FREE flag, on weakly-ordered architectures. On a 72-thread POWER9, without this barrier a 'make -j72 buildworld' would panic on the assert regularly.
  It may be possible to use a weaker barrier, and I'll investigate that once all stability issues are worked out on POWER9.
  Notes: svn path=/head/; revision=334708
* vfs: annotate variables only used by debug builds as __unused | Matt Macy | 2018-05-19 | 1 | -1/+1
  Notes: svn path=/head/; revision=333852
* Make it easier for filesystems to count themselves as jail-enabled, | Jamie Gritton | 2018-05-04 | 1 | -12/+14
  by doing most of the work in a new function prison_add_vfs in kern_jail.c.
  Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and the rest is taken care of. This includes adding a jail parameter like allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed. Both of these used to be a static list of known filesystems, with predefined permission bits.
  Reviewed by: kib
  Differential Revision: D14681
  Notes: svn path=/head/; revision=333263
* Move most of the contents of opt_compat.h to opt_global.h. | Brooks Davis | 2018-04-06 | 1 | -1/+0
  opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options.
  Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures.
  Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files.
  Reviewed by: kib, cem, jhb, jtl
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D14941
  Notes: svn path=/head/; revision=332122
* ZFS vn_rele_async: catch up with the use of refcount(9) for the vnode use count | Andriy Gapon | 2018-03-28 | 1 | -38/+7
  It's neither sufficient nor required to use the vnode interlock when checking if we are going to drop the last use count, as the code in vputx() uses refcount (atomic) operations for both checking and decrementing the use count. Apply the same method to vn_rele_async().
  While here, remove vn_rele_inactive(), a wrapper around vrele() that didn't add any value.
  Also, the change required making vfs_refcount_release_if_not_last() public. I've made vfs_refcount_acquire_if_not_zero() public as well. They are in sys/refcount.h now. While making the move I've dropped the vfs_ prefix.
  Reviewed by: mjg
  MFC after: 2 weeks
  Sponsored by: Panzura
  Differential Revision: https://reviews.freebsd.org/D14869
  Notes: svn path=/head/; revision=331666
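The "release if not last" idea described in the entry above can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's code: the function name mirrors the one in the commit message, but the body and the use of <stdatomic.h> in place of the kernel's atomic(9) primitives are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Drop one reference unless it is the last one.
 * Returns true if a reference was dropped.  Returns false if the caller
 * holds the final reference and must take a slower path instead.
 * Assumption: C11 atomics stand in for the kernel's atomic(9) operations.
 */
static bool
release_if_not_last(atomic_uint *count)
{
	unsigned int old = atomic_load(count);

	for (;;) {
		assert(old > 0);	/* the caller must own a reference */
		if (old == 1)
			return (false);
		/* On failure, old is updated to the value currently stored. */
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return (true);
	}
}
```

Because both the check and the decrement happen in one compare-and-swap, no lock is needed to decide whether the reference being dropped is the last one.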
* Further parallelize the buffer cache. | Jeff Roberson | 2018-02-20 | 1 | -6/+1
  Provide multiple clean queues partitioned into 'domains'. Each domain manages its own bufspace and has its own bufspace daemon. Each domain has a set of subqueues indexed by the current cpuid to reduce lock contention on the cleanq.
  Refine the sleep/wakeup around the bufspace daemon to use atomics as much as possible.
  Add a B_REUSE flag that is used to requeue bufs during the scan to approximate LRU rather than locking the queue on every use of a frequently accessed buf.
  Implement bufspace_reserve with only atomic_fetchadd to avoid loop restarts.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: Netflix, Dell/EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D14274
  Notes: svn path=/head/; revision=329612
* One of the vnode fields listed by vn_printf is the union of pointers | Kirk McKusick | 2018-01-31 | 1 | -2/+19
  whose type depends on the type of vnode. Correct vn_printf so that it correctly identifies the name of the pointer that it is printing.
  Submitted by: Andreas Longwitz <longwitz at incore.de>
  MFC after: 1 week
  Notes: svn path=/head/; revision=328643
* vfs: tidy up vdrop | Mateusz Guzik | 2018-01-12 | 1 | -14/+11
  Skip vfs_refcount_release_if_not_last if the interlock is held and just go straight to refcount_release.
  While here do cosmetic rearrangement of _vhold to better show it contains equivalent behaviour.
  Notes: svn path=/head/; revision=327874
* kernel: Fix several typos and minor errors | Eitan Adler | 2017-12-27 | 1 | -1/+1
  - duplicate words
  - typos
  - references to old versions of FreeBSD
  Reviewed by: imp, benno
  Notes: svn path=/head/; revision=327231
* Do pass removing some write-only variables from the kernel. | Alexander Kabaev | 2017-12-25 | 1 | -2/+1
  This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports.
  Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
  Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
  Differential Revision: https://reviews.freebsd.org/D10385
  Notes: svn path=/head/; revision=327173
* sys: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-20 | 1 | -0/+2
  Mainly focus on files that use BSD 3-Clause license.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
  Notes: svn path=/head/; revision=326023
* Avoid the nbp lookup in the final loop iteration in flushbuflist(). | Mark Johnston | 2017-10-20 | 1 | -2/+2
  The end of the loop must re-lookup the next buf since the bufobj lock is dropped in the loop body. If the lookup fails, the loop is restarted. This mechanism non-obviously also terminates the loop when the end of the buf list is reached. Split up the two loop termination cases to make the code a bit less fragile.
  No functional change intended.
  Reviewed by: kib
  MFC after: 1 week
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D12730
  Notes: svn path=/head/; revision=324804
* Fix a racy VI_DOOMED check in MNT_VNODE_FOREACH_ALL(). | Mark Johnston | 2017-10-17 | 1 | -16/+24
  MNT_VNODE_FOREACH_ALL() is supposed to avoid returning doomed vnodes, but the VI_DOOMED check it used was done without the vnode interlock held, so it could race with a concurrent vgone().
  Submitted by: Don Morris <don.morris@isilon.com>
  Reviewed by: kib, mckusick
  MFC after: 1 week
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D12704
  Notes: svn path=/head/; revision=324704
* For unlinked files, do not msync(2) or sync on the vnode deactivation. | Konstantin Belousov | 2017-09-19 | 1 | -2/+2
  One consequence of the patch is that msyncing unlinked file mappings no longer reduces the amount of dirty memory in the system, but I do not think that there are users of msync(2) that rely on it for such a side effect.
  Reported and tested by: tjil
  PR: 222356
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D12411
  Notes: svn path=/head/; revision=323768
* Allow vdrop() of a vnode not yet on the per-mount list after r306512. | Bryan Drewery | 2017-08-28 | 1 | -13/+29
  The old code allowed calling vdrop() before insmntque() to place the vnode back onto the freelist for later recycling. Some downstream consumers may rely on this support. Normally insmntque() failing is fine since it uses vgone() and immediately frees the vnode rather than attempting to add it to the freelist if vdrop() were used instead.
  Also assert that vhold() cannot be used on such a vnode.
  Reviewed by: kib, cem, markj
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D12126
  Notes: svn path=/head/; revision=322978
* Allow vinvalbuf() to operate with the shared vnode lock. | Konstantin Belousov | 2017-08-20 | 1 | -2/+6
  This mode allows other clean buffers to arrive while we flush the buf lists for the vnode, which is fine for the targeted use. We only need all buffers that existed at the start of the function to be flushed. In fact, only one assert has to be relaxed.
  In collaboration with: pho
  Reviewed by: rmacklem
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  X-Differential revision: https://reviews.freebsd.org/D12083
  Notes: svn path=/head/; revision=322721
* For UNIX sockets make vnode point not to the socket, but to the UNIX PCB, | Gleb Smirnoff | 2017-06-02 | 1 | -1/+4
  since the latter is the thing that links together VFS and sockets.
  While here, make the union in the struct vnode anonymous.
  Notes: svn path=/head/; revision=319502
* mnt_vnode_next_active: use conventional lock order when trylock fails. | Konstantin Belousov | 2017-05-15 | 1 | -11/+87
  Previously, when the VI_TRYLOCK failed, we would spin under the mutex that protects the vnode active list until we either succeeded or noticed that we had hogged the CPU. Since we were violating the lock order, this would guarantee that we would become a hog under any deadlock condition (e.g. a race with vdrop(9) on the same vnode). In the presence of many concurrent threads in sync(2) or vdrop etc, the victim could hang for a long time.
  Now, avoid spinning by dropping and reacquiring the locks in the conventional lock order when the trylock fails. This requires a dance with the vnode hold count.
  Submitted by: Tom Rix <trix@juniper.net>
  Tested by: pho
  Differential revision: https://reviews.freebsd.org/D10692
  Notes: svn path=/head/; revision=318285
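The trylock fallback described in the entry above follows a general pattern that can be sketched with POSIX threads. Everything here is a hypothetical stand-in (struct obj, list_lock, on_list, lock_obj_from_list); the assumed lock order is the object lock first, then the list lock, and the real kernel code differs in detail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; lock order is o->lock first, then list_lock. */
struct obj {
	pthread_mutex_t	lock;
	atomic_uint	holdcnt;	/* pins the object while unlocked */
	bool		on_list;	/* protected by list_lock */
};

/*
 * Called with list_lock held.  Returns true with both locks held if the
 * object is still on the list, false with only list_lock held otherwise.
 */
static bool
lock_obj_from_list(pthread_mutex_t *list_lock, struct obj *o)
{
	assert(o != NULL);
	if (pthread_mutex_trylock(&o->lock) == 0)
		return (true);	/* fast path: trylock cannot deadlock */

	/* Pin the object so it stays valid while list_lock is dropped. */
	atomic_fetch_add(&o->holdcnt, 1);
	pthread_mutex_unlock(list_lock);

	/* Reacquire in the conventional order, blocking instead of spinning. */
	pthread_mutex_lock(&o->lock);
	pthread_mutex_lock(list_lock);
	atomic_fetch_sub(&o->holdcnt, 1);

	/* The list may have changed while it was unlocked, so revalidate. */
	if (!o->on_list) {
		pthread_mutex_unlock(&o->lock);
		return (false);
	}
	return (true);
}
```

The hold count is the "dance" the commit message mentions: without it, nothing would keep the object alive between dropping the list lock and acquiring the object lock.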
* Add V_VMIO flag for vinvalbuf(9) to indicate that the flush request | Konstantin Belousov | 2017-04-05 | 1 | -8/+10
  was issued during VM-initiated i/o (pageout), so that the function does not try to flush or remove pages or wait for the vm object paging-in-progress counter.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  X-Differential revision: https://reviews.freebsd.org/D10241
  Notes: svn path=/head/; revision=316528
* Correct a kernel stack leak in 32-bit compat when vfc_name is short. | Brooks Davis | 2017-04-04 | 1 | -2/+1
  Don't zero unused pointer members again.
  Per discussion with secteam we are not issuing an advisory for this issue as we have no current evidence it leaks exploitable information.
  Reviewed by: rwatson, glebius, delphij
  MFC after: 1 day
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D10227
  Notes: svn path=/head/; revision=316497
* Change 'Hz' back to 'HZ'... it's referring to the kernel config option | Ian Lepore | 2017-03-12 | 1 | -1/+1
  named HZ, not being used as an abbreviation of the unit of measure.
  Notes: svn path=/head/; revision=315167
* Correct the abbreviations for microseconds (us, not ms), and for Hz (not HZ). | Ian Lepore | 2017-03-12 | 1 | -1/+1
  Notes: svn path=/head/; revision=315165
* vfs: use atomic_fcmpset in vfs_refcount_* | Mateusz Guzik | 2017-02-05 | 1 | -4/+4
  Notes: svn path=/head/; revision=313268
* Improve debugging printf. | Edward Tomasz Napierala | 2017-01-22 | 1 | -1/+1
  Notes: svn path=/head/; revision=312621
* vfs: hide the getvnode NULL mp message behind DIAGNOSTIC | Mateusz Guzik | 2017-01-21 | 1 | -2/+4
  Since crossmp vnode changes the message was being printed on each boot.
  Reported by: trasz
  Discussed with: kib
  Notes: svn path=/head/; revision=312598
* vfs: switch nodes_created, recycles_count and free_owe_inact to counter(9) | Mateusz Guzik | 2016-12-31 | 1 | -11/+17
  Reviewed by: kib
  Notes: svn path=/head/; revision=310983
* vfs: add vrefact, to be used when the vnode has to be already active | Mateusz Guzik | 2016-12-12 | 1 | -0/+22
  This allows blind increment of relevant counters which under contention is cheaper than inc-not-zero loops at least on amd64.
  Use it in some of the places which are guaranteed to see already active vnodes.
  Reviewed by: kib (previous version)
  Notes: svn path=/head/; revision=309893
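The difference between an inc-not-zero loop and a blind increment, as contrasted in the entry above, can be sketched with C11 atomics. The function names below are illustrative, not the kernel's, and <stdatomic.h> is an assumption standing in for the kernel's atomic(9) primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Inc-not-zero: a retry loop, needed when the count may already be 0. */
static bool
acquire_if_not_zero(atomic_uint *count)
{
	unsigned int old = atomic_load(count);

	while (old != 0) {
		/* On failure, old is updated and the loop re-tests it. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return (true);
	}
	return (false);
}

/* Blind increment: valid only if the caller guarantees count > 0. */
static void
acquire_active(atomic_uint *count)
{
	unsigned int old = atomic_fetch_add(count, 1);

	assert(old > 0);	/* the object had to be active already */
	(void)old;		/* silence unused warning under NDEBUG */
}
```

The blind variant is a single unconditional atomic operation, so it never retries under contention; the price is that correctness now depends on the caller's guarantee rather than on the loop's check.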
* Launder VPO_NOSYNC pages upon vnode deactivation. | Mark Johnston | 2016-11-26 | 1 | -1/+1
  As of r234483, vnode deactivation causes non-VPO_NOSYNC pages to be laundered. This behaviour has two problems:
  1. Dirty VPO_NOSYNC pages must be laundered before the vnode can be reclaimed, and this work may be unfairly deferred to the vnlru process or an unrelated application when the system is under vnode pressure.
  2. Deactivation of a vnode with dirty VPO_NOSYNC pages requires a scan of the corresponding VM object's memq for non-VPO_NOSYNC dirty pages; if the laundry thread needs to launder pages from an unreferenced such vnode, it will reactivate and deactivate the vnode with each laundering, potentially resulting in a large number of expensive scans.
  Therefore, ensure that all dirty pages are laundered upon deactivation, i.e., when all maps of the vnode are removed and all references are released.
  Reviewed by: alc, kib
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D8641
  Notes: svn path=/head/; revision=309200
* vfs: clear the tmp free list flag before taking the free vnode list lock | Mateusz Guzik | 2016-10-08 | 1 | -2/+2
  Safe access is already guaranteed because of the mnt_listmx lock.
  Notes: svn path=/head/; revision=306841
* vrefl: Assert that the interlock is held. | Bryan Drewery | 2016-10-06 | 1 | -0/+1
  Sponsored by: Dell EMC Isilon
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=306775
* Add vrecyclel() to vrecycle() a vnode with the interlock already held. | Bryan Drewery | 2016-10-06 | 1 | -3/+16
  Obtained from: OneFS
  Sponsored by: Dell EMC Isilon
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=306774
* Correct some comments after r294299. | Bryan Drewery | 2016-10-04 | 1 | -4/+4
  Sponsored by: Dell EMC Isilon
  Notes: svn path=/head/; revision=306689
* vfs: batch free vnodes in per-mnt lists | Mateusz Guzik | 2016-09-30 | 1 | -30/+116
  Previously free vnodes would always be directly returned to the global LRU list. With this change up to mnt_free_list_batch vnodes are collected first.
  syncer runs always return the batch regardless of its size.
  While vnodes on per-mnt lists are not counted as free, they can be returned in case of vnode shortage.
  Reviewed by: kib
  Tested by: pho
  Notes: svn path=/head/; revision=306512
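The batching scheme in the entry above can be sketched with plain counters. This is a reduced illustration under stated assumptions: the types and functions are hypothetical, lists are replaced by counts, locking is omitted, and FREE_LIST_BATCH is an arbitrary stand-in for the mnt_free_list_batch tunable named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define FREE_LIST_BATCH	4	/* hypothetical value for illustration */

struct mount_sketch {
	size_t	tmp_free;	/* vnodes parked on the per-mount list */
};

/* Move the whole per-mount batch to the global free count. */
static void
return_batch(struct mount_sketch *mp, size_t *global_free)
{
	*global_free += mp->tmp_free;
	mp->tmp_free = 0;
}

/* Free one vnode: park it locally, flush once the batch is full. */
static void
free_one(struct mount_sketch *mp, size_t *global_free)
{
	assert(mp->tmp_free < FREE_LIST_BATCH);
	mp->tmp_free++;
	if (mp->tmp_free >= FREE_LIST_BATCH)
		return_batch(mp, global_free);
}
```

In this sketch a periodic caller, playing the role of the syncer, would call return_batch() directly so that a partially filled batch does not stay parked indefinitely.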
* vfs: remove the __bo_vnode field from struct vnode | Mateusz Guzik | 2016-09-30 | 1 | -2/+1
  The pointer can be obtained using __containerof instead.
  Reviewed by: kib
  Notes: svn path=/head/; revision=306509
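The container-of technique mentioned in the entry above recovers a pointer to an enclosing structure from a pointer to one of its embedded members, which makes a stored back pointer unnecessary. The sketch below uses a portable macro built on offsetof() as a stand-in for the kernel's __containerof(); the structures are hypothetical and heavily reduced.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for __containerof(); an assumption for illustration. */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical reduced structures, not the kernel definitions. */
struct bufobj_sketch {
	int	bo_numoutput;
};

struct vnode_sketch {
	int			v_type;
	struct bufobj_sketch	v_bufobj;	/* embedded, not pointed to */
};

/* Recover the enclosing vnode from its embedded buffer object. */
static struct vnode_sketch *
bo2vnode(struct bufobj_sketch *bo)
{
	assert(bo != NULL);
	return (CONTAINER_OF(bo, struct vnode_sketch, v_bufobj));
}
```

This only works when the member really is embedded in the outer structure; applying it to a separately allocated object yields a bogus pointer.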
* Renumber license clauses in sys/kern to avoid skipping #3 | Ed Maste | 2016-09-15 | 1 | -1/+1
  Notes: svn path=/head/; revision=305832
* Print vnode details when vnode locking assertion gets triggered. | Edward Tomasz Napierala | 2016-08-12 | 1 | -0/+6
  MFC after: 1 month
  Notes: svn path=/head/; revision=304023
* Replace all remaining calls to vprint(9) with vn_printf(9), and remove | Edward Tomasz Napierala | 2016-08-10 | 1 | -3/+3
  the old macro.
  MFC after: 1 month
  Notes: svn path=/head/; revision=303924
* Remove unused - never actually implemented - vnode lock types | Edward Tomasz Napierala | 2016-08-04 | 1 | -19/+0
  from vnode_if.src.
  MFC after: 1 month
  Notes: svn path=/head/; revision=303743
* Fix grammar. | Konstantin Belousov | 2016-07-11 | 1 | -1/+1
  Submitted by: alc
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=302580
* In vgonel(), postpone setting BO_DEAD until VOP_RECLAIM() is called, | Konstantin Belousov | 2016-07-11 | 1 | -1/+7
  if vnode is VMIO. For VMIO vnodes, set BO_DEAD in vm_object_terminate().
  The vnode_destroy_object(), when calling into vm_object_terminate(), must be able to flush buffers. The purpose of BO_DEAD is to quickly destroy buffers on write when the underlying vnode is not operable any more (one example is the devfs node after geom is gone). Setting BO_DEAD for a reclaiming vnode before the object is terminated is premature, and results in an inability to flush buffers with live SU dependencies from vinvalbuf() in vm_object_terminate().
  Reported by: David Cross <dcrosstech@gmail.com>
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=302567
* Remove racy assert. The thread which changes vnode usecount from 0 to 1 | Konstantin Belousov | 2016-07-03 | 1 | -5/+2
  does it under the vnode interlock, but the interlock is not owned by the asserting thread. As a result, we might read the increased use counter but also still see VI_OWEINACT.
  In collaboration with: nwhitehorn
  Hardware donated by: IBM LTC
  Sponsored by: The FreeBSD Foundation (kib)
  Approved by: re (gjb)
  Notes: svn path=/head/; revision=302322
* Fix typo. Note that atomic is still required even for interlocked case. | Konstantin Belousov | 2016-06-20 | 1 | -2/+3
  Sponsored by: The FreeBSD Foundation
  Approved by: re (marius)
  Notes: svn path=/head/; revision=302029
* vfs: ifdef out noop vop_* primitives on !DEBUG_VFS_LOCKS kernels | Mateusz Guzik | 2016-06-17 | 1 | -10/+2
  This removes calls to empty functions like vop_lock_{pre/post} from common vfs routines.
  Approved by: re (gjb)
  Notes: svn path=/head/; revision=302000
* Add VFS interface to flush specified amount of free vnodes belonging | Konstantin Belousov | 2016-06-17 | 1 | -10/+34
  to mount points with the given filesystem type, specified by mount vfs_ops pointer.
  Based on patch by: mckusick
  Reviewed by: avg, mckusick
  Tested by: allanjude, madpilot
  Sponsored by: The FreeBSD Foundation
  Approved by: re (gjb)
  Notes: svn path=/head/; revision=301996
* Cosmetics - add missing space after ellipses in shutdown messages. | Edward Tomasz Napierala | 2016-05-31 | 1 | -1/+1
  MFC after: 1 month
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=301040
* vfs_read_dirent: increment ncookies after adding a cookie | Andriy Gapon | 2016-05-16 | 1 | -0/+1
  It seems that at present vfs_read_dirent() is used only with filesystems that do not support cookies, so the bug never manifested itself.
  MFC after: 1 week
  Notes: svn path=/head/; revision=299916
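The kind of bug fixed in the entry above, storing an element without bumping the count, can be illustrated with a small user-space sketch. The structure and function are hypothetical; the real vfs_read_dirent() works on VOP_READDIR arguments and kernel allocators, which are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reduced state for a growing cookie array. */
struct cookie_buf {
	uint64_t	*cookies;
	int		 ncookies;
};

/* Append one cookie; returns 0 on success, -1 if allocation fails. */
static int
cookie_append(struct cookie_buf *cb, uint64_t off)
{
	uint64_t *p;

	assert(cb != NULL && cb->ncookies >= 0);
	p = realloc(cb->cookies, ((size_t)cb->ncookies + 1) * sizeof(*p));
	if (p == NULL)
		return (-1);
	cb->cookies = p;
	cb->cookies[cb->ncookies] = off;
	cb->ncookies++;	/* without this, the next append overwrites the slot */
	return (0);
}
```

If the increment were missing, every call would write to the same index and the caller would be told that no cookies were returned.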
* Add EVFILT_VNODE open, read and close notifications. | Konstantin Belousov | 2016-05-03 | 1 | -0/+39
  While there, order EVFILT_VNODE notes descriptions alphabetically.
  Based on submission, and tested by: Vladimir Kondratyev <wulf@cicgroup.ru>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=298982
* Issue NOTE_EXTEND when a directory entry is added to or removed from | Konstantin Belousov | 2016-05-02 | 1 | -0/+1
  the monitored directory as the result of rename(2) operation. The renames staying in the directory are not reported.
  Submitted by: Vladimir Kondratyev <wulf@cicgroup.ru>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=298922
* Fix reporting of NOTE_LINK when directory link count changes due to | Konstantin Belousov | 2016-05-02 | 1 | -2/+18
  rename removing or adding subdirectory entry.
  Discussed with and tested by: Vladimir Kondratyev <wulf@cicgroup.ru>
  NetBSD PR: 48958 (http://gnats.netbsd.org/48958)
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=298921