path: root/sys/nfs
Commit message | Author | Age | Files | Lines
* nfs: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 6 | -13/+6
  Notes: svn path=/head/; revision=365082
* Transition from rtrequest1_fib() to rib_action(). | Alexander V. Chernikov | 2020-07-21 | 1 | -6/+20
  Remove all variations of rtrequest <rtrequest1_fib, rtrequest_fib, in6_rtrequest, rtrequest_fib> and their uses and switch to rib_action(). This is part of the new routing KPI.
  Submitted by: Neel Chauhan <neel AT neelc DOT org>
  Differential Revision: https://reviews.freebsd.org/D25546
  Notes: svn path=/head/; revision=363403
* Use epoch(9) for rtentries to simplify control plane operations. | Alexander V. Chernikov | 2020-05-23 | 1 | -0/+3
  Currently the only reason for refcounting rtentries is the need to report the rtable operation details immediately after the execution. Delaying rtentry reclamation makes it possible to stop refcounting and simplify the code.
  Additionally, this change makes it possible to reimplement rib_lookup_info(), which is used by some of the customers to get the matching prefix along with nexthops, in a more efficient way.
  The change keeps the per-vnet rtzone uma zone. It adds an nh_vnet field to nhop_priv to be able to reliably set curvnet even during vnet teardown.
  The rest of the reference counting code will be removed in D24867.
  Differential Revision: https://reviews.freebsd.org/D24866
  Notes: svn path=/head/; revision=361409
* Remove rtable dumping code from bootp. | Alexander V. Chernikov | 2020-04-28 | 1 | -96/+0
  This debugging code printing routing table data was introduced in rS25723, 22+ years ago. The last functional commit to this code was rS67534, 19 years ago. The code has been turned off by default all this time.
  Lastly, this code directly iterates the radix tree and rtentries, which is not a proper interaction with the routing system.
  Differential Revision: https://reviews.freebsd.org/D24554
  Notes: svn path=/head/; revision=360429
* Re-organize the NFS file handle affinity code for the NFS server. | Rick Macklem | 2020-04-14 | 2 | -649/+0
  The file handle affinity code was configured to be used by both the old and new NFS servers. This no longer makes sense, since there is only one NFS server.
  This patch copies a majority of the code in sys/nfs/nfs_fha.c and sys/nfs/nfs_fha.h into sys/fs/nfsserver/nfs_fha_new.c and sys/fs/nfsserver/nfs_fha_new.h, so that the files in sys/nfs can be deleted.
  The code is simplified by deleting the function callback pointers used to call functions in either the old or new NFS server and replacing them with calls to the functions.
  As well as a cleanup, this re-organization simplifies the changes required for handling of external page mbufs, which is required for KERN_TLS.
  This patch should not result in a semantic change to file handle affinity.
  Notes: svn path=/head/; revision=359910
* Remove the old NFS lock device driver that uses Giant. | Rick Macklem | 2020-04-09 | 1 | -404/+0
  This NFS lock device driver was replaced by the kernel NLM around FreeBSD 7 and has not normally been used since then. To use it, the kernel had to be built without "options NFSLOCKD" and the nfslockd.ko had to be deleted as well.
  Since it uses Giant and is no longer used, this patch removes it.
  With this device driver removed, there is now a lot of unused code in the userland rpc.lockd. That will be removed in a future commit.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D22933
  Notes: svn path=/head/; revision=359745
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -2/+2
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren't properly marked). Use it in preparation for a general review of all nodes.
  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.
  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 1 | -1/+1
  Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.
  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D21427
  Notes: svn path=/head/; revision=356337
* Switch r356210 to use gone_in() instead of printf(). | Rick Macklem | 2019-12-31 | 1 | -2/+1
  Suggested by: cem
  Notes: svn path=/head/; revision=356219
* Add warning printf w.r.t. removal of sys/nfs/nfs_lock.c. | Rick Macklem | 2019-12-30 | 1 | -0/+2
  The code in sys/nfs/nfs_lock.c has not been run by default since March 2008, when it was replaced by the in-kernel sys/nlm code. It uses Giant, so it needs to be removed before the FreeBSD 13 release. This will happen in a couple of months, since few if any users run the code anyhow and can easily switch to the default in-kernel NFSLOCKD.
  Notes: svn path=/head/; revision=356210
* Switch RIB and RADIX_NODE_HEAD lock from rwlock(9) to rmlock(9). | Andrey V. Elsukov | 2018-06-16 | 1 | -0/+1
  Using rwlock with multiqueue NICs for IP forwarding at high pps produces high lock contention and is inefficient. Rmlock fits better for such workloads.
  Reviewed by: melifaro, olivier
  Obtained from: Yandex LLC
  Sponsored by: Yandex LLC
  Differential Revision: https://reviews.freebsd.org/D15789
  Notes: svn path=/head/; revision=335250
* Merge the pNFS server code from projects/pnfs-planb-server into head. | Rick Macklem | 2018-06-12 | 2 | -1/+2
  This code merge adds a pNFS service to the NFSv4.1 server. Although it is a large commit it should not affect behaviour for a non-pNFS NFS server. Some documentation on how this works can be found at: http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt and will hopefully be turned into a proper document soon.
  This is a merge of the kernel code. Userland and man page changes will come soon, once the dust settles on this merge. It has passed a "make universe", so I hope it will not cause build problems. It also adds NFSv4.1 server support for the "current stateid".

  Here is a brief overview of the pNFS service:
  A pNFS service separates the Read/Write operations from all the other NFSv4.1 Metadata operations. It is hoped that this separation allows a pNFS service to be configured that exceeds the limits of a single NFS server for storage capacity and/or I/O bandwidth. It is possible to configure mirroring within the data servers (DSs) so that the data storage file for an MDS file will be mirrored on two or more of the DSs. When this is used, failure of a DS will not stop the pNFS service and a failed DS can be recovered once repaired while the pNFS service continues to operate. Although two way mirroring would be the norm, it is possible to set a mirroring level of up to four or the number of DSs, whichever is less. The Metadata server will always be a single point of failure, just as a single NFS server is.

  A Plan B pNFS service consists of a single MetaData Server (MDS) and K Data Servers (DS), all of which are recent FreeBSD systems. Clients will mount the MDS as they would a single NFS server. When files are created, the MDS creates a file tree identical to what a single NFS server creates, except that all the regular (VREG) files will be empty. As such, if you look at the exported tree on the MDS directly on the MDS server (not via an NFS mount), the files will all be of size 0. Each of these files will also have two extended attributes in the system attribute name space:
    pnfsd.dsfile - This extended attribute stores the information that the MDS needs to find the data storage file(s) on DS(s) for this file.
    pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime and Change attributes for the file, so that the MDS doesn't need to acquire the attributes from the DS for every Getattr operation.
  For each regular (VREG) file, the MDS creates a data storage file on one (or more if mirroring is enabled) of the DSs in one of the "dsNN" subdirectories. The name of this file is the file handle of the file on the MDS in hexadecimal so that the name is unique. The DSs use subdirectories named "ds0" to "dsN" so that no one directory gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize on the MDS, with the default being 20. For production servers that will store a lot of files, this value should probably be much larger. It can be increased when the "nfsd" daemon is not running on the MDS, once the "dsK" directories are created.

  For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces of information to the client that allows it to do I/O directly to the DS.
    DeviceInfo - This is relatively static information that defines what a DS is. The critical bits of information returned by the FreeBSD server are the IP address of the DS and, for the Flexible File layout, that NFSv4.1 is to be used and that it is "tightly coupled". There is a "deviceid" which identifies the DeviceInfo.
    Layout - This is per file and can be recalled by the server when it is no longer valid. For the FreeBSD server, there is support for two types of layout, called File layout and Flexible File layout. Both allow the client to do I/O on the DS via NFSv4.1 I/O operations. The Flexible File layout is a more recent variant that allows specification of mirrors, where the client is expected to do writes to all mirrors to maintain them in a consistent state. The Flexible File layout also allows the client to report I/O errors for a DS back to the MDS.
  The Flexible File layout supports two variants referred to as "tightly coupled" vs "loosely coupled". The FreeBSD server always uses the "tightly coupled" variant where the client uses the same credentials to do I/O on the DS as it would on the MDS. For the "loosely coupled" variant, the layout specifies a synthetic user/group that the client uses to do I/O on the DS. The FreeBSD server does not do striping and always returns layouts for the entire file. The critical information in a layout is Read vs Read/Write and DeviceID(s) that identify which DS(s) the data is stored on.

  At this time, the MDS generates File Layout layouts to NFSv4.1 clients that know how to do pNFS for the non-mirrored DS case unless the sysctl vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File layouts are generated. The mirrored DS configuration always generates Flexible File layouts.
  For NFS clients that do not support NFSv4.1 pNFS, all I/O operations are done against the MDS which acts as a proxy for the appropriate DS(s). When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy. If the DS is on the same machine, the MDS/DS will do the RPC on the DS as a proxy and so on, until the machine runs out of some resource, such as session slots or mbufs. As such, DSs must be separate systems from the MDS.
  Tested by: james.rose@framestore.com
  Relnotes: yes
  Notes: svn path=/head/; revision=335012
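The naming and mirroring rules described in the pNFS merge message above can be modelled in a few lines. This is only an illustrative sketch in Python, not the kernel code: the function names are hypothetical, and the way the server picks a particular "dsNN" subdirectory is not stated in the commit message, so it is deliberately left out.

```python
# Illustrative model of two rules from the commit message; names are hypothetical.

MAX_MIRROR_LEVEL = 4  # "a mirroring level of up to four or the number of DSs"


def effective_mirror_level(requested: int, num_ds: int) -> int:
    """Clamp a requested mirror level to min(4, number of DSs), never below 1."""
    if num_ds < 1:
        raise ValueError("a pNFS service needs at least one DS")
    return max(1, min(requested, MAX_MIRROR_LEVEL, num_ds))


def ds_file_name(mds_file_handle: bytes) -> str:
    """Data storage file name: the MDS file handle rendered in hexadecimal."""
    return mds_file_handle.hex()
```

For example, a request for 8 mirrors on a service with 6 DSs is capped at 4, and a request for 4 mirrors with only 2 DSs is capped at 2. Because the file handle is unique on the MDS, its hexadecimal form is a unique file name on the DS.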
* UDP: further performance improvements on tx | Matt Macy | 2018-05-23 | 2 | -6/+6
  Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps
  Single stream throughput increases from 910kpps to 1.18Mpps
  Baseline: https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg
  - Protect read access to global ifnet list with epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg
  - Protect short lived ifaddr references with epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg
  - Convert if_afdata read lock path to epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg
  A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW.
  Reviewed by: gallatin
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D15409
  Notes: svn path=/head/; revision=334118
* ifnet: Replace if_addr_lock rwlock with epoch + mutex | Matt Macy | 2018-05-18 | 2 | -4/+3
  Run on LLNW canaries and tested by pho@
  gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.
  When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch:
    InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
    4.98    0.00   4.42   0.00  4235592  33       83.80  4720653   2149771  1235  247.32
    4.73    0.00   4.20   0.00  4025260  33       82.99  4724900   2139833  1204  247.32
    4.72    0.00   4.20   0.00  4035252  33       82.14  4719162   2132023  1264  247.32
    4.71    0.00   4.21   0.00  4073206  33       83.68  4744973   2123317  1347  247.32
    4.72    0.00   4.21   0.00  4061118  33       80.82  4713615   2188091  1490  247.32
    4.72    0.00   4.21   0.00  4051675  33       85.29  4727399   2109011  1205  247.32
    4.73    0.00   4.21   0.00  4039056  33       84.65  4724735   2102603  1053  247.32
  After the patch:
    InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
    5.43    0.00   4.20   0.00  3313143  33       84.96  5434214   1900162  2656  245.51
    5.43    0.00   4.20   0.00  3308527  33       85.24  5439695   1809382  2521  245.51
    5.42    0.00   4.19   0.00  3316778  33       87.54  5416028   1805835  2256  245.51
    5.42    0.00   4.19   0.00  3317673  33       90.44  5426044   1763056  2332  245.51
    5.42    0.00   4.19   0.00  3314839  33       88.11  5435732   1792218  2499  245.52
    5.44    0.00   4.19   0.00  3293228  33       91.84  5426301   1668597  2121  245.52
  Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch
  Reviewed by: gallatin
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D15366
  Notes: svn path=/head/; revision=333813
* Remove support for FDDI networks. | Brooks Davis | 2018-04-11 | 1 | -2/+0
  Defines in net/if_media.h remain in case code copied from ifconfig is in use elsewhere (supporting a non-existent media type is harmless).
  Reviewed by: kib, jhb
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D15017
  Notes: svn path=/head/; revision=332412
* Remove infrastructure for token-ring networks. | Brooks Davis | 2018-03-28 | 1 | -2/+0
  Reviewed by: cem, imp, jhb, jmallett
  Relnotes: yes
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D14875
  Notes: svn path=/head/; revision=331714
* Modernize nfssvc(2) registration. | Brooks Davis | 2018-02-08 | 1 | -12/+7
  Use syscall_helper_register() to register syscalls and do it through the module interface rather than sysinit. This pattern is more common and easier to understand.
  Reviewed by: jhb
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D14232
  Notes: svn path=/head/; revision=329025
* Do pass removing some write-only variables from the kernel. | Alexander Kabaev | 2017-12-25 | 1 | -9/+0
  This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports.
  Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
  Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
  Differential Revision: https://reviews.freebsd.org/D10385
  Notes: svn path=/head/; revision=327173
* sys: general adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 6 | -0/+12
  Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  No functional change intended.
  Notes: svn path=/head/; revision=326272
* sys: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-20 | 7 | -0/+14
  Mainly focus on files that use BSD 3-Clause license.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
  Notes: svn path=/head/; revision=326023
* spdx: initial adoption of licensing ID tags. | Pedro F. Giffuni | 2017-11-18 | 2 | -0/+4
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
  Initially, only tag files that use BSD 4-Clause "Original" license.
  RelNotes: yes
  Differential Revision: https://reviews.freebsd.org/D13133
  Notes: svn path=/head/; revision=325966
* Improve FHA locality control for NFS read/write requests. | Alexander Motin | 2017-07-31 | 2 | -28/+34
  This change adds two new tunables that control serialization for read and write NFS requests separately. It does not change the default behavior since there are too many factors to consider, but gives additional space for further experiments and tuning.
  The main motivation for this change is very low write speed in case of ZFS with sync=always or when NFS clients request synchronous operation, when every separate request has to be written/flushed to ZIL, and requests are processed one at a time. Setting vfs.nfsd.fha.write=0 in that case can increase ZIL throughput several times over by coalescing writes and cache flushes. There is a worry that doing it may increase data fragmentation on disks, but I suppose it should not happen for a pool with SLOG.
  MFC after: 1 week
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=321794
* Add kernel support for the NFS client forced dismount "umount -N" option. | Rick Macklem | 2017-07-29 | 2 | -1/+2
  When an NFS mount is hung against an unresponsive NFS server, the "umount -f" option can be used to dismount the mount. Unfortunately, "umount -f" gets hung as well if a "umount" without "-f" has already been done. Usually, this is because of a vnode lock being held by the "umount" for the mounted-on vnode.
  This patch adds kernel code so that a new "-N" option can be added to "umount", allowing it to avoid getting hung for this case. It adds two flags. One indicates that a forced dismount is about to happen and the other is used, along with setting mnt_data == NULL, to handshake with the nfs_unmount() VFS call. It includes a slight change to the interface used between the client and common NFS modules, so I bumped __FreeBSD_version to ensure both modules are rebuilt.
  Tested by: pho
  Reviewed by: kib
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D11735
  Notes: svn path=/head/; revision=321688
* Renumber copyright clause 4 | Warner Losh | 2017-02-28 | 7 | -7/+7
  Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point.
  Submitted by: Jan Schaumann <jschauma@stevens.edu>
  Pull Request: https://github.com/freebsd/freebsd/pull/96
  Notes: svn path=/head/; revision=314436
* Hide the boottime and boottimebin globals, provide the getboottime(9) | Konstantin Belousov | 2016-07-27 | 1 | -0/+2
  and getboottimebin(9) KPI. Change consumers of boottime to use the KPI. The variables were renamed to avoid shadowing issues with local variables of the same name.
  The issue is that boottime* should be adjusted from tc_windup(), which requires them to be members of the timehands structure. As a preparation, this commit only introduces the interface.
  Some uses of boottime were found doubtful, e.g. NLM uses boottime to identify the system boot instance. Arguably the identity should not change on the leap second adjustment, but the commit is about the timekeeping code and the consumers were kept bug-to-bug compatible.
  Tested by: pho (as part of the bigger patch)
  Reviewed by: jhb (same)
  Discussed with: bde
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 month
  X-Differential revision: https://reviews.freebsd.org/D7302
  Notes: svn path=/head/; revision=303382
* NFS: spelling fixes on comments. | Pedro F. Giffuni | 2016-04-29 | 2 | -2/+2
  No functional change.
  Notes: svn path=/head/; revision=298788
* Do not try to install a default route for each interface found, because | Ian Lepore | 2016-03-27 | 1 | -17/+42
  only the first one will actually work and all the others just result in errors (which would get printed but otherwise ignored).
  Instead, wait until we make a choice of which interface will be used to mount the rootfs, and install the default route associated with it (if any). After doing the md_mount() call to obtain the needed info, remove the default route again, and transcribe the route info into the nfs_diskless structure. If the system eventually chooses to mount the nfs rootfs, the default route will be installed again when the nfs_diskless code re-initializes the interface.
  The theory here is that since we can only have one default route, the one most likely to be correct for mounting the rootfs is the one that was delivered along with the rootpath option.
  Notes: svn path=/head/; revision=297326
* Stop setting the default route to the IP of the interface itself when the | Ian Lepore | 2016-03-27 | 1 | -5/+1
  bootp/dhcp server doesn't provide a router option. Doing so prevents setting defaultrouter=<ip> in rc.conf (it fails because there's already a bogus default route installed by bootpc_init).
  When an admin wants to use this style of proxy arp on an interface, the proper mechanism is to set the "use-lease-addr-for-default-route" flag in the dhcp server config. That causes the lease address to be delivered in the routers option, and the normal handling of the routers option will then install the self-ip as the default route.
  PR: 187094
  Notes: svn path=/head/; revision=297325
* Switch bootpc_adjust_interface() from returning int to void. Its one caller | Ian Lepore | 2016-03-27 | 1 | -6/+4
  doesn't check for errors, and all the errors that can happen result in it calling panic anyway, except for one that's really more of a warning (and is going to disappear in an upcoming commit anyway).
  Notes: svn path=/head/; revision=297324
* Set ifctx->gotrootpath=1 only when the root path came from the dhcp/bootp | Ian Lepore | 2016-03-27 | 1 | -1/+2
  server (and not when it came from a fallback method such as the ROOTDEVNAME option). This makes the code in bootpc_init() choose the first interface that provided a rootpath name. Previously it was choosing the first interface that got an IP address, which could be on a different and potentially unreachable subnet than the server providing the rootfs. If the rootpath name actually does come from a fallback source, then the code continues to use the first interface in the list that got configured.
  Note that this wasn't directly reported in the PR cited below, but was discovered while working on that PR.
  PR: 187094
  Notes: svn path=/head/; revision=297323
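The interface choice described in the commit above can be sketched as a two-pass search. This is a simplified Python model, not the bootpc_init() code: the IfCtx class and its field names are hypothetical stand-ins, with only gotrootpath taken from the commit message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IfCtx:
    """Hypothetical stand-in for the per-interface BOOTP context."""
    name: str
    configured: bool    # the interface obtained an IP address
    gotrootpath: bool   # the dhcp/bootp server supplied a root path on it


def choose_interface(ifs: List[IfCtx]) -> Optional[IfCtx]:
    """Prefer the first interface whose server supplied a root path;
    otherwise fall back to the first interface that got configured."""
    for ifc in ifs:
        if ifc.gotrootpath:
            return ifc
    for ifc in ifs:
        if ifc.configured:
            return ifc
    return None
```

With interfaces em0 (configured, no root path) and em1 (configured, root path supplied), the sketch picks em1, whereas the earlier behaviour described in the message would have picked em0.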
* If the dhcp server provides an interface-mtu option, parse the value and | Ian Lepore | 2016-03-21 | 1 | -1/+21
  set that mtu on the interface.
  These changes are based on the patch submitted by Robert Blayzor in the PR, but I changed things around a bit, so the blame for any mistakes belongs to me.
  PR: 187094
  Notes: svn path=/head/; revision=297149
* It appears nfs_mountroot() will use the env var "boot.netif.mtu" if it | Ian Lepore | 2016-03-20 | 1 | -0/+1
  exists, so mention that along with all the other boot.netif vars.
  Notes: svn path=/head/; revision=297086
* MFP r287070,r287073: split radix implementation and route table structure. | Alexander V. Chernikov | 2016-01-25 | 1 | -3/+6
  There are a number of radix consumers in kernel land (pf, ipfw, nfs, route) with different requirements. In fact, the first 3 don't have _any_ requirements and the first 2 do not use radix locking. On the other hand, the routing structure does have these requirements (rnh_gen, multipath, custom to-be-added control plane functions, different locking).
  Additionally, radix should not know anything about its consumers' internals.
  So, the radix code now uses a tiny 'struct radix_head' structure along with an internal 'struct radix_mask_head' instead of 'struct radix_node_head'. Existing consumers still use the same 'struct radix_node_head' with slight modifications: they need to pass a pointer to the (embedded) 'struct radix_head' to all radix callbacks.
  Routing code now uses a new 'struct rib_head' with different locking macros: the RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing information base).
  A new net/route_var.h header was added to hold routing subsystem internal data. 'struct rib_head' was placed there. 'struct rtentry' will also be moved there soon.
  Notes: svn path=/head/; revision=294706
* Add kernel support to the NFS server for the "-manage-gids" | Rick Macklem | 2015-11-30 | 1 | -0/+1
  option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request.
  Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex.
  Suggested by: jpaetzel
  Tested by: jpaetzel
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=291527
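The two ideas in the "-manage-gids" commit above, replacing the RPC-supplied group list with a cached one and locking each hash list separately, can be illustrated with a small user-space model. This is a sketch in Python with hypothetical names (GroupCache, NBUCKETS); the fallback to the RPC-supplied list on a cache miss is an assumption made for the example and is not taken from the commit message.

```python
import threading
from typing import Dict, List, Sequence

NBUCKETS = 8  # hypothetical number of hash lists


class GroupCache:
    """uid -> group list cache with one lock per hash list."""

    def __init__(self, nbuckets: int = NBUCKETS) -> None:
        self._buckets: List[Dict[int, List[int]]] = [{} for _ in range(nbuckets)]
        self._locks = [threading.Lock() for _ in range(nbuckets)]

    def _index(self, uid: int) -> int:
        return uid % len(self._buckets)

    def store(self, uid: int, gids: Sequence[int]) -> None:
        """Record the full group list for a uid (the daemon's job in the commit)."""
        i = self._index(uid)
        with self._locks[i]:  # only this hash list is locked
            self._buckets[i][uid] = list(gids)

    def groups_for_rpc(self, uid: int, rpc_gids: Sequence[int],
                       manage_gids: bool) -> List[int]:
        """Return the group list the server would use for this request."""
        if not manage_gids:
            return list(rpc_gids)
        i = self._index(uid)
        with self._locks[i]:
            cached = self._buckets[i].get(uid)
        # Assumption for this sketch: use the RPC list when nothing is cached.
        return list(cached) if cached is not None else list(rpc_gids)
```

Because lookups for different uids usually land on different hash lists, concurrent RPCs contend on a lock only when their uids share a bucket, which is the motivation the commit gives for dropping the single global mutex.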
* Wait up to 10 seconds for late-initializing network interfaces to arrive. | Ian Lepore | 2015-09-26 | 1 | -0/+9
  Reviewed by: rmacklem
  Notes: svn path=/head/; revision=288265
* Avoid closing unallocated socket in case socreate fails. | Alexander Kabaev | 2015-02-28 | 1 | -1/+1
  Found by: Brainy Code Scanner
  Reported by: Maxime Villard <max@M00nBSD.net>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=279405
* Remove the old NFS client and server from head, | Rick Macklem | 2014-12-23 | 2 | -410/+0
  which means that the NFSCLIENT and NFSSERVER kernel options will no longer work. This commit only removes the kernel components. Removal of unused code in the user utilities will be done later. This commit does not include an addition to UPDATING, but that will be committed in a few minutes.
  Discussed on: freebsd-fs
  Notes: svn path=/head/; revision=276096
* Avoid dynamic syscall overhead for statically compiled modules. | Mateusz Guzik | 2014-10-26 | 1 | -1/+1
  The kernel tracks syscall users so that modules can safely unregister them.
  But if the module is not unloadable or was compiled into the kernel, there is no need to do this.
  Achieve this by adding the SY_THR_STATIC_KLD macro, which expands to SY_THR_STATIC during kernel build and 0 otherwise.
  Reviewed by: kib (previous version)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=273707
* Follow up to r225617. In order to maximize the re-usability of kernel code | Davide Italiano | 2014-10-16 | 2 | -11/+11
  in userland rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv(). This fixes a namespace collision with libc symbols.
  Submitted by: kmacy
  Tested by: make universe
  Notes: svn path=/head/; revision=273174
* Fix/improve fhe_stats sysctl output. | Alexander Motin | 2014-06-14 | 1 | -19/+21
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=267479
* Introduce new per-thread lock to protect the list of requests. | Alexander Motin | 2014-06-08 | 1 | -7/+3
  This slightly simplifies the svc_run_internal() code: if we processed all the requests in a queue, then we know that a new one will not appear.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=267221
* - Remove rt_metrics_lite and simply put its members into rtentry. | Gleb Smirnoff | 2014-03-05 | 1 | -1/+1
  - Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This removes another cache trashing ++ from packet forwarding path.
  - Create zini/fini methods for the rtentry UMA zone. Via initialize mutex and counter in them.
  - Fix reporting of rmx_pksent to routing socket.
  - Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
  The change is mostly targeted for stable/10 merge. For head, rt_pksent is expected to just disappear.
  Discussed with: melifaro
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=262763
* Move most of NFS file handle affinity code out of the heavily congested | Alexander Motin | 2013-12-30 | 2 | -76/+76
  global RPC thread pool lock and protect it with its own set of locks.
  On synthetic benchmarks this improves peak NFS request rate by 40%.
  Notes: svn path=/head/; revision=260097
* Fix RPC server threads file handle affinity to work better with ZFS. | Alexander Motin | 2013-12-23 | 2 | -5/+2
  Instead of taking 8 specific bytes of the file handle to identify the file during RPC thread affinity handling, use a trivial hash of the full file handle. ZFS's struct zfid_short does not have a padding field after the length field; as a result, the originally picked 8 bytes are losing the lower 16 bits of the object ID, causing many false matches and unneeded requests affinity to the same thread. This fix substantially improves NFS server latency and scalability in the SPEC NFS benchmark by more flexible use of multiple NFS threads.
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=259765
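The false-match problem described in the commit above can be reproduced with a toy model. The Python sketch below rests on assumptions: the 12-byte layout (2-byte length, 6-byte little-endian object ID, 4-byte generation) and the offsets of the "8 specific bytes" are chosen for illustration, and the multiply-by-31 hash is only a stand-in for the "trivial hash" the commit mentions, whose exact form is not given in the message.

```python
import struct


def make_fid(object_id: int, gen: int) -> bytes:
    """Hypothetical unpadded file ID: 2-byte length, 6-byte object, 4-byte gen."""
    obj = object_id.to_bytes(6, "little")
    generation = gen.to_bytes(4, "little")
    return struct.pack("<H", len(obj) + len(generation)) + obj + generation


def key_fixed_bytes(fid: bytes) -> bytes:
    """Old-style key: 8 bytes taken after the length and an assumed 2-byte pad.
    With no pad in the layout, this skips the low 16 bits of the object ID."""
    return fid[4:12]


def key_full_hash(fid: bytes) -> int:
    """Stand-in 'trivial hash' over every byte of the file ID."""
    h = 0
    for b in fid:
        h = (h * 31 + b) & 0xFFFFFFFFFFFFFFFF
    return h
```

Two objects whose IDs differ only in their low 16 bits, such as 0x10001 and 0x10002, get identical fixed-byte keys in this model and would be steered to the same thread, while the full-handle hash tells them apart.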
* Remove several linear list traversals per request from RPC server code. | Alexander Motin | 2013-12-20 | 1 | -16/+1
  Do not insert active ports into the pool->sp_active list if they are successfully assigned to some thread. This makes that list include only ports that really require attention, and so traversal can be reduced to simply taking the first one.
  Remove an idle thread from the pool->sp_idlethreads list when assigning some work (port of requests) to it. That again makes it possible to replace list traversals with simply taking the first element.
  Notes: svn path=/head/; revision=259659
* The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare | Gleb Smirnoff | 2013-10-26 | 1 | -0/+1
  for this event by adding if_var.h to files that do need it. Also, include all headers that are currently pulled in through implicit pollution via if_var.h
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=257176
* Changes to allow using BOOTP_NFSROOT and mounting an nfs root filesystem | Ian Lepore | 2013-07-31 | 1 | -19/+54
  other than the one specified by the BOOTP server.
  This configures NFS using the BOOTP protocol while also respecting other root-path options such as setting vfs.root.mountfrom in the environment or using the RB_DFLTROOT boot option. It allows you to override the root path provided by the server, or to supply a root path when the server provides IP configuration but no root path info.
  This maintains the historical BOOTP_NFSROOT behavior of panicking on a failure to mount the root path provided by the server, unless you've provided an alternative via the ROOTDEVNAME kernel option or by setting vfs.root.mountfrom. The behavior of panicking when given no other options is preserved because it amounts to a bit of a retry loop that could eventually recover from a transient network or server problem.
  The user can now override the root path from loader(8) even if the kernel is compiled with BOOTP_NFSROOT. If vfs.root.mountfrom is set in the environment it is used unconditionally -- it always overrides the BOOTP info. If it begins with [old]nfs: then the BOOTP code uses it instead of the server-provided info. If it specifies some other filesystem then the bootp code will not panic like it used to and the code in vfs_mountroot.c will invoke the right filesystem to do the mount.
  If the kernel is compiled with the ROOTDEVNAME option, then that name is used by the BOOTP code if either
    * The server doesn't provide a pathname.
    * The boothowto flags include RB_DFLTROOT.
  The latter allows the user to compile in an alternate path in ROOTDEVNAME such as ufs:/dev/da0s1a and boot from that path by setting boot_dfltroot=1 in loader(8) or using the '-r' option in boot(8).
  The one thing not provided here is automatic failover from a server-provided path to a compiled-in one without the user manually requesting that. The code just isn't currently structured in a way that makes that possible without a lot of rewriting. I think the ability to set vfs.root.mountfrom and to use ROOTDEVNAME automatically when the server doesn't provide a name covers the most common needs.
  A set of patches submitted by Lars Eggert provided the part I couldn't figure out by myself when I tried to do this last year; many thanks.
  Reviewed by: rodrigc
  Notes: svn path=/head/; revision=253847
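The precedence rules in the commit message above can be summarised as a small decision function. This Python sketch is one reading of the prose, not the kernel logic: the function name, the argument names, and the returned labels are hypothetical, and checking vfs.root.mountfrom before ROOTDEVNAME follows the statement that it is "used unconditionally".

```python
from typing import Optional, Tuple


def choose_root(server_path: Optional[str],
                mountfrom: Optional[str] = None,
                rootdevname: Optional[str] = None,
                rb_dfltroot: bool = False) -> Tuple[str, Optional[str]]:
    """Return (source, path) for the root filesystem.

    source is one of "vfs.root.mountfrom", "ROOTDEVNAME", "server" or "none".
    """
    # vfs.root.mountfrom always overrides the BOOTP-supplied information.
    if mountfrom is not None:
        return ("vfs.root.mountfrom", mountfrom)
    # ROOTDEVNAME applies when the server gave no path or RB_DFLTROOT is set.
    if rootdevname is not None and (server_path is None or rb_dfltroot):
        return ("ROOTDEVNAME", rootdevname)
    if server_path is not None:
        return ("server", server_path)
    return ("none", None)


def handled_by_bootp_code(path: str) -> bool:
    """Paths beginning with nfs: or oldnfs: are used by the BOOTP code itself;
    anything else is left to the generic root-mount code."""
    return path.startswith(("nfs:", "oldnfs:"))
```

For instance, a kernel built with ROOTDEVNAME set to ufs:/dev/da0s1a still mounts the server-provided path by default in this model, and switches to the compiled-in path only when RB_DFLTROOT is set or the server supplies no path. There is no automatic failover from one to the other, matching the limitation the message describes.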
* Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to | Kenneth D. Merry | 2013-04-17 | 2 | -0/+668
  sys/nfs, since it is now shared by the two NFS servers.
  Suggested by: rmacklem
  Sponsored by: Spectra Logic
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=249596
* Use m_get() and m_getcl() instead of compat macros. | Gleb Smirnoff | 2013-03-15 | 1 | -2/+2
  Notes: svn path=/head/; revision=248318
* Functions m_getm2() and m_get2() have different order of arguments, | Gleb Smirnoff | 2013-03-12 | 1 | -1/+1
  and that can drive someone crazy. While m_get2() is young and not documented yet, change its order of arguments to match m_getm2().
  Sorry for the churn, but better now than later.
  Notes: svn path=/head/; revision=248207