aboutsummaryrefslogtreecommitdiff
path: root/sys/sys/unpcb.h
Commit message (Collapse)AuthorAgeFilesLines
* unix(4): Add SOL_LOCAL:LOCAL_CREDS_PERSISTENTConrad Meyer2020-11-031-2/+5
| | | | | | | | | | | | This option is intended to be semantically identical to Linux's SOL_SOCKET:SO_PASSCRED. For now, it is mutually exclusive with the pre-existing sockopt SOL_LOCAL:LOCAL_CREDS. Reviewed by: markj (penultimate version) Differential Revision: https://reviews.freebsd.org/D27011 Notes: svn path=/head/; revision=367287
* Simplify unix socket connection peer locking.Mark Johnston2020-09-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unp_pcb_owned_lock2() has some sharp edges and forces callers to deal with a bunch of cases. Simplify it: - Rename to unp_pcb_lock_peer(). - Return the connected peer instead of forcing callers to load it beforehand. - Handle self-connected sockets. - In unp_connectat(), just lock the accept socket directly. It should not be possible for the nascent socket to participate in any other lock orders. - Get rid of connect_internal(). It does not provide any useful checking anymore. - Block in unp_connectat() when a different thread is concurrently attempting to lock both sides of a connection. This provides simpler semantics for callers of unp_pcb_lock_peer(). - Make unp_connectat() return EISCONN if the socket is already connected. This fixes a race[1] when multiple threads attempt to connect() to different addresses using the same datagram socket. Upper layers will disconnect a connected datagram socket before calling the protocol connect's method, but there is no synchronization between this and protocol-layer code. Reported by: syzkaller [1] Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26299 Notes: svn path=/head/; revision=365764
* Update unix domain socket locking comments.Mark Johnston2020-09-151-18/+25
| | | | | | | | | | | | | | | - Define a locking key for unpcb members. - Rewrite some of the locking protocol description to make it less verbose and avoid referencing some subroutines which will be renamed. - Reorder includes. Reviewed by: glebius, kevans, kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26294 Notes: svn path=/head/; revision=365759
* Remove UNP_NASCENT, reverting r303855.Mark Johnston2020-03-201-1/+0
| | | | | | | | | | | | | unp_connectat() no longer holds the link lock across calls to sonewconn(), so the recursion described in r303855 can no longer occur. No functional change intended. Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=359170
* Implement cycle-detecting garbage collector for AF_UNIX socketsJason A. Harmening2020-01-251-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing AF_UNIX socket garbage collector destroys any socket which may potentially be in a cycle, as indicated by its file reference count being equal to its enqueue count. However, this can produce false positives for in-flight sockets which aren't part of a cycle but are part of one or more SCM_RIGHTS mssages and which have been closed on the sending side. If the garbage collector happens to run at exactly the wrong time, destruction of these sockets will render them unusable on the receiving side, such that no previously-written data may be read. This change rewrites the garbage collector to precisely detect cycles: 1. The existing check of msgcount==f_count is still used to determine whether the socket is potentially in a cycle. 2. The socket is now placed on a local "dead list", which is used to reduce iteration time (and therefore contention on the global unp_link_rwlock). 3. The first pass through the dead list removes each potentially-dead socket's outgoing references from the graph of potentially-dead sockets, using a gc-specific copy of the original reference count. 4. The second series of passes through the dead list removes from the list any socket whose remaining gc refcount is non-zero, as this indicates the socket is actually accessible outside of any possible cycle. Iteration is repeated until no further sockets are removed from the dead list. 5. Sockets remaining in the dead list are destroyed as before. PR: 227285 Submitted by: jan.kokemueller@gmail.com (prior version) Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23142 Notes: svn path=/head/; revision=357110
* Fix the alignment of struct xunpcb on systems with >64-bit pointers.Brooks Davis2019-11-071-1/+1
| | | | | | | | | | | Reviewed by: emaste Obtained from: CheriBSD MFC after: 3 days Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22268 Notes: svn path=/head/; revision=354420
* Fix LOCAL_PEERCRED with socketpair(2)Alan Somers2018-08-031-0/+9
| | | | | | | | | | | | | | Enable the LOCAL_PEERCRED socket option for unix domain stream sockets created with socketpair(2). Previously, it only worked with unix domain stream sockets created with socket(2)/listen(2)/connect(2)/accept(2). PR: 176419 Reported by: Nicholas Wilson <nicholas@nicholaswilson.me.uk> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16350 Notes: svn path=/head/; revision=337222
* Make struct xinpcb and friends word-size independent.Brooks Davis2018-07-051-7/+7
| | | | | | | | | | | | | | | | | | | | | Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique idenitifiers) with kvaddr_t (uint64_t). This makes the structs identical between 32-bit and 64-bit ABIs. On 64-bit bit systems, the ABI is maintained. On 32-bit systems, this is an ABI breaking change. The ABI of most of these structs was previously broken in r315662. This also imposes a small API change on userspace consumers who must handle kernel pointers becoming virtual addresses. PR: 228301 (exp-run by antoine) Reviewed by: jtl, kib, rwatson (various versions) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15386 Notes: svn path=/head/; revision=335979
* AF_UNIX: make unix socket locking finer grainedMatt Macy2018-05-171-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change moves to using a reference count across lock drop / reacquire to guarantee liveness. Currently sends on unix sockets contend heavily on read locking the list lock. unix1_processes in will-it-scale peaks at 6 processes and then declines. With this change I get a substantial improvement in number of operations per second with 96 processes: x before + after N Min Max Median Avg Stddev x 11 1688420 1696389 1693578 1692766.3 2971.1702 + 10 63417955 71030114 70662504 69576423 2374684.6 Difference at 95.0% confidence 6.78837e+07 +/- 1.49463e+06 4010.22% +/- 88.4246% (Student's t, pooled s = 1.63437e+06) And even for 2 processes shows a ~18% improvement. "Small" iron changes (1, 2, and 4 processes): x before1 + after1.2 +------------------------------------------------------------------------+ | + | | x + | | x + | | x + | | x ++ | | xx ++ | |x x xx ++ | | |__________________A_____M_____AM____|| +------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 10 1131648 1197750 1197138.5 1190369.3 20651.839 + 10 1203840 1205056 1204919 1204827.9 353.27404 Difference at 95.0% confidence 14458.6 +/- 13723 1.21463% +/- 1.16683% (Student's t, pooled s = 14605.2) x before2 + after2.2 +------------------------------------------------------------------------+ | +| | +| | +| | +| | +| | +| | x +| | x +| | x xx +| |x xxxx +| | |___AM_| A| +------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 10 1972843 2045866 2038186.5 2030443.8 21367.694 + 10 2400853 2402196 2401043.5 2401172.7 385.40024 Difference at 95.0% confidence 370729 +/- 14198.9 18.2585% +/- 0.826943% (Student's t, pooled s = 15111.7) x before4 + after4.2 N Min Max Median Avg Stddev x 10 3986994 3991728 3990137.5 3989985.2 1300.0164 + 10 4799990 4806664 4806116.5 4805194 1990.6625 Difference at 95.0% confidence 815209 +/- 1579.64 20.4314% +/- 0.0421713% (Student's t, pooled s = 1681.19) Tested by: pho Reported by: mjg Approved by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15430 Notes: svn path=/head/; revision=333744
* sys: further adoption of SPDX licensing ID tags.Pedro F. Giffuni2017-11-201-0/+2
| | | | | | | | | | | | | | | | | Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Notes: svn path=/head/; revision=326023
* Hide struct socket and struct unpcb from the userland.Gleb Smirnoff2017-10-021-17/+35
| | | | | | | | | | | | | | | | | | | Violators may define _WANT_SOCKET and _WANT_UNPCB respectively and are not guaranteed for stability of the structures. The violators list is the the usual one: libprocstat(3) and netstat(1) internally and lsof in ports. In struct xunpcb remove the inclusion of kernel structure and add a bunch of spare fields. The xsocket already has socket not included, but add there spares as well. Embed xsockbuf into xsocket. Sort declarations in sys/socketvar.h to separate kernel only from userland available ones. PR: 221820 (exp-run) Notes: svn path=/head/; revision=324227
* Remove write only flag UNP_HAVEPCCACHED.Gleb Smirnoff2017-06-021-6/+0
| | | | Notes: svn path=/head/; revision=319503
* Renumber copyright clause 4Warner Losh2017-02-281-1/+1
| | | | | | | | | | | | Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96 Notes: svn path=/head/; revision=314436
* Handle races with listening socket close when connecting a unix socket.Mark Johnston2016-08-081-5/+9
| | | | | | | | | | | | | | | | If the listening socket is closed while sonewconn() is executing, the nascent child socket is aborted, which results in recursion on the unp_link lock when the child's pru_detach method is invoked. Fix this by using a flag to mark such sockets, and skip a part of the socket's teardown during detach. Reported by: Raviprakash Darbha <rdarbha@juniper.net> Tested by: pho MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D7398 Notes: svn path=/head/; revision=303855
* Fix cleanup race between unp_dispose and unp_gcConrad Meyer2015-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | unp_dispose and unp_gc could race to teardown the same mbuf chains, which can lead to dereferencing freed filedesc pointers. This patch adds an IGNORE_RIGHTS flag on unpcbs marking the unpcb's RIGHTS as invalid/freed. The flag is protected by UNP_LIST_LOCK. To serialize against unp_gc, unp_dispose needs the socket object. Change the dom_dispose() KPI to take a socket object instead of an mbuf chain directly. PR: 194264 Differential Revision: https://reviews.freebsd.org/D3044 Reviewed by: mjg (earlier version) Approved by: markj (mentor) Obtained from: mjg MFC after: 1 month Sponsored by: EMC / Isilon Storage Division Notes: svn path=/head/; revision=285522
* Replace 4.4BSD Lite's unix domain socket backpressure hack with a cleanerAlan Somers2014-03-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mechanism, based on the new SB_STOP sockbuf flag. The old hack dynamically changed the sending sockbuf's high water mark whenever adding or removing data from the receiving sockbuf. It worked for stream sockets, but it never worked for SOCK_SEQPACKET sockets because of their atomic nature. If the sockbuf was partially full, it might return EMSGSIZE instead of blocking. The new solution is based on DragonFlyBSD's fix from commit 3a6117bbe0ed6a87605c1e43e12a1438d8844380 on 2008-05-27. It adds an SB_STOP flag to sockbufs. Whenever uipc_send surpasses the socket's size limit, it sets SB_STOP on the sending sockbuf. sbspace() will then return 0 for that sockbuf, causing sosend_generic and friends to block. uipc_rcvd will likewise clear SB_STOP. There are two fringe benefits: uipc_{send,rcvd} no longer need to call chgsbsize() on every send and receive because they don't change the sockbuf's high water mark. Also, uipc_sense no longer needs to acquire the UIPC linkage lock, because it's simpler to compute the st_blksizes. There is one drawback: since sbspace() will only ever return 0 or the maximum, sosend_generic will allow the sockbuf to exceed its nominal maximum size by at most one packet of size less than the max. I don't think that's a serious problem. In fact, I'm not even positive that FreeBSD guarantees a socket will always stay within its nominal size limit. sys/sys/sockbuf.h Add the SB_STOP flag and adjust sbspace() sys/sys/unpcb.h Delete the obsolete unp_cc and unp_mbcnt fields from struct unpcb. sys/kern/uipc_usrreq.c Adjust uipc_rcvd, uipc_send, and uipc_sense to use the SB_STOP backpressure mechanism. Removing obsolete unpcb fields from db_show_unpcb. tests/sys/kern/unix_seqpacket_test.c Clear expected failures from ATF. Obtained from: DragonFly BSD PR: kern/185812 Reviewed by: silence from freebsd-net@ and rwatson@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation Notes: svn path=/head/; revision=263116
* Partial revert of change 262914. I screwed up subversion syntax withAlan Somers2014-03-071-2/+2
| | | | | | | | | | | | | perforce syntax and committed some unrelated files. Only devd files should've been committed. Reported by: imp Pointy hat to: asomers MFC after: 3 weeks X-MFC-With: r262914 Notes: svn path=/head/; revision=262915
* sbin/devd/devd.8Alan Somers2014-03-071-2/+2
| | | | | | | | | | | | | sbin/devd/devd.cc Add a -q flag to devd that will suppress syslog logging at LOG_NOTICE or below. Requested by: ian@ and imp@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation Notes: svn path=/head/; revision=262914
* Remove explicit locking of struct file.Jeff Roberson2007-12-301-1/+8
| | | | | | | | | | | | | | | | - Introduce a finit() which is used to initailize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho Notes: svn path=/head/; revision=174988
* Revise locking strategy used for UNIX domain sockets in order to improveRobert Watson2007-02-261-0/+1
| | | | | | | | | | | | | | | | | | | | | concurrency: - Add per-unpcb mutexes protecting unpcb connection state, fields, etc. - Replace global UNP mutex with a global UNP rwlock, which will protect the UNIX domain socket connection topology, v_socket, and be acquired exclusively before acquiring more than per-unpcb at a time in order to avoid lock order issues. In performance measurements involving MySQL, this change has little or no overhead on UP (+/- 1%), but leads to a significant (5%-30%) improvement in multi-processor measurements using the sysbench and supersmack benchmarks. Much testing by: kris Approved by: re (kensmith) Notes: svn path=/head/; revision=167030
* - Close a race between enumerating UNIX domain socket pcb structures viaJohn Baldwin2007-01-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | sysctl and socket teardown by adding a reference count to the UNIX domain pcb object and fixing the sysctl that enumerates unpcbs to grab a reference on each unpcb while it builds the list to copy out to userland. - Close a race between UNIX domain pcb garbage collection (unp_gc()) and file descriptor teardown (fdrop()) by adding a new garbage collection flag FWAIT. unp_gc() sets FWAIT while it walks the message buffers in a UNIX domain socket looking for nested file descriptor references and clears the flag when it is finished. fdrop() checks to see if the flag is set on a file descriptor whose refcount just dropped to 0 and waits for unp_gc() to clear the flag before completely destroying the file descriptor. MFC after: 1 week Reviewed by: rwatson Submitted by: ups Hopefully makes the panics go away: mx1 Notes: svn path=/head/; revision=165810
* Add two new unpcb flags, UNP_BINDING and UNP_CONNECTING, which will beRobert Watson2006-07-231-0/+8
| | | | | | | | | | | used to mark UNIX domain sockets as being in the process of binding or connecting. Use these to prevent simultaneous bind or connect operations by multiple threads or processes on the same socket at the same time, which closes race conditions present in the UNIX domain socket implementation since inception. Notes: svn path=/head/; revision=160590
* Implement unix(4) socket options LOCAL_CREDS and LOCAL_CONNWAIT.Matthew N. Dodd2005-04-131-0/+2
| | | | | | | | | | | - Add unp_addsockcred() (for LOCAL_CREDS). - Add an argument to unp_connect2() to differentiate between PRU_CONNECT and PRU_CONNECT2. (for LOCAL_CONNWAIT) Obtained from: NetBSD (with some changes) Notes: svn path=/head/; revision=144978
* /* -> /*- for license, minor formatting changesWarner Losh2005-01-071-1/+1
| | | | Notes: svn path=/head/; revision=139825
* Remove advertising clause from University of California Regent's license,Warner Losh2004-04-071-4/+0
| | | | | | | | | per letter dated July 22, 1999. Approved by: core Notes: svn path=/head/; revision=127976
* Remove vestiges of no longer needed unp_rvnode field.Jeffrey Hsu2003-02-061-1/+0
| | | | | | | Approved by: phk (who originally added it in rev 1.8 of unpcb.h) Notes: svn path=/head/; revision=110430
* Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/Jens Schweikhardt2002-12-301-1/+1
| | | | | | | Add FreeBSD Id tag where missing. Notes: svn path=/head/; revision=108470
* More s/file system/filesystem/gTom Rhodes2002-05-161-1/+1
| | | | Notes: svn path=/head/; revision=96755
* style(9) the structure definitions.David E. O'Brien2001-09-051-3/+3
| | | | Notes: svn path=/head/; revision=83045
* Implement a LOCAL_PEERCRED socket option which returns aDima Dorfman2001-08-171-0/+19
| | | | | | | | | | | | | | `struct xucred` with the credentials of the connected peer. Obviously this only works (and makes sense) on SOCK_STREAM sockets. This works for both the connect(2) and listen(2) callers. There is precise documentation of the semantics in unix(4). Reviewed by: dwmalone (eyeballed) Notes: svn path=/head/; revision=81857
* Back out the previous change to the queue(3) interface.Jake Burkholder2000-05-261-3/+3
| | | | | | | | | It was not discussed and should probably not happen. Requested by: msmith and others Notes: svn path=/head/; revision=60938
* Change the way that the queue(3) structures are declared; don't assume thatJake Burkholder2000-05-231-3/+3
| | | | | | | | | | | the type argument to *_HEAD and *_ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd Notes: svn path=/head/; revision=60833
* $Id$ -> $FreeBSD$Peter Wemm1999-08-281-1/+1
| | | | Notes: svn path=/head/; revision=50477
* This Implements the mumbled about "Jail" feature.Poul-Henning Kamp1999-04-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no privisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/ Notes: svn path=/head/; revision=46155
* Convert socket structures to be type-stable and add a version number.Garrett Wollman1998-05-151-4/+39
| | | | | | | | | | | | | | | | | | | | | | Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs. Convert PF_LOCAL PCB structures to be type-stable and add a version number. Define an external format for infomation about socket structures and use it in several places. Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines. It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow). Notes: svn path=/head/; revision=36079
* Fix all areas of the system (or at least all those in LINT) to avoid storingGarrett Wollman1997-08-161-2/+2
| | | | | | | | | | | socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family. Notes: svn path=/head/; revision=28270
* Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are notPeter Wemm1997-02-221-1/+1
| | | | | | | ready for it yet. Notes: svn path=/head/; revision=22975
* Make the long-awaited change from $Id$ to $FreeBSD$Jordan K. Hubbard1997-01-141-1/+1
| | | | | | | | | | | This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise. Notes: svn path=/head/; revision=21673
* Made them all idempotent.Paul Richards1994-08-211-1/+6
| | | | | | | | Reviewed by: Submitted by: Notes: svn path=/head/; revision=2165
* Added $Id$David Greenman1994-08-021-0/+1
| | | | Notes: svn path=/head/; revision=1817
* BSD 4.4 Lite Kernel SourcesRodney W. Grimes1994-05-241-0/+73
Notes: svn path=/head/; revision=1541