path: root/sys/net
Commit message | Author | Age | Files | Lines
* Remove duplicate #include <net/route.h> from the middle of the file.Bjoern A. Zeeb2009-06-231-1/+0
| | | | Notes: svn path=/head/; revision=194700
* V_irtualize flowtable state.Marko Zec2009-06-222-100/+192
| | | | | | | | | | | | | | | This change should make options VIMAGE kernel builds usable again, to some extent at least. Note that the size of struct vnet_inet has changed, though in accordance with one-bump-per-day policy we didn't update the __FreeBSD_version number, given that it has already been touched by r194640 a few hours ago. Reviewed by: bz Approved by: julian (mentor) Notes: svn path=/head/; revision=194660
* Updates after r194640:Bjoern A. Zeeb2009-06-221-4/+0
| | | | | | | | | - shrink size guards for vnet_net. vnet_rtable does not need size guards as it is self-contained. - remove a bunch of defines from vnet.h no longer valid. Notes: svn path=/head/; revision=194641
* Move virtualization of routing related variables into their ownBjoern A. Zeeb2009-06-223-16/+47
| | | | | | | | | | | | | | Vimage module, which had been there already but now is stateful. All variables are now file local; so this further limits the global spreading of routing related things throughout the kernel. Add a missing function local variable in case of MPATHing. Reviewed by: zec Notes: svn path=/head/; revision=194640
* Collect all VIMAGE_GLOBALS variables in one place.Bjoern A. Zeeb2009-06-222-9/+4
| | | | | | | | | | | | | | | No longer export rt_tables as all lookups go through rt_tables_get_rnh(). We cannot make rt_tables (and rtstat, rttrash[1]) static as netstat -r (-rs[1]) would stop working on a stripped VIMAGE_GLOBALS kernel. Reviewed by: zec Presumably broken by: phk 13.5y ago in r12820 [1] Notes: svn path=/head/; revision=194629
* Add a new function, ifa_ifwithaddr_check(), which rather than returningRobert Watson2009-06-223-3/+18
| | | | | | | | | | | | | a pointer to an ifaddr matching the passed socket address, returns a boolean indicating whether one was present. In the (near) future, ifa_ifwithaddr() will return a referenced ifaddr rather than a raw ifaddr pointer, and the new wrapper will allow callers that care only about the boolean condition to avoid having to free that reference. MFC after: 3 weeks Notes: svn path=/head/; revision=194622
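The wrapper pattern described above can be sketched in userland C. This is only an illustration under stated assumptions: `demo_addr`, `demo_lookup()` and `demo_lookup_check()` are hypothetical stand-ins, not the kernel's `struct ifaddr`, `ifa_ifwithaddr()` or `ifa_ifwithaddr_check()`, and addresses are modelled as plain strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an interface address entry. */
struct demo_addr {
	const char *addr;
};

static struct demo_addr demo_addrs[] = {
	{ "192.0.2.1" },
	{ "198.51.100.7" },
};

/* Pointer-returning lookup, similar in shape to a find-by-address call. */
static struct demo_addr *
demo_lookup(const char *addr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_addrs) / sizeof(demo_addrs[0]); i++)
		if (strcmp(demo_addrs[i].addr, addr) == 0)
			return (&demo_addrs[i]);
	return (NULL);
}

/*
 * Boolean wrapper: callers never see the pointer, so if the lookup
 * later starts handing out references, only this one function has to
 * learn how to release them.
 */
static bool
demo_lookup_check(const char *addr)
{
	return (demo_lookup(addr) != NULL);
}
```

Callers that only ask "is this address configured?" use `demo_lookup_check()` and stay unchanged when the lookup's ownership rules change.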
* After the update to fxp(4) in r194573 we should no longer needBjoern A. Zeeb2009-06-221-1/+0
| | | | | | | | | | | this DELAY(100) hack introduced in r56938. Thanks to: yongari MFC after: 6 weeks X-MFC note: not before the fxp(4) changes Notes: svn path=/head/; revision=194620
* Clean up common ifaddr management:Robert Watson2009-06-214-35/+47
| | | | | | | | | | | | | | | | | - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex, use refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks Notes: svn path=/head/; revision=194602
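A minimal userland sketch of the init/ref/free split listed above, assuming C11 atomics in place of the kernel's refcount(9). The names `demo_obj`, `demo_obj_init()`, `demo_obj_ref()` and `demo_obj_free()` are hypothetical, and `demo_freed` exists only so the release can be observed.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct demo_obj {
	atomic_uint refcnt;
};

static int demo_freed;	/* illustration only: counts final releases */

/* Single place that sets up the reference count (one initial reference). */
static void
demo_obj_init(struct demo_obj *o)
{
	atomic_init(&o->refcnt, 1);
}

/* Take an additional reference; no mutex is needed. */
static void
demo_obj_ref(struct demo_obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Drop a reference and release the object when the last one goes away. */
static void
demo_obj_free(struct demo_obj *o)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		demo_freed++;
	}
}
```

Turning the macros into functions gives one place to change the counting scheme later, which is the point of the cleanup.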
* Switch cmd argument to u_long. This matches what if_ethersubr.c does andRoman Divacky2009-06-219-9/+9
| | | | | | | | | | allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor) Notes: svn path=/head/; revision=194581
* In non-debugging mode make this define (void)0 instead of nothing. ThisRoman Divacky2009-06-211-1/+1
| | | | | | | | | | | | | helps to catch bugs like the below with clang. if (cond); <--- note the trailing ; something(); Approved by: ed (mentor) Discussed on: current@ Notes: svn path=/head/; revision=194577
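The idea can be sketched as follows. `DEMO_TRACE` and `DEMO_DEBUG` are hypothetical names, not the define touched by this commit. The assumption is that a macro expanding to nothing turns a legitimate `if (cond) MACRO();` into `if (cond);`, so an empty-body warning cannot be enabled without false positives; expanding to `((void)0)` leaves a real expression statement behind.

```c
#include <stdio.h>

#ifdef DEMO_DEBUG
#define DEMO_TRACE(msg)	fprintf(stderr, "%s\n", (msg))
#else
/* Expands to an expression statement rather than to nothing. */
#define DEMO_TRACE(msg)	((void)0)
#endif

/* Clamp negative values to zero, tracing when clamping happens. */
static int
demo_clamp(int v)
{
	/*
	 * In non-debug builds this becomes "if (v < 0) ((void)0);",
	 * which is not an empty statement, so a compiler warning about
	 * empty if-bodies stays quiet here and only fires on real typos.
	 */
	if (v < 0)
		DEMO_TRACE("clamping negative value");
	return (v < 0 ? 0 : v);
}
```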
* add helper function for flushing software queuesKip Macy2009-06-191-1/+15
| | | | Notes: svn path=/head/; revision=194518
* Implement the -z (zero counters) option for the various bpf counters.Christian S.J. Peron2009-06-191-1/+45
| | | | | | | | | | | Add necessary changes to the kernel for this (basically introduce a bpf_zero_counters() function). As well, update the man page. MFC after: 1 month Discussed with: rwatson Notes: svn path=/head/; revision=194512
* Add explicit includes for jail.h to the files that need them andBjoern A. Zeeb2009-06-172-0/+2
| | | | | | | remove the "hidden" one from vimage.h. Notes: svn path=/head/; revision=194368
* Add the explicit include of vimage.h to another five .c files stillBjoern A. Zeeb2009-06-172-0/+2
| | | | | | | | | | missing it. Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning. Notes: svn path=/head/; revision=194357
* r193336 moved ifq_detach to if_free which broke if_alloc followedSam Leffler2009-06-152-7/+6
| | | | | | | | | | | by if_free (w/o doing if_attach); move ifq_attach to if_alloc and rename ifq_attach/detach to ifq_init/ifq_delete to better identify their purpose Reviewed by: jhb, kmacy Notes: svn path=/head/; revision=194259
* Get vnets from creds instead of threads where they're available, and fromJamie Gritton2009-06-151-1/+1
| | | | | | | | | | passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor) Notes: svn path=/head/; revision=194252
* Manage vnets via the jail system. If a jail is given the booleanJamie Gritton2009-06-152-2/+18
| | | | | | | | | | | | | | parameter "vnet" when it is created, a new vnet instance will be created along with the jail. Network interfaces can be moved between prisons with an ioctl similar to the one that moves them between vimages. For now vnets will co-exist under both jails and vimages, but soon struct vimage will be going away. Reviewed by: zec, julian Approved by: bz (mentor) Notes: svn path=/head/; revision=194251
* Add an optional callback function that will be invoked when a per-CPUBjoern A. Zeeb2009-06-142-0/+6
| | | | | | | | | | | | | queue was drained. It will never fire for a directly dispatched packet. You will most likely never want to use this for any ordinary netisr usage and you will never blame netisr in case you try to use it and it does not work as expected. Reviewed by: rwatson Notes: svn path=/head/; revision=194201
* Garbage collect an extern for a non-existent variable.Bjoern A. Zeeb2009-06-121-4/+2
| | | | | | | | | While here let the comment end in a '.' and mark the #endif of _KERNEL. Reviewed by: rwatson (as part of a larger patch) Notes: svn path=/head/; revision=194077
* Move the kernel option FLOWTABLE checking from the header file to theBjoern A. Zeeb2009-06-121-17/+0
| | | | | | | | | | | | | | actual implementation. Remove the accessor functions for the compiled out case, just returning "unavail" values. Remove the kernel conditional from the header file as it is no longer needed, only leaving the externs. Hide the improperly virtualized SYSCTL/TUNABLE for the flowtable size under the kernel option as well. Reviewed by: rwatson Notes: svn path=/head/; revision=194076
* Added support for NAT-Traversal (RFC 3948) in IPsec stack.VANHULLEBUS Yvan2009-06-121-1/+36
| | | | | | | | | | | | | | | | | Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele (julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense team, and all people who used / tried the NAT-T patch for years and reported bugs, patches, etc... X-MFC: never Reviewed by: bz Approved by: gnn(mentor) Obtained from: NETASQ Notes: svn path=/head/; revision=194062
* carp(4) allows people to share a set of IP addresses and can onlyBjoern A. Zeeb2009-06-113-1/+17
| | | | | | | | | | | | | | use IPv4/v6 for inter-node communication (according to my reading). Properly wrap the carp callouts in INET || INET6 and reflect this in sys/conf/files as well. While in theory this should be ok, it might be a bit optimistic to think that carp could build with inet6 only[1]. Discussed with: mlaier [1] Notes: svn path=/head/; revision=193983
* Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. UseKonstantin Belousov2009-06-104-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland Notes: svn path=/head/; revision=193951
* SCTP needs either IPv4 or IPv6 as lower layer[1].Bjoern A. Zeeb2009-06-101-0/+4
| | | | | | | | | | | So properly hide the already #ifdef SCTP code with #if defined(INET) || defined(INET6) as well to get us closer to a non-INET/INET6 kernel. Discussed with: tuexen [1] Notes: svn path=/head/; revision=193926
* ip_gif_ttl/GIF_TTL are only used by the inet part in in_gif.c,Bjoern A. Zeeb2009-06-101-0/+2
| | | | | | | so put the initialization under #ifdef INET. Notes: svn path=/head/; revision=193913
* The llentry *lle is only used in cases of INET or INET6.Bjoern A. Zeeb2009-06-104-4/+13
| | | | | | | | | | Put the variable declaration under proper #ifdefs. In case variables are only needed for one of the two AFs move them into proper scope. Notes: svn path=/head/; revision=193891
* revert to opt-in flowtableKip Macy2009-06-092-4/+3
| | | | Notes: svn path=/head/; revision=193863
* Close long-standing race with net.inet.ip.fw.one_pass = 0:Oleg Bulyzhin2009-06-092-16/+27
| | | | | | | | | | | | | | | | | If a packet leaves ipfw for another kernel subsystem (dummynet, netgraph, etc.) it carries a pointer to the matching ipfw rule. If this packet is then reinjected back into ipfw, ruleset processing starts from that rule. If the rule was deleted in the meantime, the race made a panic possible (as well as other odd effects like parsing rules in the 'reap list'). P.S. this commit changes the ABI, so userland ipfw-related binaries should be recompiled. MFC after: 1 month Tested by: Mikolaj Golub Notes: svn path=/head/; revision=193859
* make flowtable opt-outKip Macy2009-06-092-2/+3
| | | | Notes: svn path=/head/; revision=193856
* move jenkins hash to its own header in libkernKip Macy2009-06-091-145/+2
| | | | Notes: svn path=/head/; revision=193854
* - add drbr routines for accessing #qentries and conditionally dequeueingKip Macy2009-06-091-3/+34
| | | | | | | - track bytes enqueued in buf_ring Notes: svn path=/head/; revision=193848
* Remove one INET dependency by calling the generalBjoern A. Zeeb2009-06-091-1/+1
| | | | | | | | | AF agnostic version for doing the routing lookup. Reviewed by: kmacy Notes: svn path=/head/; revision=193820
* Style fix.Hiroki Sato2009-06-091-7/+7
| | | | | | | Submitted by: bz Notes: svn path=/head/; revision=193815
* - Fix sanity check of GIFSOPTS ioctl.Hiroki Sato2009-06-092-6/+6
| | | | | | | | | - Rename option mask s/GIF_FULLOPTS/GIF_OPTMASK/ Spotted by: Eygene Ryabinkin, delphij Notes: svn path=/head/; revision=193796
* Remove two unneeded, hidden includes.Bjoern A. Zeeb2009-06-081-2/+0
| | | | Notes: svn path=/head/; revision=193748
* After r193232 rt_tables in vnet.h are no longer indirectly dependent onBjoern A. Zeeb2009-06-0811-16/+0
| | | | | | | | | | | | the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c. Notes: svn path=/head/; revision=193744
* Introduce an infrastructure for dismantling vnet instances.Marko Zec2009-06-086-13/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many such issues are already known and will be incrementally fixed over the next weeks in smaller commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor) Notes: svn path=/head/; revision=193731
* Fix and add a workaround on an issue of EtherIP packet with reversedHiroki Sato2009-06-072-11/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | version field sent via gif(4)+if_bridge(4). The EtherIP implementation found on FreeBSD 6.1, 6.2, 6.3, 7.0, 7.1, and 7.2 had an interoperability issue because it sent the incorrect EtherIP packets and discarded the correct ones. This change introduces the following two flags to gif(4): accept_rev_ethip_ver: accepts both correct EtherIP packets and ones with reversed version field, if enabled. If disabled, the gif accepts the correct packets only. This flag is enabled by default. send_rev_ethip_ver: sends EtherIP packets with reversed version field intentionally, if enabled. If disabled, the gif sends the correct packets only. This flag is disabled by default. These flags are stored in struct gif_softc and can be set by ifconfig(8) on per-interface basis. Note that this is an incompatible change of EtherIP with the older FreeBSD releases. If you need to interoperate older FreeBSD boxes and new versions after this commit, setting "send_rev_ethip_ver" is needed. Reviewed by: thompsa and rwatson Spotted by: Shunsuke SHINOMIYA PR: kern/125003 MFC after: 2 weeks Notes: svn path=/head/; revision=193664
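The accept/send behaviour of the two flags can be sketched as below. This is an illustration, not the gif(4) code: the function names are hypothetical, and it assumes that "reversed version field" means the 4-bit version value 3 sits in the low nibble of the first header byte instead of the high nibble where RFC 3378 places it.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_EIP_VERSION	0x3	/* EtherIP protocol version (RFC 3378) */

/* First header byte to transmit, depending on a send_rev_ethip_ver-style flag. */
static uint8_t
demo_eip_first_byte(bool send_rev)
{
	if (send_rev)
		return (DEMO_EIP_VERSION);		/* 0x03: reversed layout */
	return ((uint8_t)(DEMO_EIP_VERSION << 4));	/* 0x30: correct layout */
}

/* Decide whether a received first byte is acceptable. */
static bool
demo_eip_accept(uint8_t first, bool accept_rev)
{
	if (first == (DEMO_EIP_VERSION << 4))
		return (true);		/* correct packets always pass */
	if (accept_rev && first == DEMO_EIP_VERSION)
		return (true);		/* reversed packets pass only if enabled */
	return (false);
}
```

With the defaults described above (accept enabled, send disabled), such a node emits only correct packets while still receiving traffic from older peers.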
* Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERICRobert Watson2009-06-0511-11/+0
| | | | | | | | | | | and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd Notes: svn path=/head/; revision=193511
* More cleanup in preparation of ipfw relocation (no actual code change):Luigi Rizzo2009-06-052-5/+5
| | | | | | | | | | | | | | | | | | | + move ipfw and dummynet hooks declarations to raw_ip.c (definitions in ip_var.h) same as for most other global variables. This removes some dependencies from ip_input.c; + remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly; + remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly; + move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it; To be merged together with rev 193497 MFC after: 5 days Notes: svn path=/head/; revision=193502
* move ifq_detach from if_detach to if_free; this permits callers toSam Leffler2009-06-021-3/+1
| | | | | | | | | | reference if_snd in the period between detach+free which helps simplify detach code Reviewed by: jhb, rwatson Notes: svn path=/head/; revision=193336
* Revert a recent netisr2 change: when billing packets to the currentRobert Watson2009-06-011-2/+0
| | | | | | | | | | CPU, don't lock the workstream, as its mutexes may not have been initialized if there are fewer workstreams than CPUs. Run into by: hps, ps Notes: svn path=/head/; revision=193243
* Convert the two dimensional array to be malloced and introduceBjoern A. Zeeb2009-06-015-31/+62
| | | | | | | | | | | | | | | | | | | an accessor function to get the correct rnh pointer back. Update netstat to get the correct pointer using kvm_read() as well. This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now. Reviewed by: julian, rwatson, zec X-MFC: not possible Notes: svn path=/head/; revision=193232
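A userland sketch of the "malloced table plus accessor" approach, under stated assumptions: `demo_rnh`, `DEMO_AF_MAX` and the `demo_tables_*` functions are hypothetical, and the real rt_tables layout may differ. The point is that the table size is chosen at run time and consumers go through an accessor instead of indexing a fixed-size two-dimensional array.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_AF_MAX	4	/* hypothetical highest address family index */

/* Hypothetical stand-in for a radix node head. */
struct demo_rnh {
	int unused;
};

static struct demo_rnh **demo_tables;	/* flat [fib][af] array */
static unsigned demo_numfibs;		/* chosen at run time */

/* Allocate numfibs * (DEMO_AF_MAX + 1) zeroed slots. */
static int
demo_tables_init(unsigned numfibs)
{
	demo_tables = calloc((size_t)numfibs * (DEMO_AF_MAX + 1),
	    sizeof(*demo_tables));
	if (demo_tables == NULL)
		return (-1);
	demo_numfibs = numfibs;
	return (0);
}

/* Bounds-checked accessor returning the address of the slot, or NULL. */
static struct demo_rnh **
demo_tables_get_rnh_ptr(unsigned fib, int af)
{
	if (demo_tables == NULL || fib >= demo_numfibs ||
	    af < 0 || af > DEMO_AF_MAX)
		return (NULL);
	return (&demo_tables[(size_t)fib * (DEMO_AF_MAX + 1) + af]);
}

/* Convenience accessor returning the table head itself, or NULL. */
static struct demo_rnh *
demo_tables_get_rnh(unsigned fib, int af)
{
	struct demo_rnh **slot = demo_tables_get_rnh_ptr(fib, af);

	return (slot != NULL ? *slot : NULL);
}
```

Because nothing outside these functions knows the array dimensions, the number of FIBs can change without altering the layout that consumers see.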
* Garbage collect NETISR_POLL and NETISR_POLLMORE, which are no longerRobert Watson2009-06-012-19/+28
| | | | | | | | | | | | | | | required for options DEVICE_POLLING. De-fragment the NETISR_ constant space and lower NETISR_MAXPROT from 32 to 16 -- when sizing queue arrays using this compile-time constant, significant amounts of memory are saved. Warn on the console when tunable values for netisr are automatically adjusted during boot due to exceeding limits, invalid values, or as a result of DEVICE_POLLING. Notes: svn path=/head/; revision=193230
* Reimplement the netisr framework in order to support parallel netisrRobert Watson2009-06-013-201/+1157
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | threads: - Support up to one netisr thread per CPU, each processing its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifiers (m2flow). Falls back on NETISR_POLICY_SOURCE if no flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256.
- All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz Notes: svn path=/head/; revision=193219
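The three ordering policies and the flow-to-source fallback can be sketched as a placement function. Everything here is hypothetical: the `demo_` names are not the netisr API, and mapping an identifier to a workstream with a simple modulo is an assumption made for illustration, not something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the three ordering policies. */
enum demo_policy {
	DEMO_POLICY_SOURCE,
	DEMO_POLICY_FLOW,
	DEMO_POLICY_CPU
};

/* Hypothetical per-packet metadata. */
struct demo_pkt {
	bool		has_flowid;	/* driver or protocol supplied a flow ID */
	uint32_t	flowid;
	uint32_t	source_id;	/* e.g. an interface index */
	unsigned	cpu_hint;	/* result of an m2cpuid-style hook */
};

/* Pick a workstream index in [0, nws); nws must be non-zero. */
static unsigned
demo_select_workstream(enum demo_policy policy, const struct demo_pkt *p,
    unsigned nws)
{
	switch (policy) {
	case DEMO_POLICY_CPU:
		return (p->cpu_hint % nws);
	case DEMO_POLICY_FLOW:
		if (p->has_flowid)
			return (p->flowid % nws);
		/* FALLTHROUGH: no flow ID, so order by source instead. */
	case DEMO_POLICY_SOURCE:
	default:
		return (p->source_id % nws);
	}
}
```

Packets that share a flow ID (or, failing that, a source) always land on the same workstream, which is what preserves ordering within that flow or source.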
* Introduce an interim userland-kernel API for creating vnets andMarko Zec2009-05-311-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | assigning ifnets from one vnet to another. Deletion of vnets is not yet supported. The interface is implemented as an ioctl extension so that no syscalls had to be introduced. This should be acceptable given that the new interface will be used for a short / interim period only, until the new jail management framework gains the capability of managing vnets. This method for managing vimages / vnets has been in use for the past 7 years without any observable issues. The userland tool to be used in conjunction with the interim API can be found in p4: //depot/projects/vimage-commit2/src/usr.sbin/vimage/... and will most probably never get committed to svn. While here, bump copyright notices in kern_vimage.c and vimage.h to cover work done in year 2009. Approved by: julian (mentor) Discussed with: bz, rwatson Notes: svn path=/head/; revision=193166
* When user_frac in the polling subsystem is low it is going to busy theAttilio Rao2009-05-302-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | CPU for a longer period than necessary. Additionally, interfaces are kept polled (in the tick) even if no more packets are available. In order to avoid such situations a new generic mechanism can be implemented in a proactive way, keeping track of the time spent on any packet and fragmenting the time for any tick, stopping the processing as soon as possible. In order to implement such a mechanism, the polling handler needs to change, returning the number of packets processed. While the intended logic is not part of this patch, the polling KPI is broken by this commit, adding an int return value and the new flag IFCAP_POLLING_NOCOUNT (which will signal that the return value is meaningless for the installed handler and checking should be skipped). Bump __FreeBSD_version in order to signal such situation. Reviewed by: emaste Sponsored by: Sandvine Incorporated Notes: svn path=/head/; revision=193096
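The changed handler contract can be sketched as follows. This is a userland illustration with hypothetical names: `DEMO_POLLING_NOCOUNT` only mimics the role described for IFCAP_POLLING_NOCOUNT, and the handler signature is an assumption rather than the actual polling KPI.

```c
#include <stddef.h>

#define DEMO_POLLING_NOCOUNT	0x1	/* return value carries no meaning */

/* A handler processes up to budget packets and returns how many it did. */
typedef int demo_poll_handler_t(void *arg, int budget);

struct demo_poller {
	demo_poll_handler_t	*handler;
	void			*arg;
	unsigned		 flags;
};

/* Example handler: arg points at a count of pending packets. */
static int
demo_drain(void *arg, int budget)
{
	int *pending = arg;
	int done = (*pending < budget) ? *pending : budget;

	*pending -= done;
	return (done);
}

/* Example legacy handler whose return value must be ignored. */
static int
demo_legacy(void *arg, int budget)
{
	(void)arg;
	(void)budget;
	return (-1);
}

/* Run every handler once and sum only the counts that are meaningful. */
static int
demo_poll_all(struct demo_poller *p, size_t n, int budget)
{
	size_t i;
	int done, total = 0;

	for (i = 0; i < n; i++) {
		done = p[i].handler(p[i].arg, budget);
		if ((p[i].flags & DEMO_POLLING_NOCOUNT) == 0)
			total += done;
	}
	return (total);
}
```

A scheduler built on this could stop polling early once the returned total drops to zero, which is the kind of follow-up logic the commit message says is not yet part of the patch.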
* Make the rmlock(9) interface a bit more like the rwlock(9) interface:Robert Watson2009-05-291-1/+1
| | | | | | | | | | | | | | | | | - Add rm_init_flags() and accept extended options only for that variation. - Add a flags space specifically for rm_init_flags(), rather than borrowing the lock_init() flag space. - Define flag RM_RECURSE to use instead of LO_RECURSABLE. - Define flag RM_NOWITNESS to allow an rmlock to be exempt from WITNESS checking; this wasn't possible previously as rm_init() always passed LO_WITNESS when initializing an rmlock's struct lock. - Add RM_SYSINIT_FLAGS(). - Rename embedded mutex in rmlocks to make it more obvious what it is. - Update consumers. - Update man page. Notes: svn path=/head/; revision=193030
* Add hierarchical jails. A jail may further virtualize its environmentJamie Gritton2009-05-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor) Notes: svn path=/head/; revision=192895
* rev bpf attach/detach event api to include the dltSam Leffler2009-05-251-2/+2
| | | | Notes: svn path=/head/; revision=192763