path: root/sys/netinet/igmp.c
Commit message | Author | Age | Files | Lines
* igmp: Avoid an out-of-bounds access when zeroing counters | Mark Johnston | 2021-05-05 | 1 | -1/+1
  When verifying, byte-by-byte, that the user-supplied counters are zero-filled, sysctl_igmp_stat() would check for zero before checking the loop bound. Perform the checks in the correct order.
  Reported by: KASAN
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
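The ordering issue described in this commit can be illustrated with a minimal userland sketch. The helper name `is_zero_filled` is hypothetical and this is not the code in sysctl_igmp_stat(); it only shows why the bound has to be tested before the byte is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel's code): returns true when every
 * byte in buf[0..len) is zero.
 */
static bool
is_zero_filled(const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);
	/*
	 * Test the bound first.  && short-circuits, so buf[i] is never
	 * read once i == len.  Writing "buf[i] == 0 && i < len" instead
	 * would read one byte past the end of an all-zero buffer.
	 */
	for (i = 0; i < len && buf[i] == 0; i++)
		continue;
	return (i == len);
}
```

With the operands swapped, the loop would still return the right answer in most runs, which is why a tool such as KASAN is what typically catches this class of bug.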
* igmp: Avoid leaking mbuf when source validation fails | Mark Johnston | 2021-01-08 | 1 | -0/+1
  PR: 252504
  Submitted by: Panagiotis Tsolakos <panagiotis.tsolakos@gmail.com>
  MFC after: 3 days
* igmp: convert igmpstat to use PCPU counters | Mitchell Horne | 2020-11-08 | 1 | -19/+24
  Currently there is no locking done to protect this structure. It is likely okay due to the low-volume nature of IGMP, but allows for the possibility of underflow. This appears to be one of the only holdouts of the conversion to counter(9) which was done for most protocol stat structures around 2013.

  This also updates the visibility of this stats structure so that it can be consumed from elsewhere in the kernel, consistent with the vast majority of VNET_PCPUSTAT structures.

  Reviewed by: kp
  Sponsored by: NetApp, Inc.
  Sponsored by: Klara, Inc.
  Differential Revision: https://reviews.freebsd.org/D27023
  Notes: svn path=/head/; revision=367493
* net: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -2/+0
  Notes: svn path=/head/; revision=365071
* Fix an issue of net.inet.igmp.stats handler. | Hiroki Sato | 2020-03-07 | 1 | -2/+57
  The header of (struct igmpstat) could be cleared by sysctl(3). This can be reproduced by "netstat -s -z -p igmp".
  PR: 244584
  MFC after: 1 week
  Notes: svn path=/head/; revision=358730
* Fix kernel panic while trying to read multicast stream. | Hans Petter Selasky | 2020-02-17 | 1 | -0/+1
  When VIMAGE is enabled make sure the "m_pkthdr.rcvif" pointer is set for all mbufs being input by the IGMP/MLD6 code. Else there will be a NULL-pointer dereference in the netisr code when trying to set the VNET based on the incoming mbuf. Add an assert to catch this when queueing mbufs on a netisr to make debugging of similar cases easier.
  Found by: Vladislav V. Prodan
  PR: 244002
  Reviewed by: bz@
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=358013
* White space cleanup -- remove trailing tabs or spaces | Randall Stewart | 2020-02-12 | 1 | -2/+2
  from any line.
  Sponsored by: Netflix Inc.
  Notes: svn path=/head/; revision=357818
* Take the ifnet's address lock in igmp_v3_cancel_link_timers(). | Mark Johnston | 2020-01-03 | 1 | -2/+4
  inm_rele_locked() may remove the multicast address associated with inm.
  Reported by: syzbot+871c5d1fd5fac6c28f52@syzkaller.appspotmail.com
  Reviewed by: hselasky
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D23009
  Notes: svn path=/head/; revision=356321
* Quickly fix up r353683: enter the epoch before calling into netisr_dispatch(). | Gleb Smirnoff | 2019-10-17 | 1 | -0/+3
  Notes: svn path=/head/; revision=353687
* igmp_v1v2_queue_report() doesn't require epoch. | Gleb Smirnoff | 2019-10-17 | 1 | -1/+0
  Notes: svn path=/head/; revision=353683
* Widen NET_EPOCH coverage. | Gleb Smirnoff | 2019-10-07 | 1 | -25/+11
  When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas.

  However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win.

  Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier.

  On output path we already enter epoch quite early - in the ip_output(), in the ip6_output().

  This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch.

  Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed.

  This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources.

  Reviewed by: gallatin, hselasky, cy, adrian, kristof
  Differential Revision: https://reviews.freebsd.org/D19111
  Notes: svn path=/head/; revision=353292
* Mechanical cleanup of epoch(9) usage in network stack. | Gleb Smirnoff | 2019-01-09 | 1 | -17/+25
  - Remove macros that covertly create epoch_tracker on thread stack. Such macros are quite unsafe, e.g. will produce buggy code if same macro is used in embedded scopes. Explicitly declare epoch_tracker always.
  - Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as is is very misleading. They all are named FOO_RLOCK(), while they no longer have lock semantics. Now they allow recursion and what's more important they now no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

  This is non functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter epoch recursively.

  Discussed with: jtl, gallatin
  Notes: svn path=/head/; revision=342872
* Use the new VNET_DEFINE_STATIC macro when we are defining static VNET | Andrew Turner | 2018-07-24 | 1 | -13/+13
  variables.
  Reviewed by: bz
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D16147
  Notes: svn path=/head/; revision=336676
* UDP: further performance improvements on tx | Matt Macy | 2018-05-23 | 1 | -13/+11
  Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

  Single stream throughput increases from 910kpps to 1.18Mpps

  Baseline:
  https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

  - Protect read access to global ifnet list with epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg
  - Protect short lived ifaddr references with epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg
  - Convert if_afdata read lock path to epoch
    https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

  A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW.

  Reviewed by: gallatin
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D15409
  Notes: svn path=/head/; revision=334118
* netinet silence warnings | Matt Macy | 2018-05-19 | 1 | -5/+4
  Notes: svn path=/head/; revision=333869
* ifnet: Replace if_addr_lock rwlock with epoch + mutex | Matt Macy | 2018-05-18 | 1 | -6/+6
  Run on LLNW canaries and tested by pho@

  gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.

  When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch:

  InMpps OMpps  InGbs  OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
    4.98  0.00   4.42  0.00 4235592      33 83.80  4720653 2149771 1235 247.32
    4.73  0.00   4.20  0.00 4025260      33 82.99  4724900 2139833 1204 247.32
    4.72  0.00   4.20  0.00 4035252      33 82.14  4719162 2132023 1264 247.32
    4.71  0.00   4.21  0.00 4073206      33 83.68  4744973 2123317 1347 247.32
    4.72  0.00   4.21  0.00 4061118      33 80.82  4713615 2188091 1490 247.32
    4.72  0.00   4.21  0.00 4051675      33 85.29  4727399 2109011 1205 247.32
    4.73  0.00   4.21  0.00 4039056      33 84.65  4724735 2102603 1053 247.32

  After the patch:

  InMpps OMpps  InGbs  OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
    5.43  0.00   4.20  0.00 3313143      33 84.96  5434214 1900162 2656 245.51
    5.43  0.00   4.20  0.00 3308527      33 85.24  5439695 1809382 2521 245.51
    5.42  0.00   4.19  0.00 3316778      33 87.54  5416028 1805835 2256 245.51
    5.42  0.00   4.19  0.00 3317673      33 90.44  5426044 1763056 2332 245.51
    5.42  0.00   4.19  0.00 3314839      33 88.11  5435732 1792218 2499 245.52
    5.44  0.00   4.19  0.00 3293228      33 91.84  5426301 1668597 2121 245.52

  Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

  Reviewed by: gallatin
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D15366
  Notes: svn path=/head/; revision=333813
* r333175 introduced deferred deletion of multicast addresses in order to | Matt Macy | 2018-05-06 | 1 | -12/+18
  permit the driver ioctl to sleep on commands to the NIC when updating multicast filters. More generally this permitted drivers to use an sx as a softc lock. Unfortunately this change introduced a race whereby a multicast update would still be queued for deletion when ifconfig deleted the interface thus calling down in to _purgemaddrs and synchronously deleting _all_ of the multicast addresses on the interface.

  Synchronously remove all external references to a multicast address before enqueueing for delete.

  Reported by: lwhsu
  Approved by: sbruno
  Notes: svn path=/head/; revision=333309
* Separate list manipulation locking from state change in multicast | Stephen Hurd | 2018-05-02 | 1 | -76/+54
  Multicast incorrectly calls in to drivers with a mutex held causing drivers to have to go through all manner of contortions to use a non sleepable lock. Serialize multicast updates instead.
  Submitted by: mmacy <mmacy@mattmacy.io>
  Reviewed by: shurd, sbruno
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D14969
  Notes: svn path=/head/; revision=333175
* sys: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-20 | 1 | -0/+2
  Mainly focus on files that use BSD 3-Clause license.

  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.

  Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.

  Notes: svn path=/head/; revision=326023
* Add some ntohl() love to r315277 | Eric van Gyzen | 2017-03-14 | 1 | -22/+24
  inet_ntoa() and inet_ntoa_r() take the address in network byte-order. When I removed those calls, I should have replaced them with ntohl() to make the hex addresses slightly less unreadable. Here they are.

  See r315277 regarding classic blunders.

  vangyzen: you're deep in "no good deed" territory, it seems
  --badger

  Reported by: ian
  MFC after: 3 days
  MFC when: I finally get it right
  Sponsored by: Dell EMC
  Notes: svn path=/head/; revision=315286
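As a rough userland illustration of the point about byte order, the sketch below formats an IPv4 address as hex after converting it with ntohl(). The function name `format_ipv4_hex` is hypothetical; the kernel change passes the converted integer to KTR rather than formatting a string.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/*
 * Hypothetical helper: write addr as "0x" followed by eight hex digits,
 * most significant octet first, and return snprintf()'s result.
 */
static int
format_ipv4_hex(struct in_addr addr, char *buf, size_t buflen)
{
	assert(buf != NULL);
	/*
	 * s_addr is stored in network byte order.  Converting with
	 * ntohl() makes the printed digits follow the dotted-quad octet
	 * order on any host, so 192.168.0.1 prints as 0xc0a80001.
	 */
	return (snprintf(buf, buflen, "0x%08x",
	    (unsigned int)ntohl(addr.s_addr)));
}
```

Without the ntohl() call, a little-endian host would print the same address with its octets reversed, which is the readability problem the commit message refers to.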
* KTR: log IPv4 addresses in hex rather than dotted-quad | Eric van Gyzen | 2017-03-14 | 1 | -100/+45
  When I made the changes in r313821, I fell victim to one of the classic blunders, the most famous of which is: never get involved in a land war in Asia. But only slightly less well known is this: Keep your brain turned on and engaged when making a tedious, sweeping, mechanical change.

  KTR can correctly log the immediate integral values passed to it, as well as constant strings, but not non-constant strings, since they might change by the time ktrdump retrieves them.

  Reported by: glebius
  MFC after: 3 days
  Sponsored by: Dell EMC
  Notes: svn path=/head/; revision=315277
* Renumber copyright clause 4 | Warner Losh | 2017-02-28 | 1 | -1/+1
  Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point.
  Submitted by: Jan Schaumann <jschauma@stevens.edu>
  Pull Request: https://github.com/freebsd/freebsd/pull/96
  Notes: svn path=/head/; revision=314436
* Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel | Eric van Gyzen | 2017-02-16 | 1 | -25/+69
  inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack.
  Suggested by: glebius, emaste
  Reviewed by: gnn
  MFC after: 2 weeks
  Sponsored by: Dell EMC
  Differential Revision: https://reviews.freebsd.org/D9625
  Notes: svn path=/head/; revision=313821
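The caller-supplied-buffer pattern described here can be sketched in userland. This example assumes POSIX inet_ntop() as a portable stand-in for the kernel's inet_ntoa_r(), and the wrapper name `format_ipv4` is hypothetical.

```c
#include <assert.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: format addr as dotted-quad text into a buffer
 * owned by the caller.  Returns buf on success, NULL on failure.
 */
static const char *
format_ipv4(struct in_addr addr, char *buf, size_t buflen)
{
	assert(buf != NULL && buflen >= INET_ADDRSTRLEN);
	/*
	 * No static storage is involved, so two threads (or two calls
	 * in one expression) cannot overwrite each other's result.
	 */
	return (inet_ntop(AF_INET, &addr, buf, (socklen_t)buflen));
}
```

A caller would typically declare `char buf[INET_ADDRSTRLEN];` on its own stack and pass it in, which is the same shape the commit describes for inet_ntoa_r().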
* With clang 3.9.0, compiling sys/netinet/igmp.c results in the following | Dimitry Andric | 2016-09-04 | 1 | -4/+4
  warning:

      sys/netinet/igmp.c:546:21: error: implicit conversion from 'int' to 'char' changes value from 148 to -108 [-Werror,-Wconstant-conversion]
              p->ipopt_list[0] = IPOPT_RA;    /* Router Alert Option */
                               ~ ^~~~~~~~
      sys/netinet/ip.h:153:19: note: expanded from macro 'IPOPT_RA'
      #define IPOPT_RA                148             /* router alert */
                                      ^~~

  This is because ipopt_list is an array of char, so IPOPT_RA is wrapped to a negative value. It would be nice to change ipopt_list to an array of u_char, but it changes the signature of the public struct ipoption, so add an explicit cast to suppress the warning.

  Reviewed by: imp
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D7777
  Notes: svn path=/head/; revision=305389
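The effect of the explicit cast can be shown with a small standalone sketch. `OPT_RA` stands in for IPOPT_RA (value 148, as quoted in the diagnostic), and both function names are hypothetical rather than taken from igmp.c.

```c
#include <assert.h>
#include <stddef.h>

#define OPT_RA 148	/* stand-in for IPOPT_RA from the quoted diagnostic */

/* Hypothetical: store the option type in a plain char array. */
static void
set_router_alert(char *opt_list)
{
	assert(opt_list != NULL);
	/*
	 * 148 does not fit in a signed 8-bit char.  The explicit cast
	 * marks the narrowing as intentional, which is what silences
	 * -Wconstant-conversion; the stored bit pattern is unchanged.
	 */
	opt_list[0] = (char)OPT_RA;
}

/* Hypothetical: read the option type back as a non-negative value. */
static unsigned int
read_option_type(const char *opt_list)
{
	assert(opt_list != NULL);
	/* Going through unsigned char recovers 148 whether char is signed or not. */
	return ((unsigned char)opt_list[0]);
}
```

The cast only addresses the warning. Code that compares such an element against the constant directly would still need the unsigned char conversion shown in the reader.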
* Get closer to a VIMAGE network stack teardown from top to bottom rather | Bjoern A. Zeeb | 2016-06-21 | 1 | -51/+29
  than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated.

  Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC.

  Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet.

  For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown.

  Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers.

  For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()).

  Provide or enhance helper functions to do proper cleanup at a protocol rather than at an interface level.

  Approved by: re (hrs)
  Obtained from: projects/vnet
  Reviewed by: gnn, jhb
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D6747
  Notes: svn path=/head/; revision=302054
* Add a `show igi_list` command to DDB to debug IGMP state. | Bjoern A. Zeeb | 2016-06-06 | 1 | -0/+37
  Obtained from: projects/vnet
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=301527
* sys/net*: minor spelling fixes. | Pedro F. Giffuni | 2016-05-03 | 1 | -2/+2
  No functional change.
  Notes: svn path=/head/; revision=298995
* netinet: for pointers replace 0 with NULL. | Pedro F. Giffuni | 2016-04-15 | 1 | -1/+1
  These are mostly cosmetic, no functional change.
  Found with devel/coccinelle.
  Reviewed by: ae, tuexen
  Notes: svn path=/head/; revision=298066
* The variable is write once only and not used. | Bjoern A. Zeeb | 2016-01-21 | 1 | -4/+0
  Recover the vertical space.
  Sponsored by: The FreeBSD Foundation
  MFC After: 3 days
  Obtained from: p4 CH=180830
  Reviewed by: gnn, hiren
  Differential Revision: https://reviews.freebsd.org/D4898
  Notes: svn path=/head/; revision=294514
* In the same way fix the problem described in r291578 for IGMPv3. | Andrey V. Elsukov | 2015-12-01 | 1 | -0/+10
  In case when router has a lot of multicast groups, the reply can take several packets due to MTU limitation. Also we have a limit IGMP_MAX_RESPONSE_BURST == 4, that limits the number of packets we send in one shot. Then we recalculate the timer value and schedule the remaining packets for sending.

  The problem is that when we call igmp_v3_dispatch_general_query() to send remaining packets, we queue new reply in the same mbuf queue. And when number of packets is bigger than IGMP_MAX_RESPONSE_BURST, we get endless reply of IGMPv3 reports. To fix this, add the check for remaining packets in the queue.

  MFC after: 1 week
  Sponsored by: Yandex LLC
  Notes: svn path=/head/; revision=291579
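The burst-and-reschedule behaviour described above can be modelled with a few counters. This is a simplified sketch under stated assumptions, not the logic of igmp_v3_dispatch_general_query(): `struct report_state` and the function name are hypothetical, and packets are represented by integer counts instead of an mbuf queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RESPONSE_BURST 4	/* mirrors IGMP_MAX_RESPONSE_BURST == 4 */

/* Hypothetical model of the reply queue. */
struct report_state {
	int queued;	/* packets still waiting in the reply queue */
	int sent;	/* packets transmitted so far */
	int per_query;	/* packets one complete reply needs */
};

/* One timer expiry.  Returns true when the timer must be rearmed. */
static bool
dispatch_general_query(struct report_state *st)
{
	int burst;

	assert(st != NULL && st->per_query >= 0);
	/*
	 * The check the commit describes: build a new reply only when
	 * nothing from the previous one is pending.  Without it, every
	 * expiry would refill the queue and the reply would never end.
	 */
	if (st->queued == 0)
		st->queued = st->per_query;

	burst = st->queued < MAX_RESPONSE_BURST ?
	    st->queued : MAX_RESPONSE_BURST;
	st->queued -= burst;
	st->sent += burst;
	return (st->queued > 0);
}
```

In this model a reply that needs 10 packets drains over three expiries (4, 4, then 2) and then stops, whereas dropping the `queued == 0` test would add another 10 packets on each expiry.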
* Convert in_ifaddr_lock and in6_ifaddr_lock to rmlock. | Andrey V. Elsukov | 2015-07-29 | 1 | -3/+8
  Both are used to protect access to IP addresses lists and they can be acquired for reading several times per packet. To reduce lock contention it is better to use rmlock here.
  Reviewed by: gnn (previous version)
  Obtained from: Yandex LLC
  Sponsored by: Yandex LLC
  Differential Revision: https://reviews.freebsd.org/D3149
  Notes: svn path=/head/; revision=286001
* Improve patch for SA-15:04.igmp to solve a potential buffer overflow. | Xin LI | 2015-04-07 | 1 | -4/+3
  Reported by: bde
  Submitted by: oshogbo
  Notes: svn path=/head/; revision=281228
* Fix integer overflow in IGMP protocol. | Xin LI | 2015-02-25 | 1 | -2/+2
  Security: FreeBSD-SA-15:04.igmp
  Security: CVE-2015-1414
  Found by: Mateusz Kocielski, Logicaltrust
  Analyzed by: Marek Kroemeke, Mateusz Kocielski (shm@NetBSD.org) and 22733db72ab3ed94b5f8a1ffcde850251fe6f466
  Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org>
  Reviewed by: bms
  Notes: svn path=/head/; revision=279262
* - Rename 'struct igmp_ifinfo' into 'struct igmp_ifsoftc', since it really | Gleb Smirnoff | 2015-02-19 | 1 | -50/+60
    represents a context.
  - Preserve name 'struct igmp_ifinfo' for a new structure, that will be stable API between userland and kernel.
  - Make sysctl_igmp_ifinfo() return the new 'struct igmp_ifinfo', instead of old one, which had a bunch of internal kernel structures in it.
  - Move all above declarations from in_var.h to igmp_var.h, since they are private to IGMP code.

  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=279026
* Use new struct mbufq instead of struct ifqueue to manage packet queues in | Gleb Smirnoff | 2015-02-19 | 1 | -56/+44
  IPv4 multicast code.
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=278978
* To ease changes to underlying mbuf structure and the mbuf allocator, reduce | Robert Watson | 2015-01-05 | 1 | -4/+4
  the knowledge of mbuf layout, and in particular constants such as M_EXT, MLEN, MHLEN, and so on, in mbuf consumers by unifying various alignment utility functions (M_ALIGN(), MH_ALIGN(), MEXT_ALIGN()) in a single M_ALIGN() macro, implemented by a now-inlined m_align() function:

  - Move m_align() from uipc_mbuf.c to mbuf.h; mark as __inline.
  - Reimplement M_ALIGN(), MH_ALIGN(), and MEXT_ALIGN() using m_align().
  - Update consumers around the tree to simply use M_ALIGN().

  This change eliminates a number of cases where mbuf consumers must be aware of whether or not mbufs returned by the allocator use external storage, but also assumptions about the size of the returned mbuf. This will make it easier to introduce changes in how we use external storage, as well as features such as variable-size mbufs.

  Differential Revision: https://reviews.freebsd.org/D1436
  Reviewed by: glebius, trasz, gnn, bz
  Sponsored by: EMC / Isilon Storage Division
  Notes: svn path=/head/; revision=276692
* Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. | Gleb Smirnoff | 2014-11-07 | 1 | -11/+11
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=274225
* When deciding whether to call m_pullup() even though there is adequate | Robert Watson | 2014-10-12 | 1 | -2/+2
  data in an mbuf, use M_WRITABLE() instead of a direct test of M_EXT; the latter both unnecessarily exposes mbuf-allocator internals in the protocol stack and is also insufficient to catch all cases of non-writability.

  (NB: m_pullup() does not actually guarantee that a writable mbuf is returned, so further refinement of all of these code paths continues to be required.)

  Reviewed by: bz
  MFC after: 3 days
  Sponsored by: EMC / Isilon Storage Division
  Differential Revision: https://reviews.freebsd.org/D900
  Notes: svn path=/head/; revision=272984
* Fix one more compiler warning, m is not initialized. | Konstantin Belousov | 2014-08-08 | 1 | -1/+1
  Notes: svn path=/head/; revision=269726
* Fix argument to KTR after r269699 to unbreak LINT builds. | Bjoern A. Zeeb | 2014-08-08 | 1 | -1/+1
  Notes: svn path=/head/; revision=269705
* Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have | Kevin Lo | 2014-08-08 | 1 | -17/+21
  only one protocol switch structure that is shared between ipv4 and ipv6.
  Phabric: D476
  Reviewed by: jhb
  Notes: svn path=/head/; revision=269699
* The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare | Gleb Smirnoff | 2013-10-26 | 1 | -0/+1
  to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=257176
* Restructure the mbuf pkthdr to make it fit for upcoming capabilities and | Andre Oppermann | 2013-08-24 | 1 | -4/+4
  features. The changes in particular are:

  o Remove rarely used "header" pointer and replace it with a 64bit protocol/layer specific union PH_loc for local use. Protocols can flexibly overlay their own 8 to 64 bit fields to store information while the packet is worked on.
  o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc instead of pkthdr.header.
  o Extend csum_flags to 64bits to allow for additional future offload information to be carried (e.g. iSCSI, IPsec offload, and others).
  o Move the RSS hash type enumerator from abusing m_flags to its own 8bit rsstype field. Adjust accessor macros.
  o Add cosqos field to store Class of Service / Quality of Service information with the packet. It is not yet supported in any drivers but allows us to get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with a modernized ALTQ.
  o Add four 8 bit fields l[2-5]hlen to store the relative header offsets from the start of the packet. This is important for various offload capabilities and to relieve the drivers from having to parse the packet and protocol headers to find out location of checksums and other information. Header parsing in drivers is a lot of copy-paste and unhandled corner cases which we want to avoid.
  o Add another flexible 64bit union to map various additional persistent packet information, like ether_vtag, tso_segsz and csum fields. Depending on the csum_flags settings some fields may have different usage making it very flexible and adaptable to future capabilities.
  o Restructure the CSUM flags to better signify their outbound (down the stack) and inbound (up the stack) use. The CSUM flags used to be a bit chaotic and rather poorly documented leading to incorrect use in many places. Bring clarity into their use through better naming. Compatibility mappings are provided to preserve the API. The drivers can be corrected one by one and MFC'd without issue.
  o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures).

  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=254804
* Add m_clrprotoflags() to clear protocol specific mbuf flags at up and | Andre Oppermann | 2013-08-19 | 1 | -1/+1
  downwards layer crossings.

  Consistently use it within IP, IPv6 and ethernet protocols.

  Discussed with: trociny, glebius
  Notes: svn path=/head/; revision=254523
* Disable IGMPv3 link timers on a transition to IGMPv2. | Bruce M Simpson | 2013-06-07 | 1 | -0/+1
  Submitted by: Alan Smithee
  Notes: svn path=/head/; revision=251502
* - Replace compat macros with function calls. | Gleb Smirnoff | 2013-03-16 | 1 | -1/+1
  Notes: svn path=/head/; revision=248373
* We can, and should use M_WAITOK here. | Gleb Smirnoff | 2013-03-15 | 1 | -1/+1
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=248326
* Mechanically substitute flags from historic mbuf allocator with | Gleb Smirnoff | 2012-12-05 | 1 | -9/+9
  malloc(9) flags within sys.

  Exceptions:
  - sys/contrib not touched
  - sys/mbuf.h edited manually

  Notes: svn path=/head/; revision=243882
* Do not reduce ip_len by size of IP header in the ip_input() | Gleb Smirnoff | 2012-10-23 | 1 | -1/+1
  before passing a packet to protocol input routines. For several protocols this means that now protocol needs to do subtraction itself, and for another half this means that we do not need to add header length back to the packet.

  Make ip_stripoptions() to adjust ip_len, since now we enter this function with a packet header whose ip_len does represent length of entire packet, not payload only.

  Notes: svn path=/head/; revision=241923
* Switch the entire IPv4 stack to keep the IP packet header | Gleb Smirnoff | 2012-10-22 | 1 | -4/+4
  in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet.

  After this change a packet processed by the stack isn't modified at all[2] except for TTL.

  After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack.

  [1] One exception still remains. The raw sockets convert host byte order before passing a packet to an application. Probably this would remain for ages for compatibility.

  [2] The ip_input() still subtracts header len from ip->ip_len, but this is planned to be fixed soon.

  Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru>
  Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
  Notes: svn path=/head/; revision=241913
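The convention described in this commit and the one above it (header fields stay in network byte order, host-order values live only in locals, and protocols subtract the header length themselves) can be sketched as follows. `struct mini_ip_hdr` is a hypothetical two-field stand-in, not the real struct ip, and `payload_length` is an illustrative name.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced header; not the layout of struct ip. */
struct mini_ip_hdr {
	uint8_t  hlen_words;	/* header length in 32-bit words */
	uint16_t total_len;	/* whole packet length, network byte order */
};

/* Compute the payload length without touching the header. */
static uint16_t
payload_length(const struct mini_ip_hdr *ip)
{
	uint16_t total, hlen;

	assert(ip != NULL);
	/* Convert into locals; nothing in host order is written back. */
	total = ntohs(ip->total_len);
	hlen = (uint16_t)(ip->hlen_words * 4);
	assert(total >= hlen);
	/* The protocol does the header subtraction itself. */
	return ((uint16_t)(total - hlen));
}
```

Taking the header by const pointer makes the "never written to a packet" rule something the compiler can help enforce in a function like this.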