author Marcin Wojtas <mw@FreeBSD.org> 2021-07-23 22:31:32 +0000
committer Marcin Wojtas <mw@FreeBSD.org> 2021-10-07 16:10:32 +0000
commit 112c1187c29a0d4a8dc327834dd1dc6acd40c7f7
tree 54b40093e29d5cd554a5ca1d109fd7e204a4432f
parent 87ffe594707105711e7cbc1fa8d5b4a5a8a67165
Upgrade ENA to v2.4.1

ena: Remove redundant declaration of ena_log_level.

GCC6 raises a -Wredundant-decl error due to duplicate declarations in
ena_fbsd_log.h and ena_plat.h.

Sponsored by: Chelsio Communications

(cherry picked from commit 8843787aa1bdbd10de6ba47a04489179ec2d2d3c)

ena: Avoid unnecessary mbuf collapses for LLQ condition

In case of Low-latency Queue, one small enough descriptor can be pushed
directly to the ENA hw, thus saving one fragment. Check for this
condition before performing collapse.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit c81f8c26115a64b9a97ecdb2a64e824dd839ee73)

ena: Trigger reset on ena_com_prepare_tx failure

All ena_com_prepare_tx errors other than ENA_COM_NO_MEM are fatal and
require device reset.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 36130d2979d695dd439bc607feb00dcdb9a1937b)

ena: Prevent reset after device destruction

Check for ENA_FLAG_TRIGGER_RESET inside a locked context in order to
avoid potential race conditions with ena_destroy_device. This aligns the
reset task logic with the Linux driver.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 433ab9b6987b42b3e5b25b8b5dc7e5178c7ef9bb)

ena: Add extra log messages

Stay aligned with the Linux driver by adding the following logs:
* inform the user about retrying queue creation
* warn on non-empty ena_tx_buffer.mbuf prior to ena_tx_map_mbuf

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 77160654a162b5faa8ad7a02e18d2bef2589f868)

ena: Add locking assertions

ENA silently assumed that ena_up, ena_down and ena_start_xmit routines
should be called within locked context. Driver's logic heavily assumes
on concurrent access to those routines, so for safety and better
documentation about this assumption, the locking assertions were added
to the above functions.

The assertion was added only for the main steps (skipping the helper
functions) which can be called from multiple places including the kernel
and the driver itself.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit cb98c439d66c303353a9f4abbbe9ddb51559c638)

ena: Move RSS logic into its own source files

Delegate RSS related functionality into separate .c/.h files in
preparation for the full RSS support.

While at it, reorder functions and remove prototypes for ones with
internal linkage.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 986e7b9227668caf9620f207e3c1d708c87b634d)

ena: Disable meta descriptor caching for netmap

If LLQ is being used, `ena_tx_ctx.meta_valid` must stay enabled.
This fixes netmap support on latest generation ENA HW and aligns it with
the core driver behavior.

As netmap doesn't support any csum offloads, the
`adapter->disable_meta_caching` value can be simply passed to the HW.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit a831466830de6ab55fc03170290b313157196e81)

ena: Share ena_global_lock between driver instances

In order to use `ena_global_lock` in sysctl context, it must be kept
outside the driver instance's software context, as sysctls can be called
before attach and after detach, leading to lock use before sx_init and
after sx_destroy otherwise.

Solve this issue by turning `ena_global_lock` into a file scope
variable, shared between all instances of the driver and associated
sysctl context, and in turn initialized/destroyed in dedicated
SYSINIT/SYSUNINIT functions.

As a side effect, this change also fixes existing race in the reset
routine, when simultaneously accessing sysctl exposed properties.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 07aff471c0de2de9a1dc5c7749c46b525bdd0201)

ena: Add missing statistics

Provide the following sysctl statistics in order to stay aligned with
the Linux driver:
* rx_ring.csum_good
* tx_ring.unmask_interrupt_num

Also rename the 'bad_csum' statistic name to 'csum_bad' for alignment.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 223c8cb12e951c63807300a0cbdc4a1569520b4b)

ena: Implement full RSS reconfiguration

Bind RX/TX queues and MSI-X vectors to matching CPUs based on the RSS
bucket entries.

Introduce sysctls for the following RSS functionality:
- rss.indir_table: indirection table mapping
- rss.indir_table_size: indirection table size
- rss.key: RSS hash key (if Toeplitz used)

Said sysctls are only available when compiled without `option RSS`, as
kernel-side RSS support currently doesn't offer RSS reconfiguration.

Migrate the hash algorithm from CRC32 to Toeplitz and change the initial
hash value to 0x0 in order to match the standard Toeplitz
implementation.

Provide helpers for hash key inversion required for HW operations.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 6d1ef2abd330fac4057f092abbbdc28a568b4327)

ena: fix building in-kernel driver

When building ENA as compiled into the kernel, the driver would fail to
build. Resolve the problem by introducing the following changes:
1. Add missing `ena_rss.c` entry in `sys/conf/files`.
2. Prevent SYSCTL_ADD_INT from throwing an assert due to an extra
   CTLTYPE_INT flag.

Fixes: 986e7b92276 ("ena: Move RSS logic into its own source files")
Fixes: 6d1ef2abd33 ("ena: Implement full RSS reconfiguration")
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
MFC after: 1 week

(cherry picked from commit a3f0d18237bdcf272461d3b4b682de384c572144)

ena: Update driver version to v2.4.1

Some of the changes in this release:
* Hardware RSS hash key reconfiguration and indirection table
  reconfiguration support.
* Full kernel RSS support.
* Extra statistic counters.
* Netmap support for ENAv3.
* Locking assertions.
* Extra log messages.
* Reset handling fixes.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 42c7760be3ea420668f625f2064ae347aa7e818e)

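
The commit message above mentions migrating the hash from CRC32 to Toeplitz with an initial value of 0x0. As a rough illustration of what a standard Toeplitz hash computes, here is a minimal userspace sketch. It is not taken from the driver or from ena-com: the function name `toeplitz_hash` and its signature are hypothetical, and the key inversion helpers the commit mentions for HW operations are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative Toeplitz hash (hypothetical helper, not driver code).
 * For every set bit of the input, scanned MSB first, the 32 key bits
 * currently aligned with that input bit are XORed into the result.
 * The result starts at 0, matching the "initial hash value 0x0"
 * wording in the commit message.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t key_len,
    const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t next_key_bit = 32;
	size_t i;
	int bit;

	/* The sliding window needs at least 32 key bits to start with. */
	assert(key_len >= 4);

	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			uint32_t in = 0;

			if (data[i] & (1u << bit))
				hash ^= window;

			/* Slide the window left by one key bit (zero-fill past the key end). */
			if (next_key_bit < key_len * 8)
				in = (key[next_key_bit / 8] >>
				    (7 - next_key_bit % 8)) & 1u;
			window = (window << 1) | in;
			next_key_bit++;
		}
	}

	return (hash);
}
```

A caller would pass the 40-byte key (for instance the example key listed in the ena.4 hunk of this diff) and the packet fields concatenated in network byte order, typically source address, destination address, source port, destination port. Which fields the device actually hashes depends on its hash control settings, which this sketch does not cover.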
-rw-r--r-- share/man/man4/ena.4 51
-rw-r--r-- sys/conf/files 2
-rw-r--r-- sys/contrib/ena-com/ena_plat.h 2
-rw-r--r-- sys/dev/ena/ena.c 302
-rw-r--r-- sys/dev/ena/ena.h 34
-rw-r--r-- sys/dev/ena/ena_datapath.c 32
-rw-r--r-- sys/dev/ena/ena_netmap.c 7
-rw-r--r-- sys/dev/ena/ena_rss.c 300
-rw-r--r-- sys/dev/ena/ena_rss.h 73
-rw-r--r-- sys/dev/ena/ena_sysctl.c 329
-rw-r--r-- sys/modules/ena/Makefile 2
11 files changed, 900 insertions, 234 deletions
diff --git a/share/man/man4/ena.4 b/share/man/man4/ena.4
index cd98fe2c84ba..aacf7956c9f8 100644
--- a/share/man/man4/ena.4
+++ b/share/man/man4/ena.4
@@ -269,6 +269,57 @@ command should be used:
.Bd -literal -offset indent
sysctl dev.ena.1.eni_metrics.sample_interval=10
.Ed
+.It Va dev.ena.X.rss.indir_table_size
+RSS indirection table size.
+The default is 128.
+Returns the number of entries in the RSS indirection table.
+.Pp
+Example:
+To read the RSS indirection table size, the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.indir_table_size
+.Ed
+.It Va dev.ena.X.rss.indir_table
+RSS indirection table mapping.
+The default is x:y key-pairs of indir_table_size length.
+Updates selected indices of the RSS indirection table.
+.Pp
+The entry string consists of one or more x:y keypairs, where x stands for
+the table index and y for its new value. Table indices that don't need to be
+updated can be omitted from the string and will retain their existing values.
+.Pp
+If an index is entered more than once, the last value is used.
+.Pp
+Example:
+To update two selected indices in the RSS indirection table, e.g. setting index
+0 to queue 5 and then index 5 to queue 0, the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.indir_table="0:5 5:0"
+.Ed
+.It Va dev.ena.X.rss.key
+RSS hash key.
+The default is 40 bytes long randomly generated hash key.
+Controls the RSS Toeplitz hash algorithm key value.
+.Pp
+Only available when driver compiled without the kernel side RSS support.
+.Pp
+Example:
+To change the RSS hash key value to
+.Pp
+0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
+.br
+0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
+.br
+0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
+.br
+0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
+.br
+0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
+.Pp
+the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.key=6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa
+.Ed
.El
.Sh DIAGNOSTICS
.Ss Device initialization phase
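
Review note on the ena.4 hunk above: the `rss.indir_table` entry format (space-separated x:y pairs, omitted indices keep their values, the last value wins for a repeated index) can be sketched in userspace C as follows. This is only an illustration of the documented semantics under stated assumptions; it is not the driver's sysctl handler, and the name `indir_table_update`, the `uint16_t` entry type, and the error convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: apply "x:y" pairs from `entries` to `table`.
 * Returns 0 on success and -1 on a malformed pair or an out-of-range
 * index/value. Pairs are applied in order, so a repeated index ends up
 * with the last value given. Note that pairs preceding a malformed one
 * stay applied; a stricter version would validate before writing.
 */
static int
indir_table_update(uint16_t *table, size_t table_size, const char *entries)
{
	const char *p = entries;

	while (*p != '\0') {
		unsigned long idx, val;
		char *end;

		/* Skip the separators between pairs. */
		while (*p == ' ')
			p++;
		if (*p == '\0')
			break;

		/* Require a digit so strtoul() cannot accept a sign or blanks. */
		if (*p < '0' || *p > '9')
			return (-1);
		idx = strtoul(p, &end, 10);
		if (*end != ':')
			return (-1);

		p = end + 1;
		if (*p < '0' || *p > '9')
			return (-1);
		val = strtoul(p, &end, 10);
		if (*end != ' ' && *end != '\0')
			return (-1);

		if (idx >= table_size || val > UINT16_MAX)
			return (-1);

		table[idx] = (uint16_t)val;
		p = end;
	}

	return (0);
}
```

With a 128-entry table, the man page example string "0:5 5:0" would set entry 0 to 5 and entry 5 to 0 and leave every other entry untouched. Whether a value is a valid queue for the device is a separate check that this sketch leaves out.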
diff --git a/sys/conf/files b/sys/conf/files
index 21239c31e83f..69ebfe39340d 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -1700,6 +1700,8 @@ dev/ena/ena_datapath.c optional ena \
compile-with "${NORMAL_C} -I$S/contrib"
dev/ena/ena_netmap.c optional ena \
compile-with "${NORMAL_C} -I$S/contrib"
+dev/ena/ena_rss.c optional ena \
+ compile-with "${NORMAL_C} -I$S/contrib"
dev/ena/ena_sysctl.c optional ena \
compile-with "${NORMAL_C} -I$S/contrib"
contrib/ena-com/ena_com.c optional ena
diff --git a/sys/contrib/ena-com/ena_plat.h b/sys/contrib/ena-com/ena_plat.h
index b31821248398..274f795950c0 100644
--- a/sys/contrib/ena-com/ena_plat.h
+++ b/sys/contrib/ena-com/ena_plat.h
@@ -98,8 +98,6 @@ extern struct ena_bus_space ebs;
#define DEFAULT_ALLOC_ALIGNMENT 8
#define ENA_CDESC_RING_SIZE_ALIGNMENT (1 << 12) /* 4K */
-extern int ena_log_level;
-
#define container_of(ptr, type, member) \
({ \
const __typeof(((type *)0)->member) *__p = (ptr); \
diff --git a/sys/dev/ena/ena.c b/sys/dev/ena/ena.c
index 84d58c844332..84ef234cd937 100644
--- a/sys/dev/ena/ena.c
+++ b/sys/dev/ena/ena.c
@@ -63,9 +63,6 @@ __FBSDID("$FreeBSD$");
#include <net/if_media.h>
#include <net/if_types.h>
#include <net/if_vlan_var.h>
-#ifdef RSS
-#include <net/rss_config.h>
-#endif
#include <netinet/in_systm.h>
#include <netinet/in.h>
@@ -84,6 +81,7 @@ __FBSDID("$FreeBSD$");
#include "ena_datapath.h"
#include "ena.h"
#include "ena_sysctl.h"
+#include "ena_rss.h"
#ifdef DEV_NETMAP
#include "ena_netmap.h"
@@ -143,7 +141,6 @@ static void ena_free_io_irq(struct ena_adapter *);
static void ena_free_irqs(struct ena_adapter*);
static void ena_disable_msix(struct ena_adapter *);
static void ena_unmask_all_io_irqs(struct ena_adapter *);
-static int ena_rss_configure(struct ena_adapter *);
static int ena_up_complete(struct ena_adapter *);
static uint64_t ena_get_counter(if_t, ift_counter);
static int ena_media_change(if_t);
@@ -161,8 +158,6 @@ static int ena_set_queues_placement_policy(device_t, struct ena_com_dev *,
static uint32_t ena_calc_max_io_queue_num(device_t, struct ena_com_dev *,
struct ena_com_dev_get_features_ctx *);
static int ena_calc_io_queue_size(struct ena_calc_queue_size_ctx *);
-static int ena_rss_init_default(struct ena_adapter *);
-static void ena_rss_init_default_deferred(void *);
static void ena_config_host_info(struct ena_com_dev *, device_t);
static int ena_attach(device_t);
static int ena_detach(device_t);
@@ -186,6 +181,8 @@ static ena_vendor_info_t ena_vendor_info_array[] = {
{ 0, 0, 0 }
};
+struct sx ena_global_lock;
+
/*
* Contains pointers to event handlers, e.g. link state chage.
*/
@@ -265,27 +262,6 @@ fail_tag:
return (error);
}
-/*
- * This function should generate unique key for the whole driver.
- * If the key was already genereated in the previous call (for example
- * for another adapter), then it should be returned instead.
- */
-void
-ena_rss_key_fill(void *key, size_t size)
-{
- static bool key_generated;
- static uint8_t default_key[ENA_HASH_KEY_SIZE];
-
- KASSERT(size <= ENA_HASH_KEY_SIZE, ("Requested more bytes than ENA RSS key can hold"));
-
- if (!key_generated) {
- arc4random_buf(default_key, ENA_HASH_KEY_SIZE);
- key_generated = true;
- }
-
- memcpy(key, default_key, size);
-}
-
static void
ena_free_pci_resources(struct ena_adapter *adapter)
{
@@ -625,8 +601,10 @@ static int
ena_setup_tx_resources(struct ena_adapter *adapter, int qid)
{
device_t pdev = adapter->pdev;
+ char thread_name[MAXCOMLEN + 1];
struct ena_que *que = &adapter->que[qid];
struct ena_ring *tx_ring = que->tx_ring;
+ cpuset_t *cpu_mask = NULL;
int size, i, err;
#ifdef DEV_NETMAP
bus_dmamap_t *map;
@@ -710,8 +688,16 @@ ena_setup_tx_resources(struct ena_adapter *adapter, int qid)
tx_ring->running = true;
- taskqueue_start_threads(&tx_ring->enqueue_tq, 1, PI_NET,
- "%s txeq %d", device_get_nameunit(adapter->pdev), que->cpu);
+#ifdef RSS
+ cpu_mask = &que->cpu_mask;
+ snprintf(thread_name, sizeof(thread_name), "%s txeq %d",
+ device_get_nameunit(adapter->pdev), que->cpu);
+#else
+ snprintf(thread_name, sizeof(thread_name), "%s txeq %d",
+ device_get_nameunit(adapter->pdev), que->id);
+#endif
+ taskqueue_start_threads_cpuset(&tx_ring->enqueue_tq, 1, PI_NET,
+ cpu_mask, "%s", thread_name);
return (0);
@@ -1153,8 +1139,6 @@ ena_update_buf_ring_size(struct ena_adapter *adapter,
int rc = 0;
bool dev_was_up;
- ENA_LOCK_LOCK(adapter);
-
old_buf_ring_size = adapter->buf_ring_size;
adapter->buf_ring_size = new_buf_ring_size;
@@ -1189,8 +1173,6 @@ ena_update_buf_ring_size(struct ena_adapter *adapter,
}
}
- ENA_LOCK_UNLOCK(adapter);
-
return (rc);
}
@@ -1202,8 +1184,6 @@ ena_update_queue_size(struct ena_adapter *adapter, uint32_t new_tx_size,
int rc = 0;
bool dev_was_up;
- ENA_LOCK_LOCK(adapter);
-
old_tx_size = adapter->requested_tx_ring_size;
old_rx_size = adapter->requested_rx_ring_size;
adapter->requested_tx_ring_size = new_tx_size;
@@ -1244,8 +1224,6 @@ ena_update_queue_size(struct ena_adapter *adapter, uint32_t new_tx_size,
}
}
- ENA_LOCK_UNLOCK(adapter);
-
return (rc);
}
@@ -1268,8 +1246,6 @@ ena_update_io_queue_nb(struct ena_adapter *adapter, uint32_t new_num)
int rc = 0;
bool dev_was_up;
- ENA_LOCK_LOCK(adapter);
-
dev_was_up = ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter);
old_num = adapter->num_io_queues;
ena_down(adapter);
@@ -1299,8 +1275,6 @@ ena_update_io_queue_nb(struct ena_adapter *adapter, uint32_t new_num)
}
}
- ENA_LOCK_UNLOCK(adapter);
-
return (rc);
}
@@ -1459,6 +1433,7 @@ ena_create_io_queues(struct ena_adapter *adapter)
struct ena_que *queue;
uint16_t ena_qid;
uint32_t msix_vector;
+ cpuset_t *cpu_mask = NULL;
int rc, i;
/* Create TX queues */
@@ -1525,7 +1500,11 @@ ena_create_io_queues(struct ena_adapter *adapter)
queue->cleanup_tq = taskqueue_create_fast("ena cleanup",
M_WAITOK, taskqueue_thread_enqueue, &queue->cleanup_tq);
- taskqueue_start_threads(&queue->cleanup_tq, 1, PI_NET,
+#ifdef RSS
+ cpu_mask = &queue->cpu_mask;
+#endif
+ taskqueue_start_threads_cpuset(&queue->cleanup_tq, 1, PI_NET,
+ cpu_mask,
"%s queue %d cleanup",
device_get_nameunit(adapter->pdev), i);
}
@@ -1664,7 +1643,10 @@ ena_setup_mgmnt_intr(struct ena_adapter *adapter)
static int
ena_setup_io_intr(struct ena_adapter *adapter)
{
- static int last_bind_cpu = -1;
+#ifdef RSS
+ int num_buckets = rss_getnumbuckets();
+ static int last_bind = 0;
+#endif
int irq_idx;
if (adapter->msix_entries == NULL)
@@ -1682,15 +1664,12 @@ ena_setup_io_intr(struct ena_adapter *adapter)
ena_log(adapter->pdev, DBG, "ena_setup_io_intr vector: %d\n",
adapter->msix_entries[irq_idx].vector);
- /*
- * We want to bind rings to the corresponding cpu
- * using something similar to the RSS round-robin technique.
- */
- if (unlikely(last_bind_cpu < 0))
- last_bind_cpu = CPU_FIRST();
+#ifdef RSS
adapter->que[i].cpu = adapter->irq_tbl[irq_idx].cpu =
- last_bind_cpu;
- last_bind_cpu = CPU_NEXT(last_bind_cpu);
+ rss_getcpu(last_bind);
+ last_bind = (last_bind + 1) % num_buckets;
+ CPU_SETOF(adapter->que[i].cpu, &adapter->que[i].cpu_mask);
+#endif
}
return (0);
@@ -1782,6 +1761,19 @@ ena_request_io_irq(struct ena_adapter *adapter)
goto err;
}
irq->requested = true;
+
+#ifdef RSS
+ rc = bus_bind_intr(adapter->pdev, irq->res, irq->cpu);
+ if (unlikely(rc != 0)) {
+ ena_log(pdev, ERR, "failed to bind "
+ "interrupt handler for irq %ju to cpu %d: %d\n",
+ rman_get_start(irq->res), irq->cpu, rc);
+ goto err;
+ }
+
+ ena_log(pdev, INFO, "queue %d - cpu %d\n",
+ i - ENA_IO_IRQ_FIRST_IDX, irq->cpu);
+#endif
}
return (rc);
@@ -1910,6 +1902,7 @@ ena_unmask_all_io_irqs(struct ena_adapter *adapter)
{
struct ena_com_io_cq* io_cq;
struct ena_eth_io_intr_reg intr_reg;
+ struct ena_ring *tx_ring;
uint16_t ena_qid;
int i;
@@ -1918,47 +1911,12 @@ ena_unmask_all_io_irqs(struct ena_adapter *adapter)
ena_qid = ENA_IO_TXQ_IDX(i);
io_cq = &adapter->ena_dev->io_cq_queues[ena_qid];
ena_com_update_intr_reg(&intr_reg, 0, 0, true);
+ tx_ring = &adapter->tx_ring[i];
+ counter_u64_add(tx_ring->tx_stats.unmask_interrupt_num, 1);
ena_com_unmask_intr(io_cq, &intr_reg);
}
}
-/* Configure the Rx forwarding */
-static int
-ena_rss_configure(struct ena_adapter *adapter)
-{
- struct ena_com_dev *ena_dev = adapter->ena_dev;
- int rc;
-
- /* In case the RSS table was destroyed */
- if (!ena_dev->rss.tbl_log_size) {
- rc = ena_rss_init_default(adapter);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
- ena_log(adapter->pdev, ERR,
- "WARNING: RSS was not properly re-initialized,"
- " it will affect bandwidth\n");
- ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
- return (rc);
- }
- }
-
- /* Set indirect table */
- rc = ena_com_indirect_table_set(ena_dev);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
- return (rc);
-
- /* Configure hash function (if supported) */
- rc = ena_com_set_hash_function(ena_dev);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
- return (rc);
-
- /* Configure hash inputs (if supported) */
- rc = ena_com_set_hash_ctrl(ena_dev);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
- return (rc);
-
- return (0);
-}
-
static int
ena_up_complete(struct ena_adapter *adapter)
{
@@ -2079,6 +2037,10 @@ err_setup_tx:
return (rc);
}
+ ena_log(pdev, INFO,
+ "Retrying queue creation with sizes TX=%d, RX=%d\n",
+ new_tx_ring_size, new_rx_ring_size);
+
set_io_rings_size(adapter, new_tx_ring_size, new_rx_ring_size);
}
}
@@ -2088,6 +2050,8 @@ ena_up(struct ena_adapter *adapter)
{
int rc = 0;
+ ENA_LOCK_ASSERT();
+
if (unlikely(device_is_attached(adapter->pdev) == 0)) {
ena_log(adapter->pdev, ERR, "device is not attached!\n");
return (ENXIO);
@@ -2205,13 +2169,13 @@ ena_media_status(if_t ifp, struct ifmediareq *ifmr)
struct ena_adapter *adapter = if_getsoftc(ifp);
ena_log(adapter->pdev, DBG, "Media status update\n");
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ifmr->ifm_status = IFM_AVALID;
ifmr->ifm_active = IFM_ETHER;
if (!ENA_FLAG_ISSET(ENA_FLAG_LINK_UP, adapter)) {
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
ena_log(adapter->pdev, INFO, "Link is down\n");
return;
}
@@ -2219,7 +2183,7 @@ ena_media_status(if_t ifp, struct ifmediareq *ifmr)
ifmr->ifm_status |= IFM_ACTIVE;
ifmr->ifm_active |= IFM_UNKNOWN | IFM_FDX;
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
static void
@@ -2228,9 +2192,9 @@ ena_init(void *arg)
struct ena_adapter *adapter = (struct ena_adapter *)arg;
if (!ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter)) {
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ena_up(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
}
@@ -2252,13 +2216,13 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data)
case SIOCSIFMTU:
if (ifp->if_mtu == ifr->ifr_mtu)
break;
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ena_down(adapter);
ena_change_mtu(ifp, ifr->ifr_mtu);
rc = ena_up(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
break;
case SIOCSIFFLAGS:
@@ -2270,15 +2234,15 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data)
"ioctl promisc/allmulti\n");
}
} else {
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
rc = ena_up(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
} else {
if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) != 0) {
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ena_down(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
}
break;
@@ -2303,10 +2267,10 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data)
if ((reinit != 0) &&
((if_getdrvflags(ifp) & IFF_DRV_RUNNING) != 0)) {
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ena_down(adapter);
rc = ena_up(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
}
@@ -2461,6 +2425,8 @@ ena_down(struct ena_adapter *adapter)
{
int rc;
+ ENA_LOCK_ASSERT();
+
if (!ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter))
return;
@@ -2526,6 +2492,10 @@ ena_calc_max_io_queue_num(device_t pdev, struct ena_com_dev *ena_dev,
/* 1 IRQ for for mgmnt and 1 IRQ for each TX/RX pair */
max_num_io_queues = min_t(uint32_t, max_num_io_queues,
pci_msix_count(pdev) - 1);
+#ifdef RSS
+ max_num_io_queues = min_t(uint32_t, max_num_io_queues,
+ rss_getnumbuckets());
+#endif
return (max_num_io_queues);
}
@@ -2722,90 +2692,6 @@ ena_calc_io_queue_size(struct ena_calc_queue_size_ctx *ctx)
return (0);
}
-static int
-ena_rss_init_default(struct ena_adapter *adapter)
-{
- struct ena_com_dev *ena_dev = adapter->ena_dev;
- device_t dev = adapter->pdev;
- int qid, rc, i;
-
- rc = ena_com_rss_init(ena_dev, ENA_RX_RSS_TABLE_LOG_SIZE);
- if (unlikely(rc != 0)) {
- ena_log(dev, ERR, "Cannot init indirect table\n");
- return (rc);
- }
-
- for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++) {
- qid = i % adapter->num_io_queues;
- rc = ena_com_indirect_table_fill_entry(ena_dev, i,
- ENA_IO_RXQ_IDX(qid));
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
- ena_log(dev, ERR, "Cannot fill indirect table\n");
- goto err_rss_destroy;
- }
- }
-
-#ifdef RSS
- uint8_t rss_algo = rss_gethashalgo();
- if (rss_algo == RSS_HASH_TOEPLITZ) {
- uint8_t hash_key[RSS_KEYSIZE];
-
- rss_getkey(hash_key);
- rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_TOEPLITZ,
- hash_key, RSS_KEYSIZE, 0xFFFFFFFF);
- } else
-#endif
- rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_CRC32, NULL,
- ENA_HASH_KEY_SIZE, 0xFFFFFFFF);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
- ena_log(dev, ERR, "Cannot fill hash function\n");
- goto err_rss_destroy;
- }
-
- rc = ena_com_set_default_hash_ctrl(ena_dev);
- if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
- ena_log(dev, ERR, "Cannot fill hash control\n");
- goto err_rss_destroy;
- }
-
- return (0);
-
-err_rss_destroy:
- ena_com_rss_destroy(ena_dev);
- return (rc);
-}
-
-static void
-ena_rss_init_default_deferred(void *arg)
-{
- struct ena_adapter *adapter;
- devclass_t dc;
- int max;
- int rc;
-
- dc = devclass_find("ena");
- if (unlikely(dc == NULL)) {
- ena_log_raw(ERR, "SYSINIT: %s: No devclass ena\n", __func__);
- return;
- }
-
- max = devclass_get_maxunit(dc);
- while (max-- >= 0) {
- adapter = devclass_get_softc(dc, max);
- if (adapter != NULL) {
- rc = ena_rss_init_default(adapter);
- ENA_FLAG_SET_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
- if (unlikely(rc != 0)) {
- ena_log(adapter->pdev, WARN,
- "WARNING: RSS was not properly initialized,"
- " it will affect bandwidth\n");
- ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
- }
- }
- }
-}
-SYSINIT(ena_rss_init, SI_SUB_KICK_SCHEDULER, SI_ORDER_SECOND, ena_rss_init_default_deferred, NULL);
-
static void
ena_config_host_info(struct ena_com_dev *ena_dev, device_t dev)
{
@@ -2838,7 +2724,8 @@ ena_config_host_info(struct ena_com_dev *ena_dev, device_t dev)
(DRV_MODULE_VER_SUBMINOR << ENA_ADMIN_HOST_INFO_SUB_MINOR_SHIFT);
host_info->num_cpus = mp_ncpus;
host_info->driver_supported_features =
- ENA_ADMIN_HOST_INFO_RX_OFFSET_MASK;
+ ENA_ADMIN_HOST_INFO_RX_OFFSET_MASK |
+ ENA_ADMIN_HOST_INFO_RSS_CONFIGURABLE_FUNCTION_KEY_MASK;
rc = ena_com_set_host_attributes(ena_dev);
if (unlikely(rc != 0)) {
@@ -3539,16 +3426,12 @@ ena_reset_task(void *arg, int pending)
{
struct ena_adapter *adapter = (struct ena_adapter *)arg;
- if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) {
- ena_log(adapter->pdev, WARN,
- "device reset scheduled but trigger_reset is off\n");
- return;
+ ENA_LOCK_LOCK();
+ if (likely(ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) {
+ ena_destroy_device(adapter, false);
+ ena_restore_device(adapter);
}
-
- ENA_LOCK_LOCK(adapter);
- ena_destroy_device(adapter, false);
- ena_restore_device(adapter);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
}
/**
@@ -3577,8 +3460,6 @@ ena_attach(device_t pdev)
adapter = device_get_softc(pdev);
adapter->pdev = pdev;
- ENA_LOCK_INIT(adapter);
-
/*
* Set up the timer service - driver is responsible for avoiding
* concurrency, as the callout won't be using any locking inside.
@@ -3820,19 +3701,19 @@ ena_detach(device_t pdev)
ether_ifdetach(adapter->ifp);
/* Stop timer service */
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
callout_drain(&adapter->timer_service);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
/* Release reset task */
while (taskqueue_cancel(adapter->reset_tq, &adapter->reset_task, NULL))
taskqueue_drain(adapter->reset_tq, &adapter->reset_task);
taskqueue_free(adapter->reset_tq);
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ena_down(adapter);
ena_destroy_device(adapter, true);
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
/* Restore unregistered sysctl queue nodes. */
ena_sysctl_update_queue_node_nb(adapter, adapter->num_io_queues,
@@ -3861,13 +3742,14 @@ ena_detach(device_t pdev)
ena_free_pci_resources(adapter);
+ if (adapter->rss_indir != NULL)
+ free(adapter->rss_indir, M_DEVBUF);
+
if (likely(ENA_FLAG_ISSET(ENA_FLAG_RSS_ACTIVE, adapter)))
ena_com_rss_destroy(ena_dev);
ena_com_delete_host_info(ena_dev);
- ENA_LOCK_DESTROY(adapter);
-
if_free(adapter->ifp);
free(ena_dev->bus, M_DEVBUF);
@@ -3933,6 +3815,20 @@ static void ena_notification(void *adapter_data,
}
}
+static void
+ena_lock_init(void *arg)
+{
+ ENA_LOCK_INIT();
+}
+SYSINIT(ena_lock_init, SI_SUB_LOCK, SI_ORDER_FIRST, ena_lock_init, NULL);
+
+static void
+ena_lock_uninit(void *arg)
+{
+ ENA_LOCK_DESTROY();
+}
+SYSUNINIT(ena_lock_uninit, SI_SUB_LOCK, SI_ORDER_FIRST, ena_lock_uninit, NULL);
+
/**
* This handler will called for unknown event group or unimplemented handlers
**/
diff --git a/sys/dev/ena/ena.h b/sys/dev/ena/ena.h
index 1d06a3cb56de..f559f9127c11 100644
--- a/sys/dev/ena/ena.h
+++ b/sys/dev/ena/ena.h
@@ -34,14 +34,14 @@
#ifndef ENA_H
#define ENA_H
-#include <sys/types.h>
+#include "opt_rss.h"
#include "ena-com/ena_com.h"
#include "ena-com/ena_eth_com.h"
#define DRV_MODULE_VER_MAJOR 2
#define DRV_MODULE_VER_MINOR 4
-#define DRV_MODULE_VER_SUBMINOR 0
+#define DRV_MODULE_VER_SUBMINOR 1
#define DRV_MODULE_NAME "ena"
@@ -123,6 +123,8 @@
#define ENA_IO_TXQ_IDX(q) (2 * (q))
#define ENA_IO_RXQ_IDX(q) (2 * (q) + 1)
+#define ENA_IO_TXQ_IDX_TO_COMBINED_IDX(q) ((q) / 2)
+#define ENA_IO_RXQ_IDX_TO_COMBINED_IDX(q) (((q) - 1) / 2)
#define ENA_MGMNT_IRQ_IDX 0
#define ENA_IO_IRQ_FIRST_IDX 1
@@ -201,7 +203,9 @@ struct ena_irq {
void *cookie;
unsigned int vector;
bool requested;
+#ifdef RSS
int cpu;
+#endif
char name[ENA_IRQNAME_SIZE];
};
@@ -214,7 +218,10 @@ struct ena_que {
struct taskqueue *cleanup_tq;
uint32_t id;
+#ifdef RSS
int cpu;
+ cpuset_t cpu_mask;
+#endif
struct sysctl_oid *oid;
};
@@ -281,19 +288,21 @@ struct ena_stats_tx {
counter_u64_t queue_wakeup;
counter_u64_t queue_stop;
counter_u64_t llq_buffer_copy;
+ counter_u64_t unmask_interrupt_num;
};
struct ena_stats_rx {
counter_u64_t cnt;
counter_u64_t bytes;
counter_u64_t refil_partial;
- counter_u64_t bad_csum;
+ counter_u64_t csum_bad;
counter_u64_t mjum_alloc_fail;
counter_u64_t mbuf_alloc_fail;
counter_u64_t dma_mapping_err;
counter_u64_t bad_desc_num;
counter_u64_t bad_req_id;
counter_u64_t empty_rx_ring;
+ counter_u64_t csum_good;
};
struct ena_ring {
@@ -402,8 +411,6 @@ struct ena_adapter {
struct resource *msix;
int msix_rid;
- struct sx global_lock;
-
/* MSI-X */
struct msix_entry *msix_entries;
int msix_vecs;
@@ -432,7 +439,7 @@ struct ena_adapter {
uint32_t buf_ring_size;
/* RSS*/
- uint8_t rss_ind_tbl[ENA_RX_RSS_TABLE_SIZE];
+ struct ena_indir *rss_indir;
uint8_t mac_addr[ETHER_ADDR_LEN];
/* mdio and phy*/
@@ -480,16 +487,21 @@ struct ena_adapter {
#define ENA_RING_MTX_LOCK(_ring) mtx_lock(&(_ring)->ring_mtx)
#define ENA_RING_MTX_TRYLOCK(_ring) mtx_trylock(&(_ring)->ring_mtx)
#define ENA_RING_MTX_UNLOCK(_ring) mtx_unlock(&(_ring)->ring_mtx)
+#define ENA_RING_MTX_ASSERT(_ring) \
+ mtx_assert(&(_ring)->ring_mtx, MA_OWNED)
-#define ENA_LOCK_INIT(adapter) \
- sx_init(&(adapter)->global_lock, "ENA global lock")
-#define ENA_LOCK_DESTROY(adapter) sx_destroy(&(adapter)->global_lock)
-#define ENA_LOCK_LOCK(adapter) sx_xlock(&(adapter)->global_lock)
-#define ENA_LOCK_UNLOCK(adapter) sx_unlock(&(adapter)->global_lock)
+#define ENA_LOCK_INIT() \
+ sx_init(&ena_global_lock, "ENA global lock")
+#define ENA_LOCK_DESTROY() sx_destroy(&ena_global_lock)
+#define ENA_LOCK_LOCK() sx_xlock(&ena_global_lock)
+#define ENA_LOCK_UNLOCK() sx_unlock(&ena_global_lock)
+#define ENA_LOCK_ASSERT() sx_assert(&ena_global_lock, SA_XLOCKED)
#define clamp_t(type, _x, min, max) min_t(type, max_t(type, _x, min), max)
#define clamp_val(val, lo, hi) clamp_t(__typeof(val), val, lo, hi)
+extern struct sx ena_global_lock;
+
static inline int ena_mbuf_count(struct mbuf *mbuf)
{
int count = 1;
diff --git a/sys/dev/ena/ena_datapath.c b/sys/dev/ena/ena_datapath.c
index 15bd09c489cf..0e6a6fe82038 100644
--- a/sys/dev/ena/ena_datapath.c
+++ b/sys/dev/ena/ena_datapath.c
@@ -36,6 +36,9 @@ __FBSDID("$FreeBSD$");
#ifdef DEV_NETMAP
#include "ena_netmap.h"
#endif /* DEV_NETMAP */
+#ifdef RSS
+#include <net/rss_config.h>
+#endif /* RSS */
/*********************************************************************
* Static functions prototypes
@@ -103,6 +106,7 @@ ena_cleanup(void *arg, int pending)
RX_IRQ_INTERVAL,
TX_IRQ_INTERVAL,
true);
+ counter_u64_add(tx_ring->tx_stats.unmask_interrupt_num, 1);
ena_com_unmask_intr(io_cq, &intr_reg);
}
@@ -128,6 +132,9 @@ ena_mq_start(if_t ifp, struct mbuf *m)
struct ena_ring *tx_ring;
int ret, is_drbr_empty;
uint32_t i;
+#ifdef RSS
+ uint32_t bucket_id;
+#endif
if (unlikely((if_getdrvflags(adapter->ifp) & IFF_DRV_RUNNING) == 0))
return (ENODEV);
@@ -139,7 +146,13 @@ ena_mq_start(if_t ifp, struct mbuf *m)
* It should improve performance.
*/
if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) {
- i = m->m_pkthdr.flowid % adapter->num_io_queues;
+#ifdef RSS
+ if (rss_hash2bucket(m->m_pkthdr.flowid, M_HASHTYPE_GET(m),
+ &bucket_id) == 0)
+ i = bucket_id % adapter->num_io_queues;
+ else
+#endif
+ i = m->m_pkthdr.flowid % adapter->num_io_queues;
} else {
i = curcpu % adapter->num_io_queues;
}
@@ -516,7 +529,7 @@ ena_rx_checksum(struct ena_ring *rx_ring, struct ena_com_rx_ctx *ena_rx_ctx,
ena_rx_ctx->l3_csum_err)) {
/* ipv4 checksum error */
mbuf->m_pkthdr.csum_flags = 0;
- counter_u64_add(rx_ring->rx_stats.bad_csum, 1);
+ counter_u64_add(rx_ring->rx_stats.csum_bad, 1);
ena_log_io(pdev, DBG, "RX IPv4 header checksum error\n");
return;
}
@@ -527,11 +540,12 @@ ena_rx_checksum(struct ena_ring *rx_ring, struct ena_com_rx_ctx *ena_rx_ctx,
if (ena_rx_ctx->l4_csum_err) {
/* TCP/UDP checksum error */
mbuf->m_pkthdr.csum_flags = 0;
- counter_u64_add(rx_ring->rx_stats.bad_csum, 1);
+ counter_u64_add(rx_ring->rx_stats.csum_bad, 1);
ena_log_io(pdev, DBG, "RX L4 checksum error\n");
} else {
mbuf->m_pkthdr.csum_flags = CSUM_IP_CHECKED;
mbuf->m_pkthdr.csum_flags |= CSUM_IP_VALID;
+ counter_u64_add(rx_ring->rx_stats.csum_good, 1);
}
}
}
@@ -801,6 +815,11 @@ ena_check_and_collapse_mbuf(struct ena_ring *tx_ring, struct mbuf **mbuf)
/* One segment must be reserved for configuration descriptor. */
if (num_frags < adapter->max_tx_sgl_size)
return (0);
+
+ if ((num_frags == adapter->max_tx_sgl_size) &&
+ ((*mbuf)->m_pkthdr.len < tx_ring->tx_max_header_size))
+ return (0);
+
counter_u64_add(tx_ring->tx_stats.collapse, 1);
collapsed_mbuf = m_collapse(*mbuf, M_NOWAIT,
@@ -962,6 +981,9 @@ ena_xmit_mbuf(struct ena_ring *tx_ring, struct mbuf **mbuf)
tx_info = &tx_ring->tx_buffer_info[req_id];
tx_info->num_of_bufs = 0;
+ ENA_WARN(tx_info->mbuf != NULL, adapter->ena_dev,
+ "mbuf isn't NULL for req_id %d\n", req_id);
+
rc = ena_tx_map_mbuf(tx_ring, tx_info, *mbuf, &push_hdr, &header_len);
if (unlikely(rc != 0)) {
ena_log_io(pdev, WARN, "Failed to map TX mbuf\n");
@@ -995,6 +1017,8 @@ ena_xmit_mbuf(struct ena_ring *tx_ring, struct mbuf **mbuf)
tx_ring->que->id);
} else {
ena_log(pdev, ERR, "failed to prepare tx bufs\n");
+ ena_trigger_reset(adapter,
+ ENA_REGS_RESET_DRIVER_INVALID_STATE);
}
counter_u64_add(tx_ring->tx_stats.prepare_ctx_err, 1);
goto dma_error;
@@ -1065,6 +1089,8 @@ ena_start_xmit(struct ena_ring *tx_ring)
int ena_qid;
int ret = 0;
+ ENA_RING_MTX_ASSERT(tx_ring);
+
if (unlikely((if_getdrvflags(adapter->ifp) & IFF_DRV_RUNNING) == 0))
return;
diff --git a/sys/dev/ena/ena_netmap.c b/sys/dev/ena/ena_netmap.c
index daed81986f13..1525b1efd954 100644
--- a/sys/dev/ena/ena_netmap.c
+++ b/sys/dev/ena/ena_netmap.c
@@ -278,7 +278,7 @@ ena_netmap_reg(struct netmap_adapter *na, int onoff)
enum txrx t;
int rc, i;
- ENA_LOCK_LOCK(adapter);
+ ENA_LOCK_LOCK();
ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_TRIGGER_RESET, adapter);
ena_down(adapter);
@@ -315,7 +315,7 @@ ena_netmap_reg(struct netmap_adapter *na, int onoff)
ENA_FLAG_SET_ATOMIC(ENA_FLAG_DEV_UP_BEFORE_RESET, adapter);
rc = ena_restore_device(adapter);
}
- ENA_LOCK_UNLOCK(adapter);
+ ENA_LOCK_UNLOCK();
return (rc);
}
@@ -426,6 +426,7 @@ ena_netmap_tx_frame(struct ena_netmap_ctx *ctx)
ena_tx_ctx.num_bufs = tx_info->num_of_bufs;
ena_tx_ctx.req_id = req_id;
ena_tx_ctx.header_len = header_len;
+ ena_tx_ctx.meta_valid = adapter->disable_meta_caching;
/* There are no offloads, as netmap doesn't support them */
@@ -444,6 +445,8 @@ ena_netmap_tx_frame(struct ena_netmap_ctx *ctx)
} else {
ena_log_nm(adapter->pdev, ERR,
"Failed to prepare Tx bufs\n");
+ ena_trigger_reset(adapter,
+ ENA_REGS_RESET_DRIVER_INVALID_STATE);
}
counter_u64_add(tx_ring->tx_stats.prepare_ctx_err, 1);
diff --git a/sys/dev/ena/ena_rss.c b/sys/dev/ena/ena_rss.c
new file mode 100644
index 000000000000..116eaa425b01
--- /dev/null
+++ b/sys/dev/ena/ena_rss.c
@@ -0,0 +1,300 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2015-2021 Amazon.com, Inc. or its affiliates.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ *
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include "opt_rss.h"
+
+#include "ena_rss.h"
+
+/*
+ * This function should generate a unique key for the whole driver.
+ * If the key was already generated by a previous call (for example
+ * for another adapter), then that key is returned instead.
+ */
+void
+ena_rss_key_fill(void *key, size_t size)
+{
+ static bool key_generated;
+ static uint8_t default_key[ENA_HASH_KEY_SIZE];
+
+ KASSERT(size <= ENA_HASH_KEY_SIZE, ("Requested more bytes than ENA RSS key can hold"));
+
+ if (!key_generated) {
+ arc4random_buf(default_key, ENA_HASH_KEY_SIZE);
+ key_generated = true;
+ }
+
+ memcpy(key, default_key, size);
+}
+
+/*
+ * ENA HW expects the key to be in reverse-byte order.
+ */
+static void
+ena_rss_reorder_hash_key(u8 *reordered_key, const u8 *key, size_t key_size)
+{
+ int i;
+
+ key = key + key_size - 1;
+
+ for (i = 0; i < key_size; ++i)
+ *reordered_key++ = *key--;
+}
+
+int ena_rss_set_hash(struct ena_com_dev *ena_dev, const u8 *key)
+{
+ enum ena_admin_hash_functions ena_func = ENA_ADMIN_TOEPLITZ;
+ u8 hw_key[ENA_HASH_KEY_SIZE];
+
+ ena_rss_reorder_hash_key(hw_key, key, ENA_HASH_KEY_SIZE);
+
+ return (ena_com_fill_hash_function(ena_dev, ena_func, hw_key,
+ ENA_HASH_KEY_SIZE, 0x0));
+}
+
+int ena_rss_get_hash_key(struct ena_com_dev *ena_dev, u8 *key)
+{
+ u8 hw_key[ENA_HASH_KEY_SIZE];
+ int rc;
+
+ rc = ena_com_get_hash_key(ena_dev, hw_key);
+ if (rc != 0)
+ return (rc);
+
+ ena_rss_reorder_hash_key(key, hw_key, ENA_HASH_KEY_SIZE);
+
+ return (0);
+}
+
+static int
+ena_rss_init_default(struct ena_adapter *adapter)
+{
+ struct ena_com_dev *ena_dev = adapter->ena_dev;
+ device_t dev = adapter->pdev;
+ int qid, rc, i;
+
+ rc = ena_com_rss_init(ena_dev, ENA_RX_RSS_TABLE_LOG_SIZE);
+ if (unlikely(rc != 0)) {
+ ena_log(dev, ERR, "Cannot init indirect table\n");
+ return (rc);
+ }
+
+ for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++) {
+#ifdef RSS
+ qid = rss_get_indirection_to_bucket(i) % adapter->num_io_queues;
+#else
+ qid = i % adapter->num_io_queues;
+#endif
+ rc = ena_com_indirect_table_fill_entry(ena_dev, i,
+ ENA_IO_RXQ_IDX(qid));
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
+ ena_log(dev, ERR, "Cannot fill indirect table\n");
+ goto err_rss_destroy;
+ }
+ }
+
+
+#ifdef RSS
+ uint8_t rss_algo = rss_gethashalgo();
+ if (rss_algo == RSS_HASH_TOEPLITZ) {
+ uint8_t hash_key[RSS_KEYSIZE];
+
+ rss_getkey(hash_key);
+ rc = ena_rss_set_hash(ena_dev, hash_key);
+ } else
+#endif
+ rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_TOEPLITZ, NULL,
+ ENA_HASH_KEY_SIZE, 0x0);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
+ ena_log(dev, ERR, "Cannot fill hash function\n");
+ goto err_rss_destroy;
+ }
+
+ rc = ena_com_set_default_hash_ctrl(ena_dev);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
+ ena_log(dev, ERR, "Cannot fill hash control\n");
+ goto err_rss_destroy;
+ }
+
+ rc = ena_rss_indir_init(adapter);
+
+ return (rc == EOPNOTSUPP ? 0 : rc);
+
+err_rss_destroy:
+ ena_com_rss_destroy(ena_dev);
+ return (rc);
+}
+
+/* Configure the Rx forwarding */
+int
+ena_rss_configure(struct ena_adapter *adapter)
+{
+ struct ena_com_dev *ena_dev = adapter->ena_dev;
+ int rc;
+
+ /* In case the RSS table was destroyed */
+ if (!ena_dev->rss.tbl_log_size) {
+ rc = ena_rss_init_default(adapter);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) {
+ ena_log(adapter->pdev, ERR,
+ "WARNING: RSS was not properly re-initialized,"
+ " it will affect bandwidth\n");
+ ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
+ return (rc);
+ }
+ }
+
+ /* Set indirect table */
+ rc = ena_com_indirect_table_set(ena_dev);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
+ return (rc);
+
+ /* Configure hash function (if supported) */
+ rc = ena_com_set_hash_function(ena_dev);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
+ return (rc);
+
+ /* Configure hash inputs (if supported) */
+ rc = ena_com_set_hash_ctrl(ena_dev);
+ if (unlikely((rc != 0) && (rc != EOPNOTSUPP)))
+ return (rc);
+
+ return (0);
+}
+
+static void
+ena_rss_init_default_deferred(void *arg)
+{
+ struct ena_adapter *adapter;
+ devclass_t dc;
+ int max;
+ int rc;
+
+ dc = devclass_find("ena");
+ if (unlikely(dc == NULL)) {
+ ena_log_raw(ERR, "SYSINIT: %s: No devclass ena\n", __func__);
+ return;
+ }
+
+ max = devclass_get_maxunit(dc);
+ while (max-- >= 0) {
+ adapter = devclass_get_softc(dc, max);
+ if (adapter != NULL) {
+ rc = ena_rss_init_default(adapter);
+ ENA_FLAG_SET_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
+ if (unlikely(rc != 0)) {
+ ena_log(adapter->pdev, WARN,
+ "WARNING: RSS was not properly initialized,"
+ " it will affect bandwidth\n");
+ ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter);
+ }
+ }
+ }
+}
+SYSINIT(ena_rss_init, SI_SUB_KICK_SCHEDULER, SI_ORDER_SECOND, ena_rss_init_default_deferred, NULL);
+
+int
+ena_rss_indir_get(struct ena_adapter *adapter, uint32_t *table)
+{
+ int rc, i;
+
+ rc = ena_com_indirect_table_get(adapter->ena_dev, table);
+ if (rc != 0) {
+ if (rc == EOPNOTSUPP)
+ device_printf(adapter->pdev,
+ "Reading from indirection table not supported\n");
+ else
+ device_printf(adapter->pdev,
+ "Unable to get indirection table\n");
+ return (rc);
+ }
+
+ for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; ++i)
+ table[i] = ENA_IO_RXQ_IDX_TO_COMBINED_IDX(table[i]);
+
+ return (0);
+}
+
+int
+ena_rss_indir_set(struct ena_adapter *adapter, uint32_t *table)
+{
+ int rc, i;
+
+ for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; ++i) {
+ rc = ena_com_indirect_table_fill_entry(adapter->ena_dev, i,
+ ENA_IO_RXQ_IDX(table[i]));
+ if (rc != 0) {
+ device_printf(adapter->pdev,
+ "Cannot fill indirection table entry %d\n", i);
+ return (rc);
+ }
+ }
+
+ rc = ena_com_indirect_table_set(adapter->ena_dev);
+ if (rc == EOPNOTSUPP)
+ device_printf(adapter->pdev,
+ "Writing to indirection table not supported\n");
+ else if (rc != 0)
+ device_printf(adapter->pdev,
+ "Cannot set indirection table\n");
+
+ return (rc);
+}
+
+int
+ena_rss_indir_init(struct ena_adapter *adapter)
+{
+ struct ena_indir *indir = adapter->rss_indir;
+ int rc;
+
+ if (indir == NULL) {
+ adapter->rss_indir = indir = malloc(sizeof(struct ena_indir),
+ M_DEVBUF, M_WAITOK | M_ZERO);
+ if (indir == NULL)
+ return (ENOMEM);
+ }
+
+ rc = ena_rss_indir_get(adapter, indir->table);
+ if (rc != 0) {
+ free(adapter->rss_indir, M_DEVBUF);
+ adapter->rss_indir = NULL;
+
+ return (rc);
+ }
+
+ ena_rss_copy_indir_buf(indir->sysctl_buf, indir->table);
+
+ return (0);
+}
diff --git a/sys/dev/ena/ena_rss.h b/sys/dev/ena/ena_rss.h
new file mode 100644
index 000000000000..42bec6fb2aa6
--- /dev/null
+++ b/sys/dev/ena/ena_rss.h
@@ -0,0 +1,73 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2015-2021 Amazon.com, Inc. or its affiliates.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ *
+ */
+
+#ifndef ENA_RSS_H
+#define ENA_RSS_H
+
+#include "opt_rss.h"
+
+#include <sys/types.h>
+
+#ifdef RSS
+#include <net/rss_config.h>
+#endif
+
+#include "ena.h"
+
+#define ENA_RX_RSS_MSG_RECORD_SZ 8
+
+struct ena_indir {
+ uint32_t table[ENA_RX_RSS_TABLE_SIZE];
+ /* This is the buffer wired to `rss.indir_table` sysctl. */
+ char sysctl_buf[ENA_RX_RSS_TABLE_SIZE * ENA_RX_RSS_MSG_RECORD_SZ];
+};
+
+int ena_rss_set_hash(struct ena_com_dev *ena_dev, const u8 *key);
+int ena_rss_get_hash_key(struct ena_com_dev *ena_dev, u8 *key);
+int ena_rss_configure(struct ena_adapter *);
+int ena_rss_indir_get(struct ena_adapter *adapter, uint32_t *table);
+int ena_rss_indir_set(struct ena_adapter *adapter, uint32_t *table);
+int ena_rss_indir_init(struct ena_adapter *adapter);
+
+static inline void
+ena_rss_copy_indir_buf(char *buf, uint32_t *table)
+{
+ int i;
+
+ for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; ++i) {
+ buf += snprintf(buf, ENA_RX_RSS_MSG_RECORD_SZ + 1,
+ "%s%d:%d", i == 0 ? "" : " ", i, table[i]);
+ }
+}
+
+#endif /* !(ENA_RSS_H) */
diff --git a/sys/dev/ena/ena_sysctl.c b/sys/dev/ena/ena_sysctl.c
index c9a5419811a8..7337f6578e68 100644
--- a/sys/dev/ena/ena_sysctl.c
+++ b/sys/dev/ena/ena_sysctl.c
@@ -31,19 +31,31 @@
#include <sys/param.h>
__FBSDID("$FreeBSD$");
+#include "opt_rss.h"
+
#include "ena_sysctl.h"
+#include "ena_rss.h"
static void ena_sysctl_add_wd(struct ena_adapter *);
static void ena_sysctl_add_stats(struct ena_adapter *);
static void ena_sysctl_add_eni_metrics(struct ena_adapter *);
static void ena_sysctl_add_tuneables(struct ena_adapter *);
+/* Kernel option RSS prevents manipulation of hash key and indirection table. */
+#ifndef RSS
+static void ena_sysctl_add_rss(struct ena_adapter *);
+#endif
static int ena_sysctl_buf_ring_size(SYSCTL_HANDLER_ARGS);
static int ena_sysctl_rx_queue_size(SYSCTL_HANDLER_ARGS);
static int ena_sysctl_io_queues_nb(SYSCTL_HANDLER_ARGS);
static int ena_sysctl_eni_metrics_interval(SYSCTL_HANDLER_ARGS);
+#ifndef RSS
+static int ena_sysctl_rss_key(SYSCTL_HANDLER_ARGS);
+static int ena_sysctl_rss_indir_table(SYSCTL_HANDLER_ARGS);
+#endif
/* Limit max ENI sample rate to be an hour. */
#define ENI_METRICS_MAX_SAMPLE_INTERVAL 3600
+#define ENA_HASH_KEY_MSG_SIZE (ENA_HASH_KEY_SIZE * 2 + 1)
static SYSCTL_NODE(_hw, OID_AUTO, ena, CTLFLAG_RD | CTLFLAG_MPSAFE, 0,
"ENA driver parameters");
@@ -83,6 +95,8 @@ SYSCTL_BOOL(_hw_ena, OID_AUTO, force_large_llq_header, CTLFLAG_RDTUN,
&ena_force_large_llq_header, 0,
"Increases maximum supported header size in LLQ mode to 224 bytes, while reducing the maximum Tx queue size by half.\n");
+int ena_rss_table_size = ENA_RX_RSS_TABLE_SIZE;
+
void
ena_sysctl_add_nodes(struct ena_adapter *adapter)
{
@@ -90,6 +104,9 @@ ena_sysctl_add_nodes(struct ena_adapter *adapter)
ena_sysctl_add_stats(adapter);
ena_sysctl_add_eni_metrics(adapter);
ena_sysctl_add_tuneables(adapter);
+#ifndef RSS
+ ena_sysctl_add_rss(adapter);
+#endif
}
static void
@@ -238,6 +255,10 @@ ena_sysctl_add_stats(struct ena_adapter *adapter)
"llq_buffer_copy", CTLFLAG_RD,
&tx_stats->llq_buffer_copy,
"Header copies for llq transaction");
+ SYSCTL_ADD_COUNTER_U64(ctx, tx_list, OID_AUTO,
+ "unmask_interrupt_num", CTLFLAG_RD,
+ &tx_stats->unmask_interrupt_num,
+ "Unmasked interrupt count");
/* RX specific stats */
rx_node = SYSCTL_ADD_NODE(ctx, queue_list, OID_AUTO,
@@ -256,8 +277,8 @@ ena_sysctl_add_stats(struct ena_adapter *adapter)
"refil_partial", CTLFLAG_RD,
&rx_stats->refil_partial, "Partial refilled mbufs");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
- "bad_csum", CTLFLAG_RD,
- &rx_stats->bad_csum, "Bad RX checksum");
+ "csum_bad", CTLFLAG_RD,
+ &rx_stats->csum_bad, "Bad RX checksum");
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"mbuf_alloc_fail", CTLFLAG_RD,
&rx_stats->mbuf_alloc_fail, "Failed mbuf allocs");
@@ -276,6 +297,9 @@ ena_sysctl_add_stats(struct ena_adapter *adapter)
SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
"empty_rx_ring", CTLFLAG_RD,
&rx_stats->empty_rx_ring, "RX descriptors depletion count");
+ SYSCTL_ADD_COUNTER_U64(ctx, rx_list, OID_AUTO,
+ "csum_good", CTLFLAG_RD,
+ &rx_stats->csum_good, "Valid RX checksum calculations");
}
/* Stats read from device */
@@ -398,6 +422,45 @@ ena_sysctl_add_tuneables(struct ena_adapter *adapter)
ena_sysctl_io_queues_nb, "I", "Number of IO queues.");
}
+/* Kernel option RSS prevents manipulation of hash key and indirection table. */
+#ifndef RSS
+static void
+ena_sysctl_add_rss(struct ena_adapter *adapter)
+{
+ device_t dev;
+
+ struct sysctl_ctx_list *ctx;
+ struct sysctl_oid *tree;
+ struct sysctl_oid_list *child;
+
+ dev = adapter->pdev;
+
+ ctx = device_get_sysctl_ctx(dev);
+ tree = device_get_sysctl_tree(dev);
+ child = SYSCTL_CHILDREN(tree);
+
+ /* RSS options */
+ tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "rss",
+ CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, "Receive Side Scaling options.");
+ child = SYSCTL_CHILDREN(tree);
+
+ /* RSS hash key */
+ SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "key",
+ CTLTYPE_STRING | CTLFLAG_RW | CTLFLAG_MPSAFE, adapter, 0,
+ ena_sysctl_rss_key, "A", "RSS key.");
+
+ /* Tuneable RSS indirection table */
+ SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "indir_table",
+ CTLTYPE_STRING | CTLFLAG_RW | CTLFLAG_MPSAFE, adapter, 0,
+ ena_sysctl_rss_indir_table, "A", "RSS indirection table.");
+
+ /* RSS indirection table size */
+ SYSCTL_ADD_INT(ctx, child, OID_AUTO, "indir_table_size",
+ CTLFLAG_RD | CTLFLAG_MPSAFE, &ena_rss_table_size, 0,
+ "RSS indirection table size.");
+}
+#endif /* !RSS */
+
/*
* ena_sysctl_update_queue_node_nb - Register/unregister sysctl queue nodes.
@@ -442,6 +505,12 @@ ena_sysctl_buf_ring_size(SYSCTL_HANDLER_ARGS)
uint32_t val;
int error;
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
val = 0;
error = sysctl_wire_old_buffer(req, sizeof(val));
if (error == 0) {
@@ -449,13 +518,14 @@ ena_sysctl_buf_ring_size(SYSCTL_HANDLER_ARGS)
error = sysctl_handle_32(oidp, &val, 0, req);
}
if (error != 0 || req->newptr == NULL)
- return (error);
+ goto unlock;
if (!powerof2(val) || val == 0) {
ena_log(adapter->pdev, ERR,
"Requested new Tx buffer ring size (%u) is not a power of 2\n",
val);
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
if (val != adapter->buf_ring_size) {
@@ -470,6 +540,9 @@ ena_sysctl_buf_ring_size(SYSCTL_HANDLER_ARGS)
adapter->buf_ring_size);
}
+unlock:
+ ENA_LOCK_UNLOCK();
+
return (error);
}
@@ -480,6 +553,12 @@ ena_sysctl_rx_queue_size(SYSCTL_HANDLER_ARGS)
uint32_t val;
int error;
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
val = 0;
error = sysctl_wire_old_buffer(req, sizeof(val));
if (error == 0) {
@@ -487,13 +566,14 @@ ena_sysctl_rx_queue_size(SYSCTL_HANDLER_ARGS)
error = sysctl_handle_32(oidp, &val, 0, req);
}
if (error != 0 || req->newptr == NULL)
- return (error);
+ goto unlock;
if (val < ENA_MIN_RING_SIZE || val > adapter->max_rx_ring_size) {
ena_log(adapter->pdev, ERR,
"Requested new Rx queue size (%u) is out of range: [%u, %u]\n",
val, ENA_MIN_RING_SIZE, adapter->max_rx_ring_size);
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
/* Check if the parameter is power of 2 */
@@ -501,7 +581,8 @@ ena_sysctl_rx_queue_size(SYSCTL_HANDLER_ARGS)
ena_log(adapter->pdev, ERR,
"Requested new Rx queue size (%u) is not a power of 2\n",
val);
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
if (val != adapter->requested_rx_ring_size) {
@@ -517,6 +598,9 @@ ena_sysctl_rx_queue_size(SYSCTL_HANDLER_ARGS)
adapter->requested_rx_ring_size);
}
+unlock:
+ ENA_LOCK_UNLOCK();
+
return (error);
}
@@ -530,18 +614,25 @@ ena_sysctl_io_queues_nb(SYSCTL_HANDLER_ARGS)
uint32_t old_num_queues, tmp = 0;
int error;
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
error = sysctl_wire_old_buffer(req, sizeof(tmp));
if (error == 0) {
tmp = adapter->num_io_queues;
error = sysctl_handle_int(oidp, &tmp, 0, req);
}
if (error != 0 || req->newptr == NULL)
- return (error);
+ goto unlock;
if (tmp == 0) {
ena_log(adapter->pdev, ERR,
"Requested number of IO queues is zero\n");
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
/*
@@ -555,7 +646,8 @@ ena_sysctl_io_queues_nb(SYSCTL_HANDLER_ARGS)
ena_log(adapter->pdev, ERR,
"Requested number of IO queues is higher than maximum "
"allowed (%u)\n", adapter->msix_vecs - ENA_ADMIN_MSIX_VEC);
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
if (tmp == adapter->num_io_queues) {
ena_log(adapter->pdev, ERR,
@@ -574,6 +666,9 @@ ena_sysctl_io_queues_nb(SYSCTL_HANDLER_ARGS)
ena_sysctl_update_queue_node_nb(adapter, old_num_queues, tmp);
}
+unlock:
+ ENA_LOCK_UNLOCK();
+
return (error);
}
@@ -584,19 +679,26 @@ ena_sysctl_eni_metrics_interval(SYSCTL_HANDLER_ARGS)
uint16_t interval;
int error;
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
error = sysctl_wire_old_buffer(req, sizeof(interval));
if (error == 0) {
interval = adapter->eni_metrics_sample_interval;
error = sysctl_handle_16(oidp, &interval, 0, req);
}
if (error != 0 || req->newptr == NULL)
- return (error);
+ goto unlock;
if (interval > ENI_METRICS_MAX_SAMPLE_INTERVAL) {
ena_log(adapter->pdev, ERR,
"ENI metrics update interval is out of range - maximum allowed value: %d seconds\n",
ENI_METRICS_MAX_SAMPLE_INTERVAL);
- return (EINVAL);
+ error = EINVAL;
+ goto unlock;
}
if (interval == 0) {
@@ -611,5 +713,208 @@ ena_sysctl_eni_metrics_interval(SYSCTL_HANDLER_ARGS)
adapter->eni_metrics_sample_interval = interval;
+unlock:
+ ENA_LOCK_UNLOCK();
+
- return (0);
+ return (error);
}
+
+#ifndef RSS
+/*
+ * Change the Receive Side Scaling hash key.
+ */
+static int
+ena_sysctl_rss_key(SYSCTL_HANDLER_ARGS)
+{
+ struct ena_adapter *adapter = arg1;
+ struct ena_com_dev *ena_dev = adapter->ena_dev;
+ enum ena_admin_hash_functions ena_func;
+ char msg[ENA_HASH_KEY_MSG_SIZE];
+ char elem[3] = { 0 };
+ char *endp;
+ u8 rss_key[ENA_HASH_KEY_SIZE];
+ int error, i;
+
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_RSS_ACTIVE, adapter))) {
+ error = ENOTSUP;
+ goto unlock;
+ }
+
+ error = sysctl_wire_old_buffer(req, sizeof(msg));
+ if (error != 0)
+ goto unlock;
+
+ error = ena_com_get_hash_function(adapter->ena_dev, &ena_func);
+ if (error != 0) {
+ device_printf(adapter->pdev, "Cannot get hash function\n");
+ goto unlock;
+ }
+
+ if (ena_func != ENA_ADMIN_TOEPLITZ) {
+ error = EINVAL;
+ device_printf(adapter->pdev, "Unsupported hash algorithm\n");
+ goto unlock;
+ }
+
+ error = ena_rss_get_hash_key(ena_dev, rss_key);
+ if (error != 0) {
+ device_printf(adapter->pdev, "Cannot get hash key\n");
+ goto unlock;
+ }
+
+ for (i = 0; i < ENA_HASH_KEY_SIZE; ++i)
+ snprintf(&msg[i * 2], 3, "%02x", rss_key[i]);
+
+ error = sysctl_handle_string(oidp, msg, sizeof(msg), req);
+ if (error != 0 || req->newptr == NULL)
+ goto unlock;
+
+ if (strlen(msg) != sizeof(msg) - 1) {
+ error = EINVAL;
+ device_printf(adapter->pdev, "Invalid key size\n");
+ goto unlock;
+ }
+
+ for (i = 0; i < ENA_HASH_KEY_SIZE; ++i) {
+ strncpy(elem, &msg[i * 2], 2);
+ rss_key[i] = strtol(elem, &endp, 16);
+
+ /* Both hex nibbles in the string must be valid to continue. */
+ if (endp == elem || *endp != '\0' || rss_key[i] < 0) {
+ error = EINVAL;
+ device_printf(adapter->pdev,
+ "Invalid key hex value: '%c'\n", *endp);
+ goto unlock;
+ }
+ }
+
+ error = ena_rss_set_hash(ena_dev, rss_key);
+ if (error != 0)
+ device_printf(adapter->pdev, "Cannot fill hash key\n");
+
+unlock:
+ ENA_LOCK_UNLOCK();
+
+ return (error);
+}
+
+/*
+ * Change the Receive Side Scaling indirection table.
+ *
+ * The sysctl entry string consists of one or more `x:y` keypairs, where
+ * x stands for the table index and y for its new value.
+ * Table indices that don't need to be updated can be omitted from the string
+ * and will retain their existing values. If an index is entered more than once,
+ * the last value is used.
+ *
+ * Example:
+ * To update two selected indices in the RSS indirection table, e.g. setting
+ * index 0 to queue 5 and then index 5 to queue 0, the below command should be
+ * used:
+ * sysctl dev.ena.0.rss.indir_table="0:5 5:0"
+ */
+static int
+ena_sysctl_rss_indir_table(SYSCTL_HANDLER_ARGS)
+{
+ int num_queues, error;
+ struct ena_adapter *adapter = arg1;
+ struct ena_com_dev *ena_dev;
+ struct ena_indir *indir;
+ char *msg, *buf, *endp;
+ uint32_t idx, value;
+
+ ENA_LOCK_LOCK();
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_DEVICE_RUNNING, adapter))) {
+ error = EINVAL;
+ goto unlock;
+ }
+
+ if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_RSS_ACTIVE, adapter))) {
+ error = ENOTSUP;
+ goto unlock;
+ }
+
+ ena_dev = adapter->ena_dev;
+ indir = adapter->rss_indir;
+
+ if (unlikely(indir == NULL)) {
+ error = ENOTSUP;
+ goto unlock;
+ }
+ msg = indir->sysctl_buf;
+
+ error = sysctl_handle_string(oidp, msg, sizeof(indir->sysctl_buf), req);
+ if (error != 0 || req->newptr == NULL)
+ goto unlock;
+
+ num_queues = adapter->num_io_queues;
+
+ /*
+ * This sysctl expects msg to be a list of `x:y` record pairs,
+ * where x is the indirection table index and y is its value.
+ */
+ for (buf = msg; *buf != '\0'; buf = endp) {
+ idx = strtol(buf, &endp, 10);
+
+ if (endp == buf || idx < 0) {
+ device_printf(adapter->pdev, "Invalid index: %s\n",
+ buf);
+ error = EINVAL;
+ break;
+ }
+
+ if (idx >= ENA_RX_RSS_TABLE_SIZE) {
+ device_printf(adapter->pdev, "Index %d out of range\n",
+ idx);
+ error = ERANGE;
+ break;
+ }
+
+ buf = endp;
+
+ if (*buf++ != ':') {
+ device_printf(adapter->pdev, "Missing ':' separator\n");
+ error = EINVAL;
+ break;
+ }
+
+ value = strtol(buf, &endp, 10);
+
+ if (endp == buf || value < 0) {
+ device_printf(adapter->pdev, "Invalid value: %s\n",
+ buf);
+ error = EINVAL;
+ break;
+ }
+
+ if (value >= num_queues) {
+ device_printf(adapter->pdev, "Value %d out of range\n",
+ value);
+ error = ERANGE;
+ break;
+ }
+
+ indir->table[idx] = value;
+ }
+
+ if (error != 0) /* Reload indirection table with last good data. */
+ ena_rss_indir_get(adapter, indir->table);
+
+ /* At this point msg has been clobbered by sysctl_handle_string. */
+ ena_rss_copy_indir_buf(msg, indir->table);
+
+ if (error == 0)
+ error = ena_rss_indir_set(adapter, indir->table);
+
+unlock:
+ ENA_LOCK_UNLOCK();
+
+ return (error);
+}
+#endif /* !RSS */
diff --git a/sys/modules/ena/Makefile b/sys/modules/ena/Makefile
index c6a0c56e7ffb..d619b5d7fa56 100644
--- a/sys/modules/ena/Makefile
+++ b/sys/modules/ena/Makefile
@@ -35,7 +35,7 @@
KMOD = if_ena
SRCS = ena_com.c ena_eth_com.c
-SRCS += ena.c ena_sysctl.c ena_datapath.c ena_netmap.c
+SRCS += ena.c ena_sysctl.c ena_datapath.c ena_netmap.c ena_rss.c
SRCS += device_if.h bus_if.h pci_if.h
SRCS += opt_rss.h
CFLAGS += -I${SRCTOP}/sys/contrib