Save context switch per I/O for iSCSI and IOCTL frontends.
Introduce a new CTL core KPI, ctl_run(), which preprocesses I/Os in the caller's context instead of scheduling another thread just for that. This call may sleep, which is not acceptable for some frontends, such as the original CAM/FC one. But iSCSI already has separate sleepable per-connection RX threads, so scheduling another thread is mostly a waste of time. The IOCTL frontend actually waits for I/O completion in the caller thread, so using another thread there makes even less sense.

With this change I can measure ~5% IOPS improvement on 4KB iSCSI I/Os to ZFS.

MFC after: 1 month
Here is a roadmap of some of the primary functions in ctl.c. Starting here
and following the various leaf functions will show the command flow.
-ctl_queue() This is where commands from the frontend ports come
+ctl_queue() / ctl_run() This is where commands from the frontend ports come
ctl_queue_sense() This is only used for non-packetized SCSI. i.e.