Wasted Processing Time Due to NVMe Interrupts

https://github.com/scylladb/seastar/issues/507

In these post-Meltdown days, interrupts are expensive. The default NVMe configuration does not coalesce interrupts, and because NVMe completions typically arrive far more often than once per task-quota-ms, we see one interrupt per completion, with no batching.

The following command coalesces up to 10 NVMe interrupts over a period of 200 usec:

$ sudo nvme set-feature /dev/nvme0n1 --feature-id 8 --value 522
set-feature:08 (Interrupt Coalescing), value:0x00020a
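
The number passed to --value is a packed field, not a plain count: 522 is 0x020a, with the interrupt count in the low byte and the time (in units of 100 usec) in the next byte. A minimal sketch of computing it for other settings follows. The helper name nvme_coalesce_value is hypothetical and is not part of nvme-cli; it only does the arithmetic and assumes both fields fit in one byte.

```shell
# Hypothetical helper (not part of nvme-cli): pack the two coalescing
# fields into the number passed to --value.
#   $1: number of interrupts to coalesce (0-255, low byte)
#   $2: coalescing time in units of 100 usec (0-255, next byte)
nvme_coalesce_value() {
    for field in "$1" "$2"; do
        # Reject empty or non-numeric arguments.
        case $field in
            ''|*[!0-9]*) echo "expected two integers in 0-255" >&2; return 1 ;;
        esac
        # Each field must fit in a single byte.
        if [ "$field" -gt 255 ]; then
            echo "expected two integers in 0-255" >&2
            return 1
        fi
    done
    echo $(( ($2 << 8) | $1 ))
}

# 10 interrupts, 2 x 100 usec: the value used in the command above.
nvme_coalesce_value 10 2
```

The result can be substituted directly into the command, for example --value "$(nvme_coalesce_value 10 2)".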

The low byte (0x0a) specifies the number of interrupts to coalesce; the second byte (0x02) specifies how long to coalesce, in units of 100 usec. I verified that it works on my machine:

 0  1      0 17624724 707592 6554924    0    0 29212     0 7865 14509  2  4 75 19  0
 0  1      0 17580412 748492 6562684    0    0 40900     0 8206 15738  3  5 74 19  0
 0  1      0 17538508 782944 6570208    0    0 34452     0 8203 15118  3  5 74 19  0
 0  1      0 17513640 801532 6576968    0    0 18588     0 6379 11206  2  3 75 19  0
 0  1      0 17418400 836392 6635976    0    0 34776 26880 11401 19015  6  7 69 19  0
 1  1      0 17372184 872704 6647284    0    0 36312     0 11340 21723  5  7 71 17  0
 0  1      0 17349952 908576 6632780    0    0 35872     0 11164 19146  6  7 70 18  0
 0  1      0 17295972 951260 6643108    0    0 42684     0 12694 20619  3  5 74 18  0

At the beginning of the run coalescing was enabled; towards the end I disabled it, and the interrupt rate (11th column) went up from roughly 6,400-8,200 to 11,200-12,700 per sample.
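
The figures above count every interrupt on the system, so other devices add noise. One way to isolate the NVMe share is to sum the per-CPU counters in /proc/interrupts for the NVMe queue lines. The sketch below assumes those lines contain the string "nvme" (for example nvme0q1), which may differ between kernels and drivers; the helper name sum_nvme_irqs is hypothetical.

```shell
# Hypothetical helper: total interrupt count, across all CPUs, for every
# line whose text contains "nvme". Reads /proc/interrupts by default; a
# file in the same format can be passed as $1.
sum_nvme_irqs() {
    awk 'NR == 1 { ncpu = NF; next }   # header row: one column per CPU
         /nvme/  {
             # Fields 2..ncpu+1 hold the per-CPU counters.
             for (i = 2; i <= ncpu + 1; i++)
                 if ($i ~ /^[0-9]+$/) total += $i
         }
         END     { printf "%.0f\n", total }' "${1:-/proc/interrupts}"
}

# Example: NVMe interrupts over a one-second window.
# before=$(sum_nvme_irqs); sleep 1; after=$(sum_nvme_irqs)
# echo "nvme interrupts/sec: $((after - before))"
```

Sampling this before and after changing the coalescing setting gives a per-device comparison that does not depend on what else the machine is doing.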
