Is It Possible to Change Which Core Timer Interrupts Happen On

https://stackoverflow.com/questions/45472215/is-it-possible-to-change-which-core-timer-interrupts-happen-on

On my Debian 8 system, when I run the command watch -n0.1 --no-title cat /proc/interrupts, I get the output below.

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:         46          0          0      10215          0          0          0          0   IO-APIC-edge      timer
  1:          1          0          0          2          0          0          0          0   IO-APIC-edge      i8042
  8:          0          0          0          1          0          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0          0          4          0          0          0          0   IO-APIC-edge      i8042
 18:          0          0          0          0          8          0          0          0   IO-APIC-fasteoi   i801_smbus
 19:       7337          0          0          0          0          0          0          0   IO-APIC-fasteoi   ata_piix, ata_piix
 21:          0         66          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 23:          0          0         35          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb2
 40:     208677          0          0          0          0          0          0          0  HPET_MSI-edge      hpet2
 41:          0       4501          0          0          0          0          0          0  HPET_MSI-edge      hpet3
 42:          0          0       2883          0          0          0          0          0  HPET_MSI-edge      hpet4
 43:          0          0          0       1224          0          0          0          0  HPET_MSI-edge      hpet5
 44:          0          0          0          0       1029          0          0          0  HPET_MSI-edge      hpet6
 45:          0          0          0          0          0          0          0          0   PCI-MSI-edge      aerdrv, PCIe PME
 46:          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME
 47:          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME
 48:          0          0          0          0          0          0          0          0   PCI-MSI-edge      PCIe PME
 49:          0          0          0          0          0       8570          0          0   PCI-MSI-edge      eth0-rx-0
 50:          0          0          0          0          0          0       1684          0   PCI-MSI-edge      eth0-tx-0
 51:          0          0          0          0          0          0          0          2   PCI-MSI-edge      eth0
NMI:          8          2          2          2          1          2          1         49   Non-maskable interrupts
LOC:         36         31         29         26         21       7611        886       1390   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          8          2          2          2          1          2          1         49   Performance monitoring interrupts
IWI:          0          0          0          1          1          0          1          0   IRQ work interrupts
RTR:          7          0          0          0          0          0          0          0   APIC ICR read retries
RES:        473       1027       1530        739       1532       3567       1529       1811   Rescheduling interrupts
CAL:        846       1012       1122       1047        984       1008       1064       1145   Function call interrupts
TLB:          2          7          5          3         12         15         10          6   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          4          4          4          4          4          4          4          4   Machine check polls
THR:          0          0          0          0          0          0          0          0   Hypervisor callback interrupts
ERR:          0
MIS:          0

Observe that the timer interrupt is firing mostly on CPU3.

Is it possible to move the timer interrupt to CPU0?

Tags: linux, linux-kernel, x86-64, interrupt

asked Aug 2, 2017 at 22:50 by merlin2011

Why do you want to do this? – tangrs (commented Aug 3, 2017 at 1:39)
To reduce interference on core 3. – merlin2011 (commented Aug 3, 2017 at 1:57)
This sounds like an X-Y problem. What are you really trying to achieve? – tangrs (commented Aug 3, 2017 at 4:27)
I'm trying to fully allocate a core to a latency sensitive application and minimise other activities. – merlin2011 (commented Aug 3, 2017 at 5:05)

2 Answers

The name of the concept is IRQ SMP affinity.

It’s possible to set the smp_affinity of an IRQ by writing the affinity mask to /proc/irq//smp_affinity or the affinity list to /proc/irq//smp_affinity_list. The affinity mask is a bit field in which each bit represents a core; the IRQ may be serviced on the cores whose bits are set.
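As a minimal sketch of how such a mask is built, the helper below turns a list of CPU numbers into the hexadecimal value that smp_affinity expects. The name cpus_to_mask is hypothetical, and the sketch assumes CPU numbers below 32, since larger systems write the mask as comma-separated 32-bit groups.

```shell
#!/bin/sh
# Build a hexadecimal affinity mask from a list of CPU numbers.
# cpus_to_mask is a hypothetical helper name, not a standard tool.
# Assumption: every CPU number is below 32.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        # Set bit number $cpu in the mask.
        mask=$(( mask | (1 << cpu) ))
    done
    # The mask is written in hexadecimal, without a 0x prefix.
    printf '%x\n' "$mask"
}

cpus_to_mask 0        # prints 1
cpus_to_mask 0 3      # prints 9
cpus_to_mask 4 5 6 7  # prints f0
```

A mask of 1 therefore allows CPU0 only, while 9 (binary 1001) allows CPU0 and CPU3.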

The command

echo 1 > /proc/irq/0/smp_affinity

executed as root should pin IRQ0 to CPU0. The "should" is deliberate, because setting the affinity of an IRQ is subject to several prerequisites: the interrupt controller must support a redirection table (like the IO-APIC), the affinity mask must contain at least one active CPU, the IRQ affinity must not be managed by the kernel, and the feature must be enabled.
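Because the write can be rejected, it may help to check the result rather than assume it worked. The following is only a sketch: set_irq_affinity is a hypothetical helper, and its optional third argument is an assumption added so the function can be tried against a scratch directory instead of /proc/irq.

```shell
#!/bin/sh
# Try to set the affinity mask of an IRQ and report whether the write succeeded.
# set_irq_affinity is a hypothetical helper; the third argument (base directory)
# exists only so the function can be exercised outside /proc/irq.
set_irq_affinity() {
    irq=$1
    mask=$2
    base=${3:-/proc/irq}
    file="$base/$irq/smp_affinity"

    if [ ! -w "$file" ]; then
        echo "cannot write $file (not root, or no such IRQ)" >&2
        return 1
    fi
    # The write itself fails (for example with EIO) when the kernel rejects it.
    if ! echo "$mask" > "$file" 2>/dev/null; then
        echo "kernel rejected mask $mask for IRQ $irq" >&2
        return 1
    fi
    # Read the value back to see what is actually stored.
    echo "IRQ $irq affinity is now $(cat "$file")"
}

# Example, as root: set_irq_affinity 0 1
```

On a real system the value read back may be zero-padded, so it is printed rather than compared with the requested mask.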

In my virtualised Debian 8 system I was unable to set the affinity of IRQ0; the write failed with an EIO error, and I was unable to track down the exact reason. If you are willing to dive into the Linux source code, you can start from write_irq_affinity in proc.c (kernel/irq/proc.c).

answered Aug 7, 2017 at 10:31 by Margaret Bloom

Use isolcpus. It may not reduce your timer interrupts to 0, but on our servers they are greatly reduced.

If you use isolcpus, the kernel will not assign interrupts to the isolated CPUs as it otherwise might. For example, we have systems with dual 12-core CPUs. We noticed NVMe interrupts on our CPU1 (the second CPU), even with the CPUs isolated via tuned and its cpu-partitioning scheme. The NVMe drives on our Dell systems are connected to the PCIe lanes on CPU1, hence the interrupts on those cores.

As per my ticket with Red Hat (and Margaret Bloom, who wrote an excellent answer here), if you don’t want interrupts to be affined to your CPUs, you need to use isolcpus on the kernel boot line. And lo and behold, I tried it and the NVMe interrupts went to 0 on all isolated CPU cores.
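After rebooting with isolcpus on the boot line, it is worth confirming that the parameter actually reached the kernel. The sketch below extracts its value from a command-line string; isolcpus_from_cmdline is a hypothetical helper name, and the sample command line is made up for illustration.

```shell
#!/bin/sh
# Print the value of the isolcpus= parameter found in a kernel command line,
# or nothing if the parameter is absent.
# isolcpus_from_cmdline is a hypothetical helper name.
isolcpus_from_cmdline() {
    # Split the command line on whitespace and look at each word.
    for word in $1; do
        case $word in
            isolcpus=*) printf '%s\n' "${word#isolcpus=}" ;;
        esac
    done
}

# Made-up sample input; on a live system pass "$(cat /proc/cmdline)" instead.
isolcpus_from_cmdline "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro isolcpus=3 quiet"   # prints 3
```

An empty result means the kernel was booted without the parameter, for example because the bootloader configuration was not regenerated.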

I have not attempted to isolate ALL cores on CPU1; I don’t know if they’ll simply be affined to CPU0 or what.

And, in a short summary: any interrupt in /proc/interrupts with “MSI” in the name is managed by the kernel.
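Applying that rule of thumb by hand is tedious on a long table, so here is a rough sketch that lists the numbered IRQs whose line mentions "MSI". The helper name list_msi_irqs is hypothetical, and the match is a plain text search, so a device name that happens to contain "MSI" would also be reported.

```shell
#!/bin/sh
# List the IRQ numbers whose /proc/interrupts line mentions "MSI".
# list_msi_irqs is a hypothetical helper; it reads the table from stdin.
list_msi_irqs() {
    # Keep only lines that start with a numeric IRQ label such as "40:",
    # then strip the colon and print the number.
    awk '$1 ~ /^[0-9]+:$/ && /MSI/ { sub(":", "", $1); print $1 }'
}

# On a live system: list_msi_irqs < /proc/interrupts
```

For the table in the question, this would report IRQs 40 through 51.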

Updated: