In-class: PCI Enumeration
https://www.khoury.northeastern.edu/~pjd/cs7680/homework/pci-enumeration.html
This assignment explores the 32-bit PCI bus, creating a utility in xv6 to list PCI devices and their parameters.
Submit your solutions before the beginning of the next lecture to the CS7680 submission web site.
Please collaborate with others on these exercises.
PCI bus overview
The PCI bus (see wiki.osdev.org/PCI) provides a standard architecture for accessing I/O devices on most modern computers, supporting discovery and configuration of attached devices as well as both memory-mapped and I/O-mapped (e.g. via inb, outb instructions) access. In particular, PCI provides a mechanism whereby each card or device specifies both its identity and the resources it needs (memory-mapped or I/O space, interrupts), so that the BIOS and/or OS can discover all devices in the system and allocate interrupts, physical memory space, and I/O space to each device.
PCI has three levels of structure - bus, device, and function. Buses are structured as a tree, with bus 0 as the root and additional buses connected by PCI bridges. An individual PCI card (or chip on the motherboard) that attaches to a PCI bus is a device; note that PCI bridges are themselves devices. Simple devices implement a single function (function 0); more complex devices (e.g. a multi-port ethernet card) may contain multiple functions, which operate (for the most part) like independent PCI devices.
The PCI bus itself is implemented by the motherboard chipset (actually now it’s typically integrated onto the CPU itself), which handles tasks such as bus arbitration - ensuring that only one device can transmit on the bus at any time. In parallel PCI the data lines are shared between all devices on a bus, and arbitration is handled by separate bus request lines from each slot to the PCI controller, and corresponding bus grant lines from the controller back to each slot.
The PCI controller is also responsible for providing software access to the configuration space for each device, a 256-byte set of registers which identifies the device and allows configuration of device properties. This is performed via a separate signal to each card or device, so the CPU can detect and configure all devices on the bus without any prior knowledge of e.g. what address they are mapped at.
lspci and examples
The lspci command in Linux lists the devices connected to the PCI bus; here is an example from one of the machines in the first-floor computer lab:
    pjd@prosoccer:~$ lspci
    00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
    00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
    00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
    00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04)
    00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
    00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
    00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
    00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4)
    00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b4)
    00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
    00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
    00:1f.0 ISA bridge: Intel Corporation Q67 Express Chipset Family LPC Controller (rev 04)
    00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 04)
    00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
    01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series]
    03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 03)

The configuration space stores 16-bit vendor and device IDs, rather than these descriptive names; lspci performs a table lookup to report these names. The actual numeric values on the same machine are as follows.
(Note that Intel is vendor ID 0x8086.)

    pjd@prosoccer:~$ lspci -n
    00:00.0 0600: 8086:0100 (rev 09)
    00:01.0 0604: 8086:0101 (rev 09)
    00:16.0 0780: 8086:1c3a (rev 04)
    00:16.3 0700: 8086:1c3d (rev 04)
    00:19.0 0200: 8086:1502 (rev 04)
    00:1a.0 0c03: 8086:1c2d (rev 04)
    00:1b.0 0403: 8086:1c20 (rev 04)
    00:1c.0 0604: 8086:1c10 (rev b4)
    00:1c.2 0604: 8086:1c14 (rev b4)
    00:1d.0 0c03: 8086:1c26 (rev 04)
    00:1e.0 0604: 8086:244e (rev a4)
    00:1f.0 0601: 8086:1c4e (rev 04)
    00:1f.2 0106: 8086:1c02 (rev 04)
    00:1f.3 0c05: 8086:1c22 (rev 04)
    01:00.0 0300: 1002:68f9
    03:00.0 0c03: 1033:0194 (rev 03)

[note - tables and some descriptions adapted from http://wiki.osdev.org/PCI]

Configuration space access
BIOS and OS code must be able to access these configuration registers in order to discover and configure PCI devices. On the PC platform this is done via two 32-bit I/O ports, CONFIG_ADDRESS (0xCF8) and CONFIG_DATA (0xCFC). An address specifying bus, device, function, and a 4-byte-aligned byte offset is written to CONFIG_ADDRESS:

    31          30 - 24   23 - 16     15 - 11        10 - 8           7 - 0
    Enable Bit  Reserved  Bus Number  Device Number  Function Number  Offset

and then a 32-bit value is read from or written to CONFIG_DATA. If no device is present, the result will be 0xFFFFFFFF. (Note that 0xFFFF is an invalid vendor code, so the absence of a device can be detected by reading the first 32-bit word of the corresponding configuration space.)

Reading configuration space on XV6
For this assignment you will add a user-space utility to XV6 to enumerate the PCI configuration space.

IO port access
First you’ll need access to I/O instructions within a user process. The Intel architecture provides two mechanisms which allow access to in and out instructions while in user space:
TSS bitmap - a bitmap can be specified at the end of the TSS which identifies which I/O ports may be accessed from user mode (ring 3).
EFLAGS - bits 12 and 13 of the EFLAGS register hold the I/O privilege level (IOPL), the privilege level necessary to access any I/O port; by default these bits are zero, so only ring 0 may use the I/O instructions.

The TSS bitmap hasn’t been implemented in XV6, and we might not want to extend it far enough to cover ports 0xCF8 and 0xCFC anyway (that would take over 400 bytes per process). Instead you’ll implement a new system call, iopl, which takes a single argument in the range 0-3 and sets the I/O privilege level bits in EFLAGS to that value; your readpci utility can then call iopl(3) and will be able to access I/O ports directly.
Hint - where does EFLAGS come from when you return to user space? Do you actually have to modify the current EFLAGS register in the iopl system call?
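The bit manipulation behind iopl can be sketched on its own, independent of where the kernel keeps the EFLAGS value it applies it to (that part is the subject of the hint above). This is a minimal sketch; the names IOPL_SHIFT, IOPL_MASK, eflags_set_iopl and eflags_get_iopl are hypothetical and are not part of XV6.

```c
#include <stdint.h>

/* Hypothetical names, not taken from XV6. */
#define IOPL_SHIFT 12
#define IOPL_MASK  (3u << IOPL_SHIFT)   /* EFLAGS bits 12 and 13 */

/* Return a copy of eflags with the IOPL field replaced by level (0-3). */
uint32_t eflags_set_iopl(uint32_t eflags, unsigned level)
{
    return (eflags & ~IOPL_MASK) | ((level & 3u) << IOPL_SHIFT);
}

/* Extract the current IOPL field from an EFLAGS value. */
unsigned eflags_get_iopl(uint32_t eflags)
{
    return (eflags & IOPL_MASK) >> IOPL_SHIFT;
}
```

For example, applying level 3 to an EFLAGS value of 0x202 yields 0x3202, and applying level 0 to that result gives 0x202 back.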
Create a user program named readpci (add it to UPROGS in your makefile) and start out with the following code to test iopl:
    #include "types.h"
    #include "user.h"
    #include "x86.h"

    int main(int argc, char **argv)
    {
        iopl(3);                        // allow ring 3 to access IO ports
        unsigned char c = inb(0x3bc);
        printf(1, "%x\n", c);
        exit();
    }

If iopl was implemented correctly it should run, printing FF, while it will die with a fault otherwise.
PCI register access
To access CONFIG_ADDRESS and CONFIG_DATA you’ll need 32-bit versions of the in/out instructions, which aren’t provided in XV6:
    unsigned int inl(int port)
    {
        unsigned int data;
        __asm __volatile("inl %w1,%0" : "=a" (data) : "d" (port));
        return data;
    }

    void outl(int port, unsigned int data)
    {
        __asm __volatile("outl %0,%w1" : : "a" (data), "d" (port));
    }

Modify readpci to take 3 numeric arguments - bus, device, and function - and read and print the corresponding configuration space in hex. If you want the printout to look nice like this you’ll have to modify printf.c to support formats like "%02x" :-)
    xv6…
    cpu0: starting
    init: starting sh
    $ readpci 0 1 0
    00: 86 80 00 70 03 00 00 02 00 00 01 06 00 00 80 00
    10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    20: 00 00 00 00 00 00 00 00 00 00 00 00 F4 1A 00 11
    30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    $

Configuration space structure
All PCI configuration spaces share a common header:

    register  bits 31-24   bits 23-16    bits 15-8       bits 7-0
    00        Device ID                  Vendor ID
    04        Status                     Command
    08        Class code   Subclass      Prog IF         Revision ID
    0C        BIST         Header type   Latency Timer   Cache Line Size

The fields are as follows, ignoring ones that we don’t need:

Device ID: Identifies the particular device. Valid IDs are allocated by the vendor.
Vendor ID: Identifies the manufacturer of the device. Valid IDs are allocated by PCI-SIG to ensure uniqueness; 0xFFFF is an invalid value that is returned on read accesses to configuration space registers of non-existent devices.
Status, Command: status flags and control bits, e.g. for handling various error cases.
Class Code: A read-only register that specifies the type of function the device performs.
Subclass: A read-only register that specifies the specific function the device performs.
Prog IF: A read-only register that specifies a register-level programming interface the device has, if it has any at all.
Revision ID: So a vendor can fix something without allocating a new device ID.
BIST: built-in self-test stuff.
Header Type: bit 7 (0x80) indicates whether it is a multi-function device, while interesting values of the remaining bits are: 00 = general device, 01 = PCI-to-PCI bridge.
Latency Timer, Cache Line Size: very hardware-specific stuff. We’ll let the BIOS set it properly and ignore it.

Note that all 16- and 32-bit values are in little-endian order (not surprisingly, as PCI was originally developed by Intel). This means that you can access them directly on a PC without needing to translate byte order.
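As a rough illustration of how the CONFIG_ADDRESS layout and the common header fit together, the sketch below builds the address word for a given bus, device, function and offset, and then pulls the header fields out of the three 32-bit values read back from registers 00, 08 and 0C. The function and struct names (pci_config_address, pci_common_header, pci_decode_header) are hypothetical; this is not the contents of pci-cfg.h, and the actual port I/O is deliberately left out.

```c
#include <stdint.h>

/* Value to write to CONFIG_ADDRESS (0xCF8) before touching CONFIG_DATA. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t offset)
{
    return 0x80000000u                        /* bit 31: enable bit          */
         | ((uint32_t)bus << 16)              /* bits 23-16: bus number      */
         | ((uint32_t)(dev & 0x1F) << 11)     /* bits 15-11: device number   */
         | ((uint32_t)(func & 0x07) << 8)     /* bits 10-8: function number  */
         | (uint32_t)(offset & 0xFC);         /* bits 7-0: 4-byte-aligned    */
}

/* Hypothetical container for the header fields used in this assignment. */
struct pci_common_header {
    uint16_t vendor_id, device_id;
    uint8_t  class_code, subclass, prog_if, revision;
    uint8_t  header_type;      /* low 7 bits: 00 = device, 01 = bridge */
    uint8_t  multi_function;   /* 1 if bit 7 (0x80) of header type set */
};

/* Decode the dwords read from registers 0x00, 0x08 and 0x0C. */
struct pci_common_header pci_decode_header(uint32_t reg00, uint32_t reg08, uint32_t reg0c)
{
    struct pci_common_header h;
    h.vendor_id      = reg00 & 0xFFFF;
    h.device_id      = reg00 >> 16;
    h.revision       = reg08 & 0xFF;
    h.prog_if        = (reg08 >> 8) & 0xFF;
    h.subclass       = (reg08 >> 16) & 0xFF;
    h.class_code     = reg08 >> 24;
    h.header_type    = (reg0c >> 16) & 0x7F;
    h.multi_function = ((reg0c >> 16) & 0x80) != 0;
    return h;
}
```

Applied to the readpci 0 1 0 dump above, register 00 reads as 0x70008086 (vendor 0x8086, device 0x7000), register 08 as 0x06010000 (class 06, subclass 01), and register 0C as 0x00800000 (header type 00 with the multi-function bit set).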
For a general-purpose device the remainder of the configuration space is structured as follows:
    register  bits 31-24    bits 23-16   bits 15-8       bits 7-0
    10        Base address #0 (BAR0)
    14        Base address #1 (BAR1)
    18        Base address #2 (BAR2)
    1C        Base address #3 (BAR3)
    20        Base address #4 (BAR4)
    24        Base address #5 (BAR5)
    28        Cardbus CIS Pointer
    2C        Subsystem ID               Subsystem Vendor ID
    30        Expansion ROM base address
    34        Reserved                                   Capabilities Pointer
    38        Reserved
    3C        Max latency   Min Grant    Interrupt PIN   Interrupt Line

The fields (other than base addresses) are:

CardBus CIS Pointer: obsolete
Interrupt Line: PIC IRQ 0-15, or FF (no IRQ)
Interrupt Pin: PCI-specific, or for use with IOAPIC.
Max Latency: hardware-specific
Min Grant: hardware-specific
Capabilities Pointer: never mind…

We’ll ignore all of these values for now, and talk about them during class.
PCI bridges
In a PCI bridge this portion of the configuration data has the following structure:
    register  bits 31-24               bits 23-16              bits 15-8              bits 7-0
    10        Base address #0 (BAR0)
    14        Base address #1 (BAR1)
    18        Secondary Latency Timer  Subordinate Bus Number  Secondary Bus Number   Primary Bus Number
    1C        Secondary Status                                 I/O Limit              I/O Base
    20        Memory Limit                                     Memory Base
    24        Prefetchable Memory Limit                        Prefetchable Memory Base
    28        Prefetchable Base Upper 32 Bits
    2C        Prefetchable Limit Upper 32 Bits
    30        I/O Limit Upper 16 Bits                          I/O Base Upper 16 Bits
    34        Reserved                                                                Capability Pointer
    38        Expansion ROM base address
    3C        Bridge Control                                   Interrupt PIN          Interrupt Line

The main field of interest is the Secondary Bus Number, which identifies the “child” bus in the tree. For example, since the root PCI bus is always 0, if there is a bus 1 there must be a PCI bridge on bus 0 with secondary bus number 1.
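Since all three bus numbers live in the single 32-bit register at offset 18, one read is enough to recover them. The following is a minimal sketch of that decoding; the struct and function names are hypothetical and are not taken from pci-cfg.h.

```c
#include <stdint.h>

/* Hypothetical container for the four byte-wide fields of bridge register 0x18. */
struct pci_bridge_buses {
    uint8_t primary;            /* bits 7-0:   bus the bridge sits on        */
    uint8_t secondary;          /* bits 15-8:  the "child" bus behind it     */
    uint8_t subordinate;        /* bits 23-16: subordinate bus number        */
    uint8_t secondary_latency;  /* bits 31-24: secondary latency timer       */
};

/* Split the dword read from offset 0x18 of a PCI-to-PCI bridge. */
struct pci_bridge_buses pci_decode_bridge_buses(uint32_t reg18)
{
    struct pci_bridge_buses b;
    b.primary           = reg18 & 0xFF;
    b.secondary         = (reg18 >> 8) & 0xFF;
    b.subordinate       = (reg18 >> 16) & 0xFF;
    b.secondary_latency = reg18 >> 24;
    return b;
}
```

For instance, a register value of 0x20020100 decodes to primary bus 0, secondary bus 1, subordinate bus 2, and a secondary latency timer of 0x20.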
You can recursively enumerate devices on the PCI bus by scanning bus 0, and whenever you detect a PCI bridge recursively scanning its secondary bus. In doing this keep in mind:
There can be up to 32 devices on a bus. Each device number corresponds to a slot, so some may be missing in the middle (e.g. empty PCI slots).
There can be up to 8 functions in a multi-function device; however you can stop when you get to the first missing one.
The first PCI bridge is going to be the host bridge, with a configuration record indicating that it bridges from bus 0 to bus 0. Don’t enumerate it recursively.

You may find the file pci-cfg.h useful, as it contains C structure definitions for PCI device and bridge configurations.
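The rules above can be sketched as a recursive scan that is independent of how configuration registers are actually read. In this sketch the read is passed in as a callback (the hypothetical type cfg_read_fn), and fake_read stands in for real hardware with a small invented topology so the logic can be exercised off-target; in readpci the callback would instead go through CONFIG_ADDRESS and CONFIG_DATA. Probing only function 0 of a device whose multi-function bit is clear is an assumption of this sketch rather than something stated in the list.

```c
#include <stdint.h>

/* Hypothetical callback: returns the 32-bit config register at offset off. */
typedef uint32_t (*cfg_read_fn)(int bus, int dev, int func, int off);

/* Count the functions found on bus and, recursively, on buses behind bridges. */
int scan_bus(int bus, cfg_read_fn rd)
{
    int count = 0;
    for (int dev = 0; dev < 32; dev++) {          /* slots may be empty */
        for (int func = 0; func < 8; func++) {
            uint32_t id = rd(bus, dev, func, 0x00);
            if ((id & 0xFFFF) == 0xFFFF)
                break;                            /* first missing function */
            count++;
            uint8_t hdr = (rd(bus, dev, func, 0x0C) >> 16) & 0xFF;
            if ((hdr & 0x7F) == 0x01) {           /* PCI-to-PCI bridge */
                int secondary = (rd(bus, dev, func, 0x18) >> 8) & 0xFF;
                if (secondary != bus)             /* skip a bus 0 -> bus 0 bridge */
                    count += scan_bus(secondary, rd);
            }
            if (func == 0 && !(hdr & 0x80))
                break;                            /* single-function device */
        }
    }
    return count;
}

/* Invented topology for testing: 0:0.0 device, 0:1.0 and 0:1.1 (multi-function),
 * 0:3.0 bridge to bus 1, 1:0.0 device. Everything else is absent. */
uint32_t fake_read(int bus, int dev, int func, int off)
{
    int hdr = -1, secondary = 0;
    if (bus == 0 && dev == 0 && func == 0)      hdr = 0x00;
    else if (bus == 0 && dev == 1 && func <= 1) hdr = 0x80;
    else if (bus == 0 && dev == 3 && func == 0) { hdr = 0x01; secondary = 1; }
    else if (bus == 1 && dev == 0 && func == 0) hdr = 0x00;
    if (hdr < 0)
        return 0xFFFFFFFFu;                       /* no device present */
    switch (off) {
    case 0x00: return 0x12348086u;                /* arbitrary device/vendor ID */
    case 0x0C: return (uint32_t)hdr << 16;
    case 0x18: return (uint32_t)secondary << 8;
    default:   return 0;
    }
}
```

With this fake topology, scan_bus(0, fake_read) visits five functions in total: four on bus 0 and one on bus 1 behind the bridge. A real readpci -scan would print each function as it is found instead of just counting.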
Modify readpci to recursively enumerate PCI buses when invoked as, e.g., readpci -scan rather than readpci 0 1 0. To test recursive enumeration you will need a newer version of QEMU - you can download the latest version for Ubuntu 12.04.3 (i.e. the virtual machine image most of you are using) as follows:
    mkdir -p ~/bin
    cd ~/bin
    wget http://www.ccs.neu.edu/~pjd/cs7680/qemu-system-i386
    chmod +x qemu-system-i386

In your xv6 directory you should then be able to run QEMU as follows:
    ~/bin/qemu-system-i386 -nographic -hdb fs.img xv6.img -smp 1 -m 512 -bios /usr/share/qemu/bios.bin \
        -device i82801b11-bridge -device rtl8139,bus=pci.1 \
        -device i82801b11-bridge -device ne2k_pci,bus=pci.2
This is the equivalent of running make qemu-nox - if you want the graphical window, eliminate the -nographic argument. It should print several warning messages and then start up, and if you enumerate the PCI bus correctly you’ll see devices on buses 1 and 2.
Submit: Your working readpci.c file.
Optional
With the following code modification you can compile readpci.c for Linux with the command gcc -DHOST_VERSION readpci.c -o readpci; you will also need to change all your printf calls from printf(1, "…"); to cprintf("…");. You should then be able to run your readpci command on any Linux system where you have root access (e.g. your class virtual machine image).
    #ifdef HOST_VERSION
    #include
    #include
    #include
    typedef unsigned char uchar;
    #define cprintf printf
    #else
    #include "types.h"
    #include "user.h"
    #include "x86.h"
    #define cprintf(...) printf(1, __VA_ARGS__)
    #endif

Last modified: Mon Mar 17 23:43:06 EDT 2014