PCI Express In Depth for Windows Vista and Beyond

Allen Marshall
Lead Program Manager
Vinod Mamtani
Software Development Engineer
Core Platform Architecture
Microsoft Corporation
Agenda
PCI Express support in Windows Vista
PCI Express firmware support
Enabling native PCI Express support
Enabling flexible resource assignment
Optimal PCI device resources
Active state power management
MSI support
PCI Express Features
Supported in Windows Vista
Memory mapped CFG space access, extended
CFG space access, and segment support
Active State Power Management (ASPM)
Native Power Management Events (PME)
Message-Signaled Interrupts (MSI/MSI-X)
Native Hot Plug
Advanced Error Reporting (AER)
Multilevel resource rebalance
PCI Express Features
Supported in Windows Vista
Miscellaneous base features, including
Capability version field checking
PCI Express hardware ID and new compatible device
ID identification and matching
Updated device class code parsing
Phantom functions
Device serial numbers
PCI Express tree hierarchy checking
Setting of Max Payload Size and Max Request Size fields
in the Device Control register to match root port settings
Transactions pending support
Clock power management (CLK_REQ)
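As an illustration of the Max Payload Size item above, here is a minimal sketch of how the two size fields in the Device Control register could be decoded and copied from a root port value. The field positions (bits 7:5 and bits 14:12) and the 128-byte base encoding follow the PCI Express Base Specification as I understand it; the base specification names the second field Max_Read_Request_Size. The function names are hypothetical and this is not Windows Vista's actual code.

```python
# Field positions in the PCI Express Device Control register (assumed from the base spec).
MPS_SHIFT = 5      # Max_Payload_Size, bits 7:5
MRRS_SHIFT = 12    # Max_Read_Request_Size, bits 14:12
FIELD_MASK = 0x7   # both fields are 3 bits wide


def decode_size(encoded: int) -> int:
    """Translate a 3-bit size encoding into bytes (0 -> 128, 5 -> 4096)."""
    if not 0 <= encoded <= 5:
        raise ValueError("encodings 6 and 7 are reserved")
    return 128 << encoded


def match_root_port(dev_ctl: int, root_ctl: int) -> int:
    """Return dev_ctl with both size fields copied from root_ctl (hypothetical helper)."""
    mask = (FIELD_MASK << MPS_SHIFT) | (FIELD_MASK << MRRS_SHIFT)
    return (dev_ctl & ~mask & 0xFFFF) | (root_ctl & mask)
```

All other Device Control bits are left untouched, so the helper only models the "match root port settings" step.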
PCI Express Features
Not supported
Virtual channel
For other than channel 0
Isochronous transfers
Slot power budgeting
Vista will not change the BIOS configuration
for a device
Vista will save and restore the configuration
across sleep transitions
PCI Express Features
Non-snoop I/O
Windows Vista will disable non-snoop I/O
by default
Windows Vista DMA model assumes cache
coherent DMA
Except for devices on VGA path
Windows Vista clears the Enable Non Snoop
bit in the Device Control register
Device drivers may enable this bit
Update this register during Start handling
Windows Vista will thereafter preserve this value
across power state transitions
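The non-snoop handling above comes down to one bit in the Device Control register. The sketch below assumes Enable No Snoop is bit 11, which is my reading of the PCI Express Base Specification; the helper names are hypothetical. It only models the bit manipulation, not the driver interfaces used to reach configuration space.

```python
ENABLE_NO_SNOOP = 1 << 11  # assumed bit position in the Device Control register


def clear_no_snoop(dev_ctl: int) -> int:
    """What the OS does by default: clear the Enable No Snoop bit."""
    return dev_ctl & ~ENABLE_NO_SNOOP & 0xFFFF


def set_no_snoop(dev_ctl: int) -> int:
    """What a driver may do during Start handling: set the bit again."""
    return (dev_ctl | ENABLE_NO_SNOOP) & 0xFFFF
```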
PCI Driver Updates
Architectural changes in PCI
Multilevel resource rebalance
Reduced I/O space support
Optimal default PCI bridge resource
window sizes
Subtractive decode PCI bridges
Support for 64-bit resources
Greater integration of PCMCIA support in PCI
Re-structure of legacy R2 card interrupt detection
PCI Express Firmware
Enabling native PCI Express support
By default, Windows Vista starts in PCI
compatibility mode
No PCI Express features assumed or enabled
_OSC is used by firmware and the operating
system to
Report OS capabilities to the platform
Report platform capabilities to the OS
Transfer control of PCI Express features from
firmware to the operating system
PCI Express Firmware
Reporting Windows Vista capabilities
Windows Vista reports support for the
following capabilities to the firmware
via _OSC method
Extended PCI config space
Message-signaled interrupts
Clock power management
PCI segment groups
PCI Express Firmware
Negotiating control of native features
Dependencies exist between
PCI Express features
Windows Vista requires the platform to grant
control to the OS over all of
AER
Native PME
Hot plug
Express capability
Otherwise, Windows Vista will not assume
control of any of these features
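The all-or-nothing rule above can be sketched as a mask comparison on the control bits that firmware grants through _OSC. The bit assignments below are my recollection of the PCI Firmware Specification's control field and should be treated as assumptions; the function name is hypothetical.

```python
from enum import IntFlag


class OscControl(IntFlag):
    """Assumed _OSC control-field bits for a PCI Express host bridge."""
    NATIVE_HOT_PLUG = 1 << 0
    SHPC_HOT_PLUG = 1 << 1
    NATIVE_PME = 1 << 2
    AER = 1 << 3
    EXPRESS_CAPABILITY = 1 << 4


# The four features the slide says must all be granted together.
REQUIRED = int(
    OscControl.NATIVE_HOT_PLUG
    | OscControl.NATIVE_PME
    | OscControl.AER
    | OscControl.EXPRESS_CAPABILITY
)


def os_assumes_control(granted: int) -> bool:
    """True only if every required feature bit was granted by firmware."""
    return (granted & REQUIRED) == REQUIRED
```

A grant that omits any one of the four bits makes the function return False, which models the OS declining control of all of them.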
PCI Express Firmware
Negotiating control of native features
Control negotiated via _OSC method
When granting control of native
features, firmware should grant
control of unimplemented features
This signals Windows Vista that it is safe to
assume control of the implemented features
PCI Device Resources
Address space constraints
Physical address space below 4GB
continues to face increased demand
MCFG introduces large memory hole
Larger amounts of system RAM
Greater numbers of PCI or
PCI Express devices
Increasing device resource requirements
Devices limited to 32-bit DMA support
PCI Device Resources
Address space constraints
This problem is mitigated by
PCI Express 64-bit prefetchable BARs
Required by WHQL logo program
Allows Windows to assign resources above 4 GB
Emergence of mainstream 64-bit platforms
PCI Device Resources
PCI Resource Arbitration
Windows configures and starts a PCI bridge
before scanning the secondary side of the bridge
for PCI devices
All devices on the bridge are arbitrated
with resources that fall inside the bridge’s
resource window
Subtractive bridges that also do positive decoding
have resources arbitrated from the bridge window
Legacy and non-PCI devices will be arbitrated with
resources outside bridge window
PCI Device Resources
Bridge window configuration
Windows XP and Windows Server 2003 do not reconfigure
the bridge windows based on the requirements of a device
behind the bridge
May lead to a PCI device not starting due to lack of resources
Even though enough device resources are available to the system
In some cases, boot configuration of PCI devices by firmware
works best for Windows versions prior to Windows Vista
Mobile PCs that don’t expose PCI expansion slots
Server PCs that support device hot plug
Large server configurations with extensive I/O
Platform has better visibility into specific resource requirements
than the OS can ascertain during boot
PCI Device Resources
Bridge window configuration
Windows Vista supports multi-level resource rebalance
Allows Vista to dynamically reconfigure resource assignments
across multiple hierarchical levels in a device tree
Windows Vista default bridge resource window sizes are optimized
for deep PCI Express hierarchies
I/O windows default to 4 K
Memory windows default to 1 MB
If a PCI device’s resource requirement cannot be arbitrated inside the
current bridge resource window, Vista reconfigures the PCI bridge with
a new set of resources to accommodate the PCI device requirements
Avoiding boot configuration of all PCI devices works best on Vista
Platform must boot configure required boot devices
If devices requiring boot configuration are behind a bridge,
you must boot configure all devices behind the bridge
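To make the window sizes above concrete, here is a rough sketch of how a bridge window large enough for a set of device BARs could be computed. It assumes BAR sizes are powers of two that must be naturally aligned, that the window base is aligned to the largest BAR, and that an empty bridge still gets one default-sized window. It is an illustration only, not the arbitration algorithm Windows Vista uses.

```python
IO_GRANULARITY = 4 * 1024        # I/O windows default to 4 K
MEM_GRANULARITY = 1024 * 1024    # memory windows default to 1 MB


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def window_size(bar_sizes, granularity: int) -> int:
    """Window size for the given BARs, packed largest first."""
    offset = 0
    for size in sorted(bar_sizes, reverse=True):
        offset = align_up(offset, size) + size   # natural alignment per BAR
    return max(align_up(offset, granularity), granularity)
```

If a new device does not fit in the current window, recomputing the size this way gives the larger window the bridge would need after a rebalance.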
PCI Device Resources
Properly define MMCFG space
Windows Vista parses MCFG table for
memory mapped config access
This memory is marked off-limits for device
resource assignment
Bus number range must match bus range
for PCI root bus
Earlier versions of Windows should have an
ACPI motherboard resource which claims the
exact same MM config region
Place motherboard resource at correct location
in namespace
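The size of the memory-mapped configuration region follows from how addresses are formed: each function gets 4 KB, so each bus needs 1 MB. The sketch below uses the standard enhanced configuration access layout and assumes the MCFG base address corresponds to bus 0 of the segment. The function names are hypothetical.

```python
def mmcfg_address(base: int, bus: int, device: int, function: int, offset: int) -> int:
    """Physical address of a configuration register in the memory-mapped region."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8 and 0 <= offset < 4096):
        raise ValueError("bus/device/function/offset out of range")
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)


def mmcfg_region(base: int, start_bus: int, end_bus: int):
    """(first, last) byte that must be kept off-limits for device resources."""
    return base + (start_bus << 20), base + ((end_bus + 1) << 20) - 1
```

For a full 0 to 255 bus range the region spans 256 MB, which is why it must be described accurately and kept clear of other devices.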
PCI Device Resources
Properly define MMCFG space
_SEG method must be defined for PCI
Root Bus that matches segment in MCFG
Segment number encoded in bus number
range for module devices
Avoid common pitfalls
Define memory region accurately, don’t overlap
other devices (like local APIC)!
Ensure regions are the same for Windows Vista
and earlier versions of Windows
Implement SAL revision >= 3.2 for Itanium
platforms for extended config access
PCI Device Resources
Device resources above 4 GB
Devices with boot configurations above 4 GB are
handled differently across Windows versions
Windows Vista always respects boot configuration
of devices above 4 GB
If the processor and operating system version support
accessing addresses > 4 GB
Windows XP and Windows Server 2003 ignore
boot configurations above 4 GB
If resources cannot be allocated below 4 GB, a range
above 4 GB will be assigned
Regardless of the processor or Windows addressing
capability, which may leave the device inoperable
PCI Device Resources
Firmware resource allocation
The different behaviors of Windows
versions present conflicting requirements
to platform firmware
This necessitates a flexible approach
to firmware development
See whitepaper for details on how to
enable optimal resource assignment
for Windows Vista and earlier versions
of Windows
PCI Device Resources
Need ability to ignore boot config
A mechanism is needed for the platform to
indicate to the OS that boot configurations
can be ignored for a device hierarchy
Enables Windows Vista to ignore boot
configured device resources
Provides for greater resource allocation flexibility
Allows firmware to boot configure devices
for best compatibility with Windows XP and
Windows Server 2003
This allows backward compatibility and smooth
transition to future operating systems
PCI Device Resources
_DSM for Ignoring Boot Config
_DSM is an optional ACPI control method that
enables devices to provide device-specific
control functions
_DSM usage for PCI is defined in the PCI Firmware
Specification, Rev. 3.0
The _DSM method for PCI devices is optional on
Windows Vista and is not evaluated on Windows
XP and Windows Server 2003
Microsoft has proposed an ECR to the PCI
Firmware Specification to add an additional
function definition to ignore boot configuration
of PCI devices
PCI Device Resources
Platform usage of _DSM
Assign root bridge resources spanning
4 GB boundary
All devices in the path must support 64-bit
prefetchable BARs
Apply boot configurations to all devices for
compatibility with earlier versions of Windows
Implement _DSM allowing boot configuration
to be ignored
Windows Vista will place all resources
above 4 GB
Active State Power Management
Overview
ASPM is required for PCI Express devices
Serial links remain active to
maintain synchronization
ASPM offers significant power savings
≈ 1W to 3W, depending on device or
lane width
This is especially important on mobile PCs
Roadmap includes enabling ASPM on as many
devices as possible
Active State Power Management
ASPM in Windows Vista
Enabling ASPM in Windows Vista is based on
Hardware capabilities
L0s is required as per the PCI Express Base specification
L1 is required for ExpressCard
Exit time latencies
System power policy
System-level controls
Device-level ASPM controls
Active State Power Management
ASPM hardware capabilities
Hardware must be capable of ASPM
as reported in its Link Capabilities register
Windows Vista checks this register
for all PCI Express devices in the
hierarchy, including
Root Ports
Switch Ports
PCI Express-to-PCI or -PCI-X Bridges
Device Endpoints
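The capability check above can be illustrated by decoding the relevant Link Capabilities fields. The bit positions (ASPM Support in bits 11:10, L0s Exit Latency in bits 14:12, L1 Exit Latency in bits 17:15) and the latency encodings are my reading of the PCI Express Base Specification, and the function name is hypothetical. Latencies are reported as the upper bound of each encoded range, with None for the open-ended top encoding.

```python
# Upper bound in nanoseconds for each 3-bit exit latency encoding (assumed from the base spec).
L0S_MAX_NS = [64, 128, 256, 512, 1000, 2000, 4000, None]
L1_MAX_NS = [1000, 2000, 4000, 8000, 16000, 32000, 64000, None]


def decode_link_caps(link_caps: int) -> dict:
    """Extract ASPM support and exit latencies from a Link Capabilities value."""
    aspm = (link_caps >> 10) & 0x3
    return {
        "l0s_supported": bool(aspm & 0x1),
        "l1_supported": bool(aspm & 0x2),
        "l0s_exit_max_ns": L0S_MAX_NS[(link_caps >> 12) & 0x7],
        "l1_exit_max_ns": L1_MAX_NS[(link_caps >> 15) & 0x7],
    }
```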
Active State Power Management
ASPM exit time latencies
L0s or L1 is always enabled for a Switch or Root
Port when a device is present on the link
To enable ASPM for endpoints Windows Vista
must first calculate exit latencies
To ensure that the overall hierarchy latency
is within the link's requirements for an endpoint
Windows Vista calculates exit time latency
in accordance with the PCI Express
Base Specification
Active State Power Management
ASPM exit time latencies
Windows Vista first calculates latencies separately
for both L0s and L1
L0s is managed independently for both Root-facing
and Endpoint-facing Links
Windows Vista computes overall latency starting at
the Root Port, progressing to the Endpoint, and then
returning from the Endpoint back up to the Root Port
IHVs must reflect the L1 exit latency timing accurately
in the Link Capabilities register
So that Windows Vista can calculate exit latencies as precisely
as possible
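A simplified sketch of the L1 calculation follows. It assumes the common reading of the base specification's guidance: the exit latency seen by an endpoint is the largest L1 exit latency of any link on the path, plus roughly 1 microsecond for each switch between the Root Port and the Endpoint. The exact algorithm Windows Vista uses is not given here, so the constant and both function names are assumptions.

```python
SWITCH_DELAY_NS = 1000  # assumed extra L1 exit delay per switch on the path


def l1_exit_latency_ns(link_latencies_ns) -> int:
    """Path latency, given the L1 exit latency of each link from Root Port to Endpoint."""
    if not link_latencies_ns:
        raise ValueError("at least one link is required")
    switches = len(link_latencies_ns) - 1   # N links in series imply N - 1 switches
    return max(link_latencies_ns) + SWITCH_DELAY_NS * switches


def l1_allowed(link_latencies_ns, endpoint_acceptable_ns: int) -> bool:
    """True if the path latency fits within what the endpoint says it can accept."""
    return l1_exit_latency_ns(link_latencies_ns) <= endpoint_acceptable_ns
```

An understated L1 exit latency in any Link Capabilities register would make this check pass when it should fail, which is why accurate reporting matters.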
Active State Power Management
System power policy
ASPM settings are linked to overall system
power policy settings in the operating system
Windows Vista power policy settings allow
for negotiation of these ASPM states
Off (L0)
Moderate Power Savings (L0s)
Maximum Power Savings (L0s or L1)
Active State Power Management
System power policy
Windows Vista default ASPM power policy settings
Power source   High performance   Balanced                         Power saver
AC             Off                Moderate power savings (L0s)     Maximum power savings (L0s/L1)
DC             Off                Maximum power savings (L0s/L1)   Maximum power savings (L0s/L1)
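The default policy settings above can be expressed as a small lookup. The dictionary keys and the function name are hypothetical; the values are taken directly from the table.

```python
# (power source, power plan) -> ASPM policy setting, as listed in the defaults table.
DEFAULT_ASPM_POLICY = {
    ("AC", "High performance"): "Off",
    ("AC", "Balanced"): "Moderate",
    ("AC", "Power saver"): "Maximum",
    ("DC", "High performance"): "Off",
    ("DC", "Balanced"): "Maximum",
    ("DC", "Power saver"): "Maximum",
}

# Link states each policy setting allows the OS to negotiate.
ALLOWED_STATES = {
    "Off": frozenset(),
    "Moderate": frozenset({"L0s"}),
    "Maximum": frozenset({"L0s", "L1"}),
}


def allowed_link_states(power_source: str, plan: str) -> frozenset:
    """Link power states permitted by the default policy for this source and plan."""
    return ALLOWED_STATES[DEFAULT_ASPM_POLICY[(power_source, plan)]]
```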
Active State Power Management
System-level controls
Windows Vista enables ASPM based on
two mechanisms
The version of PCI Express Base
Specification with which PCI Express devices
in the system comply
Platform firmware override mechanisms
Active State Power Management
System-level controls
Microsoft encountered some devices that comply with
PCI Express Base Specification 1.0 but did not implement
ASPM correctly
Too many broken devices to enable ASPM by default on all PCI
Express devices
Device PCI Express Base Specification Revision
Compliance determines whether ASPM is enabled
by default
Devices which support revision 1.1 have ASPM enabled by default
Role-based Error Reporting capability bit in the Device
Capabilities register used to determine 1.1 revision compliance
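The revision check described above reduces to testing a single capability bit. The sketch assumes Role-Based Error Reporting is bit 15 of the Device Capabilities register, per my reading of the base specification; the function name is hypothetical.

```python
ROLE_BASED_ERROR_REPORTING = 1 << 15  # assumed bit position in Device Capabilities


def aspm_enabled_by_default(device_caps: int) -> bool:
    """Treat the device as revision 1.1 compliant if the capability bit is set."""
    return bool(device_caps & ROLE_BASED_ERROR_REPORTING)
```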
Active State Power Management
Firmware override mechanisms
System BIOS may also control ASPM operation
on Windows Vista
The BIOS may enable ASPM on pre-1.1 devices via
boot configuration
Windows Vista may override this setting
Based on the result of the device version and device
.inf file opt-in/opt-out directives
The BIOS may disable ASPM system-wide
Microsoft has proposed a new ACPI flag to allow firmware
to convey to OSPM that ASPM should not be enabled on
a non-compliant platform
Active State Power Management
Device level ASPM controls
For systems with pre-1.1 hardware, an
“opt-in” flag has been defined to allow
bypassing the pre-1.1 check
Skipping this check allows specific devices
known to work to use ASPM
A device “opt-out” mechanism allows a driver
to specify that a device it controls does not
properly support ASPM
This targets post-1.1 devices
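Taken together, the system-level and device-level controls suggest a decision along the following lines. The order of precedence shown (firmware disable first, then opt-out, then revision or opt-in) is an assumption made for illustration, and the parameter names are hypothetical.

```python
def aspm_permitted(is_1_1_compliant: bool,
                   opt_in: bool = False,
                   opt_out: bool = False,
                   firmware_disabled: bool = False) -> bool:
    """Rough model of whether ASPM may be enabled for one device."""
    if firmware_disabled:      # platform says ASPM must not be enabled at all
        return False
    if opt_out:                # driver says the device does not handle ASPM properly
        return False
    # Revision 1.1 devices qualify by default; older ones need the opt-in flag.
    return is_1_1_compliant or opt_in
```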
Active State Power Management
Device level ASPM controls
This value can be populated by the device driver
INF file at install time
The machine.inf file contains a section to set the value
Device INFs need include only this section by
using Include and Needs directives to get the
desired behavior
[PciASPMOptIn]
AddReg=PciASPMOptIn.RegHW
[PciASPMOptIn.RegHW]
HKR,e5b3b5ac-9725-4f78-963f-03dfb1d828c7,ASPMOptIn,0x10001,1
[PciASPMOptOut]
AddReg=PciASPMOptOut.RegHW
[PciASPMOptOut.RegHW]
HKR,e5b3b5ac-9725-4f78-963f-03dfb1d828c7,ASPMOptOut,0x10001,1
Active State Power Management
Device level ASPM controls
Enabling ASPM on Pre-1.1 Devices
The driver developer must place the following
entry in the device driver’s INF file
[DDInstall.HW]
Include=machine.inf
Needs=PciASPMOptIn
Active State Power Management
Device level ASPM controls
Disabling ASPM
The driver developer must place the following
entry in the device driver’s INF file
[DDInstall.HW]
Include=machine.inf
Needs=PciASPMOptOut
Message Signaled Interrupts
MSI/MSI-X overview
Device drivers should opt-in to get MSI and MSI-X
messages assigned to the device
Device drivers must use the IoConnectInterruptEx
call to attach a message service routine (MSR) for
these messages
Windows Vista allows 16 MSI messages and up to 2048
MSI-X messages per device.
It is possible to attach separate MSRs for
each message
See whitepaper for detailed samples
Message Signaled Interrupts
When fewer messages are available
PCI driver programs all assigned MSI-X
messages to a device MSI-X table
When a device is assigned fewer messages than
requested, the PCI driver fills the table
with duplicate messages
Single MSI-X message programmed for each entry
All assigned messages programmed in sequence,
then fill remaining set with first message
Driver can change the MSI-X table entries and
disable/enable them as it deems appropriate
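The fill rule above (assigned messages in sequence, then the first message repeated) can be sketched as follows. The function name is hypothetical, and messages are represented as plain labels rather than address/data pairs.

```python
def fill_msix_table(assigned, table_size: int) -> list:
    """Program assigned messages in order, then pad the table with the first one."""
    if not assigned:
        raise ValueError("at least one message must be assigned")
    if len(assigned) > table_size:
        raise ValueError("more messages than table entries")
    return list(assigned) + [assigned[0]] * (table_size - len(assigned))
```

With a single assigned message every entry ends up identical, which matches the single-message case on the slide.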
Call To Action
Consider PCI resource allocation when
designing firmware for platforms that are
running Windows operating systems
Implement and validate ASPM in your
PCI Express devices
Test and validate native PCI Express
features and firmware with Windows Vista
Implement MSI aware device drivers
Additional Resources
Web Resources
Whitepapers
http://www.microsoft.com/whdc/system/bus/pci/default.mspx
Related sessions
CPA002 ACPI in Windows Vista
CPA060 Kernel Plug and Play in Windows Vista
PCI support
Send e-mail to Microsoft PCI Express Support at
pciesup@microsoft.com
Backup
PCI Express Firmware
ACPI _OSC method
Required for host bridges that originate a PCI
Express hierarchy
_OSC must be present in order for Windows
Vista to enable PCI Express features
If _OSC is not present, Windows Vista will not
assume control of any PCI Express features
Essentially runs in Windows XP mode
In Device Manager, PCI Express root ports will
show up as PCI to PCI bridges
PCI Device Resources
Update PCI _DSM to ignore boot config
UUID: E5C937D0-3553-4d7a-9117-EA4D19C3434D
Revision   Function   Description
1          1          PCI Express Slot Information
1          2          PCI Express Slot Number
1          3          Vendor-specific Token ID
1          4          PCI Bus Capabilities
1          5          Ignore PCI Boot Configuration
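An operating system typically discovers which _DSM functions a platform implements by evaluating function 0, which by ACPI convention returns a buffer in which bit n indicates support for function n. The sketch below shows that check for the proposed function 5. The function name is hypothetical, and the buffer values in any usage are made up for illustration.

```python
IGNORE_BOOT_CONFIG_FUNCTION = 5  # function index listed in the table above


def dsm_supports(function_mask: bytes, function_index: int) -> bool:
    """True if the function-0 result buffer has the bit for function_index set."""
    byte_index, bit = divmod(function_index, 8)
    if byte_index >= len(function_mask):
        return False
    return bool(function_mask[byte_index] >> bit & 1)
```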
PCI Device Resources
Firmware recommendations
Reserve non-conflicting resources above 4 GB
in the _CRS method of a PCI root bus
Only one PCI root bus may have resources that span
the 4 GB boundary
Use a QWORD memory descriptor in the _CRS method
of a PCI root bus to define a memory range
This range is then available as a PCI device memory resource
to the entire hierarchy that emanates from the root bus
Windows Vista uses this range
If the processor and operating system version allow it
PCI Device Resources
Firmware recommendations
Assign boot configurations for PCI devices
below 4 GB to provide compatibility with
Windows XP and Windows Server 2003
Implement the new _DSM method to
allow Windows Vista to ignore PCI device
boot configurations
This ensures the most flexible resource
allocation on Windows Vista
Active State Power Management
Firmware override mechanisms
Proposed IAPC_BOOT_ARCH flag
Columns: BOOT_ARCH field, bit length, bit offset, description
LEGACY_DEVICES, 1, 0
If set, indicates that the motherboard supports user-visible devices on the LPC or ISA
bus. User-visible devices are devices that have end-user accessible connectors (for
example, LPT port), or devices for which the OS must load a device driver so that an
end-user application can use a device. If clear, the OS may assume there are no such
devices and that all devices in the system can be detected exclusively via industry
standard device enumeration mechanisms (including the ACPI namespace).
8042, 1, 1
If set, indicates that the motherboard contains support for a port 60 and 64 based
keyboard controller, usually implemented as an 8042 or equivalent micro-controller.
VGA Not Present, 1, 2
If set, indicates to OSPM that it must not blindly probe the VGA hardware (that responds
to MMIO addresses A0000h-BFFFFh and IO ports 3B0h-3BBh and 3C0h-3DFh) that
may cause machine check on this system. If clear, indicates to OSPM that it is safe to
probe the VGA hardware.
MSI Not Supported, 1, 3
If set, indicates to OSPM that it must not enable Message Signaled Interrupts (MSI) on
this platform.
PCIe ASPM Controls, 1, 4
If set, indicates to OSPM that it must not enable ASPM on this platform.
Reserved, 11, 5
Must be 0.
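The flag layout above maps naturally onto a bit-flag type. This sketch decodes a 16-bit IAPC_BOOT_ARCH value using the bit offsets from the table; the class, member, and function names are hypothetical.

```python
from enum import IntFlag


class BootArch(IntFlag):
    """IAPC_BOOT_ARCH bits, using the offsets listed in the table."""
    LEGACY_DEVICES = 1 << 0
    I8042 = 1 << 1
    VGA_NOT_PRESENT = 1 << 2
    MSI_NOT_SUPPORTED = 1 << 3
    PCIE_ASPM_CONTROLS = 1 << 4


def os_may_enable(flags: int) -> dict:
    """Which features the OS is still allowed to enable on this platform."""
    return {
        "msi": not bool(flags & BootArch.MSI_NOT_SUPPORTED),
        "aspm": not bool(flags & BootArch.PCIE_ASPM_CONTROLS),
    }
```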
© 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.