[cgl_discussion] Project Review: HT Support
venkatesh.pallipadi at intel.com
Fri Oct 4 13:34:07 PDT 2002
Requirements related to HT Support project
Requirement 2.8: Hyper-Threading of CPUs
Requirement 6.9: Process Affinity
How HT Support meets the CGL requirements
This project adds various HT related patches required for:
- enabling/disabling HT in the Linux kernel,
- improving the performance of HT-enabled systems.
This project also adds:
- /proc based APIs for cpu affinity.
Project design information
The project is a collection of HT-related patches for enabling HT and for
improving the performance of HT-based systems. Brief description of the
patches:
Enabling/Disabling/Identification of HT:
Users can choose to enable or disable HT at boot time. Once the system has
initialized, however, the setting cannot be changed at run time. During boot,
the BIOS provides processor information to the OS in different formats, such
as the MP table and the ACPI table. Of these, the ACPI table contains
information about Hyper-Threaded (logical) processors, while the MP table
contains information only about processor packages and not about logical
processors. Previously, Linux used the MP table to identify the processors in
the system. To detect and enable HT processors, the kernel now has to read
the processor information from the ACPI table instead of the MP table. HT is
enabled by default. However, it can be disabled with the "noht" kernel
parameter at boot time.
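As a rough illustration, a script could check whether the running system was
booted with "noht" by inspecting the kernel command line. This is only a
sketch: it assumes the command line is readable from /proc/cmdline, and the
helper names are hypothetical and not part of the patch.

```python
def ht_disabled_by_cmdline(cmdline):
    """Return True if the standalone word "noht" appears on the command line."""
    # Split on whitespace so that a longer parameter that merely contains
    # the substring "noht" does not count as a match.
    return "noht" in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the boot command line (assumes a Linux system with /proc mounted)."""
    with open(path) as f:
        return f.read()
```

For example, ht_disabled_by_cmdline(read_cmdline()) would report whether the
parameter was passed at boot. It says nothing about whether the hardware or
BIOS supports HT in the first place.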
An additional field, "cpu_package", is added under each processor entry in
/proc/cpuinfo to distinguish between physical and HT processors. The
"cpu_package" field has the same value for all logical processors in the
same package. Sample (processor,cpu_package) pairs-
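To show how the "cpu_package" field might be consumed, here is a minimal
sketch that groups logical processors by package. It assumes each entry in
/proc/cpuinfo carries "processor : N" and "cpu_package : M" lines in the
usual "key : value" layout. The sample text and its numbering are invented
for the sketch and are not output from the patch.

```python
from collections import defaultdict


def group_by_package(cpuinfo_text):
    """Map each cpu_package value to the list of logical processors in it."""
    packages = defaultdict(list)
    current = None  # logical processor number of the entry being parsed
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "cpu_package" and current is not None:
            packages[int(value)].append(current)
    return dict(packages)


# Hypothetical /proc/cpuinfo excerpt: two packages, two logical CPUs each.
HYPOTHETICAL_CPUINFO = """\
processor   : 0
cpu_package : 0

processor   : 1
cpu_package : 0

processor   : 2
cpu_package : 1

processor   : 3
cpu_package : 1
"""
```

Logical processors that share a package value are HT siblings; a package
with a single entry is either a non-HT processor or one with HT disabled.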
Synchronous access to shared resources:
mtrr is a shared resource within a cpu_package, so additional
synchronization is needed when mtrr is accessed from processors belonging to
the same package.
Microcode is another shared resource, and additional synchronization is
needed in the microcode update driver.
Offsetting the user stack address:
One of the optimizations for an HT-based system is to offset the user stack
address, so as to minimize the L1 cache misses caused by two logical
processors running on the same physical package. The offset can be a
pseudo-random number that is also a multiple of the reference offset or 128
bytes. Usually, this number is chosen to be around 8KB.
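The offset computation can be sketched as follows. This is a user-space
illustration and not the kernel code: it assumes the offsets are drawn in
128-byte steps from a window of 8KB, which is only one reading of the
description above, and the function names are hypothetical.

```python
import random

OFFSET_STEP = 128      # bytes; the multiple named in the text
OFFSET_WINDOW = 8192   # bytes; assumed reading of "around 8KB"


def stack_offset(rng):
    """Pick a pseudo-random offset that is a multiple of OFFSET_STEP."""
    slots = OFFSET_WINDOW // OFFSET_STEP  # 64 possible offsets
    return rng.randrange(slots) * OFFSET_STEP


def offset_stack_top(stack_top, rng):
    """Lower the initial stack address by the chosen offset.

    Assumes a stack that grows toward lower addresses, as on x86.
    """
    return stack_top - stack_offset(rng)
```

With different offsets, two processes running on sibling logical processors
are less likely to start their stacks at addresses that map to the same
cache lines.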
Processor affinity:
An API such as processor affinity is needed to get maximum performance out
of an HT-enabled system. Using these APIs, an application can bind some of
its threads to specific packages, or to specific logical processors across
packages.
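The two binding strategies mentioned above can be expressed as bitmasks. The
sketch below assumes that bit N of a mask stands for logical processor N and
that the package layout is already known as a mapping from package number to
logical processors. Both the helper names and the example layout are
hypothetical.

```python
def one_cpu_per_package(packages):
    """Mask selecting the lowest-numbered logical CPU of every package."""
    mask = 0
    for cpus in packages.values():
        mask |= 1 << min(cpus)
    return mask


def whole_package(packages, package_id):
    """Mask selecting every logical CPU of a single package."""
    mask = 0
    for cpu in packages[package_id]:
        mask |= 1 << cpu
    return mask
```

Spreading threads with one_cpu_per_package avoids having two of them compete
for the execution resources of the same package, while whole_package keeps
cooperating threads on siblings that share a cache.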
The /proc-based cpu_affinity patch, which uses the standard process
migration interface, provides a simple interface for processor affinity. Its
advantage is that it can be used directly from the command line and from
scripts.
Sample usage:
  Get affinity: "cat /proc/1/affinity"
  Set affinity: "echo 0000000f > /proc/1/affinity"
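A script using this interface needs to convert between the hexadecimal
string and a list of processors. The following is a minimal sketch, assuming
the value is a hex bitmask in which bit N stands for logical processor N and
that it is printed as eight hex digits as in the sample above. The helper
names are hypothetical.

```python
def mask_to_cpus(hex_mask):
    """Convert a hex affinity string such as "0000000f" to a CPU list."""
    value = int(hex_mask.strip(), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpus_to_mask(cpus, width=8):
    """Convert a CPU list to a zero-padded hex string of the given width."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    return format(value, "0{}x".format(width))
```

Under these assumptions, the sample value "0000000f" corresponds to logical
processors 0 through 3, and the string returned by cpus_to_mask is what
would be written to /proc/<pid>/affinity.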
The kernel patch for HT support is located in the cgl development tree