Implementation of integrated graphics


Multicore/manycore processors 2
Sima Dezső
October 2013
Version 1.2
Overview
• 1. The necessity of the emergence of multicore processors
• 2. Homogeneous multicore processors
  • 2.1 Conventional multicore processors
  • 2.2 Manycore processors
• 3. Heterogeneous multicore processors
  • 3.1 Master/slave multicore processors
  • 3.2 Add-on multicore processors
• 4. Outlook
3. Heterogeneous multicore processors
3. Heterogeneous multicore processors (1)
[Figure: classification tree.
Multicore processors
  Homogeneous multicores
    Conventional MC processors (2 ≤ n ≤ 8 cores): desktops, servers; general purpose computing
    Manycore processors (with >8 cores): prototypes/experimental systems
  Heterogeneous multicores
    Master/slave architectures: MM/3D/HPC; production stage
    Add-on architectures (MPC: CPU + GPU): HPC; near future]
Figure 3.1: Main classes of multicore processors
3.1 Heterogeneous master/slave multicore processors
The Cell processor
3.1 Heterogeneous master/slave multicore processors - The Cell (1)
Cell BE
• Joint product of Sony, IBM and Toshiba
• Target: games/multimedia (Playstation 3, PS3) and HPC applications
  (QS2x Blade Server family, 2 Cell BE/blade)
• History:
  Summer 2000: Basic architecture defined
  02/2006: Cell Blade QS20
  08/2007: Cell Blade QS21
  05/2008: Cell Blade QS22
3.1 Heterogeneous master/slave multicore processors - The Cell (2)
SPE: Synergistic Processing Element
SPU: Synergistic Processor Unit
SXU: Synergistic Execution Unit
LS: Local Store of 256 KB
SMF: Synergistic Mem. Flow Unit
EIB: Element Interconnect Bus
PPE: Power Processing Element
PPU: Power Processing Unit
PXU: POWER Execution Unit
MIC: Memory Interface Contr.
BIC: Bus Interface Contr.
XDR: Rambus DRAM
Figure 3.1.1: Block diagram of the Cell BE [1]
3.1 Heterogeneous master/slave multicore processors - The Cell (3)
Figure 3.1.2: The Cell BE die (221 mm2, 234 mtrs) [1]
3.1 Heterogeneous master/slave multicore processors - The Cell (4)
Figure 3.1.3: The Cell BE die: EIB [1]
3.1 Heterogeneous master/slave multicore processors - The Cell (5)
Figure 3.1.4: Operating principle of the EIB [1]
3.1 Heterogeneous master/slave multicore processors - The Cell (6)
Figure 3.1.5: Concurrent transfers on the EIB [1]
3.1 Heterogeneous master/slave multicore processors - The Cell (7)
Example: running a complex application (digital TV decoding) on the Cell processor [2]
3.1 Heterogeneous master/slave multicore processors - The Cell (8)
Performance of the Cell and NIK's participation in the Cell performance studies
• Performance @ 3.2 GHz:
QS21 peak SP FP: 409.6 GFlops (3.2 GHz x 2x8 SPE x 2x4 SP FP/cycle)
• Cell BE - NIK
2007: Faculty Award (Cell 3D app./Teaching)
2008: IBM-NIK Research Cooperation Agreement: performance studies
• IBM Böblingen Lab
• IBM Austin Lab
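The QS21 peak figure quoted on this slide is a plain product of clock rate, SPE count and per-cycle throughput. A minimal Python sketch of that arithmetic follows. The function name `peak_gflops` and its parameter names are illustrative and not from the source. Reading "2x8 SPE" as 2 Cell BE chips per blade with 8 SPEs each, and "2x4 SP FP/cycle" as 4 SIMD lanes with 2 operations per lane (multiply-add), is an assumption, although the first part agrees with the "2 Cell BE/blade" noted for the QS2x family.

```python
def peak_gflops(clock_ghz, chips, spes_per_chip, simd_lanes, ops_per_lane):
    """Peak single-precision GFLOPS = clock x number of SPEs x FLOPs per SPE per cycle."""
    spes = chips * spes_per_chip                # 2 x 8 = 16 SPEs per blade (assumed reading)
    flops_per_cycle = simd_lanes * ops_per_lane  # 4 x 2 = 8 SP FLOPs per SPE per cycle
    return clock_ghz * spes * flops_per_cycle


qs21_peak = peak_gflops(clock_ghz=3.2, chips=2, spes_per_chip=8,
                        simd_lanes=4, ops_per_lane=2)
print(f"QS21 peak SP FP: {qs21_peak:.1f} GFLOPS")  # QS21 peak SP FP: 409.6 GFLOPS
```

This is a theoretical peak only; it says nothing about sustained performance such as the Linpack results quoted for the Roadrunner.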
3.1 Heterogeneous master/slave multicore processors - The Cell (9)
The Roadrunner
6/2008: International Supercomputing Conference, Dresden
On the list of the world's 500 fastest computers (Top500):
1. Roadrunner
1 Petaflops (10^15 Flops) sustained performance (Linpack)
3.1 Heterogeneous master/slave multicore processors - The Cell (10)
Figure 3.1.6: The world's fastest computer: IBM Roadrunner (Los Alamos 2008) [3]
3.1 Heterogeneous master/slave multicore processors - The Cell (11)
Figure 3.1.7: Main features of the Roadrunner [1]
3.2 Heterogeneous add-on multicore processors
3.2 Heterogeneous add-on multicore processors (1)
[Figure: the multicore classification tree of Figure 3.1, repeated]
Figure 3.2.1: Main characteristics of multicore processors
3.2 Heterogeneous add-on multicore processors (2)
Principle of add-on execution on GPGPUs (assuming the simplest (batched) organization) [4]
[Figure: the Host launches the data-parallel programs kernel0<<<>>>() and kernel1<<<>>>() one after the other for execution on the Device]
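The batched host/device organization can be mimicked in plain Python as a rough sketch. This is not CUDA code: the `launch` helper is a hypothetical stand-in for a kernel launch of the form `kernel<<<...>>>()`, the list copies stand in for host-to-device and device-to-host transfers, and the bodies of `kernel0` and `kernel1` are invented for illustration. The sketch shows only the control flow: the host runs serial code, hands a data-parallel kernel to the device, waits for the result, and then launches the next kernel.

```python
def launch(kernel, device_data):
    """Hypothetical stand-in for kernel<<<...>>>(): apply the kernel once per
    element index, as if one device thread handled one element."""
    return [kernel(i, device_data) for i in range(len(device_data))]


def kernel0(i, data):
    # Illustrative data-parallel step: scale each element.
    return 2 * data[i]


def kernel1(i, data):
    # Illustrative second step: add the element's own index.
    return data[i] + i


def run_on_host(host_data):
    # Serial host code: copy the input to the (simulated) device.
    device_data = list(host_data)
    # Kernels run one after the other; the host waits for each batch to finish.
    device_data = launch(kernel0, device_data)
    device_data = launch(kernel1, device_data)
    # Copy the result back to the host.
    return list(device_data)


result = run_on_host([1, 2, 3, 4])
print(result)  # [2, 5, 8, 11]
```

On a real GPGPU the per-element calls would execute concurrently on the device; the sequential list comprehension here only models which work belongs to which side.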
3.2 Heterogeneous add-on multicore processors (3)
Remark on the operating principle
• Heterogeneous add-on multicore processors: processing accelerators
• Precursor in terms of the operating principle: heterogeneous add-on coprocessor systems
Examples: early personal computers with floating-point coprocessors
Intel 286 + 287
386 + 387
The Intel 486 already had its own on-chip floating-point unit (FPU)
(except for the SX and SL models)
3.2 Heterogeneous add-on multicore processors (4)
Most important implementations of heterogeneous add-on multicore processors
Heterogeneous add-on multicore processors
• Integrated graphics
• Smartphones/tablets
3.2.1 The emergence of integrated graphics
3.2.1 The emergence of integrated graphics (1)
Switching to English-language slides
3.2.1 The emergence of integrated graphics (2)
Implementation of integrated graphics
• In the north bridge
  Implementations about 1999 - 2009
• In a multi-chip processor package, on a separate die
  (the CPU and the GPU are on separate dies, mounted into a single package)
  Intel’s Havendale (DT) and Auburndale (M) (scheduled for 1H/2009 but cancelled)
  Clarkdale (DT, 1/2010) and Arrandale (M, 1/2010)
• On the processor die
  Intel’s Sandy Bridge (1/2011) and Ivy Bridge (4/2012)
  AMD’s Swift (scheduled for 2009 but canceled)
  AMD’s Bobcat-based APUs (M, 1/2011), Llano APUs (DT, 6/2011), Trinity APUs (DT, Q4/2012)
[Figure: block diagrams of the three options, showing the processor (P), north bridge (NB), integrated graphics (IG/GPU), memory, and south bridge or peripheral controller]
3.2.1 The emergence of integrated graphics (4)
Example 1: Intel’s Havendale (DT) and Auburndale (M) multi-chip CPU/GPU processor plans [5]
• Revealed in 9/2007.
• Scheduled for 1H/2009 but cancelled about 1/2009.
• Both parts were based on the second-generation Nehalem (Lynnfield) architecture (45 nm), as shown below.
[Figure: block diagrams of two processors on the same LGA 1160 platform.
Lynnfield processor (monolithic die): 4 cores/8 threads, 8M cache, DDR3 IMC, PCI-E graphics, DMI link to the Ibexpeak PCH (I/O control processors, I/O functions: PCIe, SATA, NVRAM, etc.; no integrated graphics).
Havendale processor (multi-chip package, MCP): 2 cores/4 threads, 4M cache, plus a GPU with DDR3 IMC and PCI-E graphics; DMI and Display Link to the Ibexpeak PCH (I/O control processors, I/O functions: PCIe, SATA, NVRAM, etc.; display outputs: analog VGA; digital SDVO, HDMI, Display Port, DVI).
Schedule: 2H ’08 first samples, 1H ’09 production; TDP < 95 W]
3.2.1 The emergence of integrated graphics (5)
Example 2: Intel’s Westmere-based multi-chip CPU/GPU processors (2010)-1 [6]
Clarkdale (desktop)
Arrandale (mobile)
3.2.1 The emergence of integrated graphics (6)
Positioning of Clarkdale (DT) and Arrandale (M) in Intel’s roadmap [7]
3.2.1 The emergence of integrated graphics (7)
Single PCH for Intel’s Westmere-based multi-chip CPU/GPU processors (2010) [7]
PCH
(Platform Controller Hub)
3.2.1 The emergence of integrated graphics (8)
Moving integrated graphics (IGFX) from the north bridge to the processor [7]
[Figure: two platform diagrams, each labeled "(Dedicated graphics via graphics card)"]
3.2.1 The emergence of integrated graphics (9)
Implementation of commercial graphics on the processor die
Implementation of integrated graphics
• In the north bridge
  Implementations around 1999 - 2009
• In a multi-chip processor package, on a separate die
  (the CPU and the GPU are on separate dies, mounted into a single package)
  Intel’s Havendale (DT) and Auburndale (M) (scheduled for 1H/2009 but cancelled)
  Clarkdale (DT, 1/2010) and Arrandale (M, 1/2010)
• On the processor die
  Intel’s Sandy Bridge (1/2011) and Ivy Bridge (4/2012)
  AMD’s Swift (scheduled for 2009)
  AMD’s Bobcat-based APUs (M, 1/2011), Llano APUs (DT, 6/2011), Trinity APUs (DT, Q4/2012)
3.2.2 Intel’s Sandy Bridge
3.2.2 Intel’s Sandy Bridge (1)
3.2.2 Intel’s Sandy Bridge [8]
Key microarchitecture features of the Sandy Bridge vs the Nehalem
3.2.2 Intel’s Sandy Bridge (2)
Die plot of the 4C Sandy Bridge processor [9]
[Figure: annotated die plot.
Sandy Bridge 4C: 32 nm, 995 mtrs/216 mm2, ¼ MB L2/core, 8 MB L3.
Per core: 256 KB L2 (9 clk), 32K L1D (3 clk), Hyperthreading, AES instr., AVX 256 bit, VMX unrestrict., 4 operands, 20 mm2/core.
Other annotations: L3 (25 clk); 256 b/cycle ring architecture; DDR3-1600; PCIe 2.0; "@ 1.0 1.4 GHz (connected to L3)"]
3.2.2 Intel’s Sandy Bridge (3)
Block diagram of Intel’s Sandy Bridge with 6 Series PCH [10]
Core i3-21xx, 2C, 2/2011
Core i5-23xx/24xx/25xx, 4C, 1/2011
Core i7-26xx, 4C, 1/2011
Intel 6 series PCH (1)
(1) Except the P67, which does not provide a display controller in the PCH
3.2.3 Intel’s Ivy Bridge
3.2.3 Intel’s Ivy Bridge (1)
3.2.3 Intel’s Ivy Bridge [11]
Key microarchitecture features of the Ivy Bridge vs the Sandy Bridge
3.2.3 Intel’s Ivy Bridge (2)
Contrasting the die plots of Ivy Bridge vs. Sandy Bridge (at the same feature size)-1 [12]
Ivy Bridge-DT: 22 nm, 1480 mtrs, 160 mm2
Sandy Bridge-DT: 32 nm, 995 mtrs, 216 mm2
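The figures given with the two die plots allow a quick transistor-density comparison. The sketch below is only arithmetic on the numbers quoted on the slide, reading "mtrs" as million transistors; the helper name `density_mtrs_per_mm2` is illustrative and not from the source.

```python
def density_mtrs_per_mm2(mtrs, area_mm2):
    """Transistor density in million transistors per mm2."""
    return mtrs / area_mm2


ivy = density_mtrs_per_mm2(1480, 160)    # Ivy Bridge-DT, 22 nm
sandy = density_mtrs_per_mm2(995, 216)   # Sandy Bridge-DT, 32 nm
ideal_shrink = (32 / 22) ** 2            # ideal area scaling from 32 nm to 22 nm

print(f"Ivy Bridge-DT:   {ivy:.2f} mtrs/mm2")
print(f"Sandy Bridge-DT: {sandy:.2f} mtrs/mm2")
print(f"Density ratio:   {ivy / sandy:.2f}x (ideal shrink: {ideal_shrink:.2f}x)")
```

With these inputs the density roughly doubles, somewhat below the ideal area scaling for a 32 nm to 22 nm shrink.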
3.2.3 Intel’s Ivy Bridge (3)
Contrasting the die plots of Ivy Bridge vs Sandy Bridge (at the same feature size)-2 [12]
Note
In Ivy Bridge, Intel placed much more emphasis on graphics processing than in
Sandy Bridge, in order to compete with AMD’s graphics superiority.
3.2.4 AMD’s Swift Fusion APU plan
3.2.4 AMD’s Swift Fusion APU plan (1)
3.2.4 AMD’s Swift Fusion APU plan
Preliminaries
In 10/2006 AMD acquired the graphics firm ATI, and on the same day announced that
“AMD plans to create a new class of x86 processors that integrate the central processing unit
(CPU) and graphics processing unit (GPU) at the silicon level, codenamed ‘Fusion’.” [13]
Remark
Although the above statement designates the silicon-level integration of the CPU and GPU
as the Fusion initiative, some other AMD publications use the term Fusion technology for both
the package-level and the silicon-level integration of the CPU and GPU, as shown
in the next figure [14].
3.2.4 AMD’s Swift Fusion APU plan (2)
Extended interpretation of the term Fusion technology in some AMD publications [14]
Despite this ambiguity, AMD subsequently used the term Fusion mostly for the
silicon-level integration of the CPU and the GPU.
3.2.4 AMD’s Swift Fusion APU plan (3)
• In 12/2007, at their Financial Analyst Day, AMD coined a new term by designating
their processors that implement the Fusion concept as APUs (Accelerated Processing Units).
• At the same event AMD also announced their first APU family, called the Swift family [15].
3.2.4 AMD’s Swift Fusion APU plan (4)
• In 11/2008 again at their Financial Analyst Day AMD postponed the introduction of
Fusion-based APU processors until the company transitions to the 32 nm technology [16] [17].
No Swift APU!
3.2.4 AMD’s Swift Fusion APU plan (5)
Remark
Intel made a similar move with their 45 nm Havendale (DT) and Auburndale (M)
in-package integrated multi-chip CPU+GPU projects.
As industry sources leaked in 1/2009, Intel canceled their 45 nm multi-chip processor
plans in favor of 32 nm multi-chip processors to be introduced in Q1/2010 [18].
3.2.5 AMD’s K12 (Llano)-based APU lines
3.2.5 AMD’s K12 (Llano)-based APU lines (1)
3.2.5 AMD’s Llano-based APU lines [19]
• Introduced: 6/2011.
• The Llano line belongs to the Fusion APU (Accelerated Processing Unit) series, as it includes,
besides a number of CPU cores, a GPU to accelerate vision computing (graphics and media).
• Processors of the Llano lines have up to 4 CPU cores and a GPU.
AMD also sells Llano-based desktop lines with the GPU disabled;
these lines are branded as Athlon II X4/X2 or Sempron.
• 32 nm technology, 228 mm2, 1450 mtrs.
3.2.5 AMD’s K12 (Llano)-based APU lines (2)
Die plot of the Llano processor [20]
3.2.5 AMD’s K12 (Llano)-based APU lines (3)
Example: AMD’s Llano-based A-series mobile lines [21]
3.2.5 AMD’s K12 (Llano)-based APU lines (4)
Conceptual difference between AMD’s Fusion APUs and Intel’s Sandy Bridge CPUs [22]
3.2.5 AMD’s K12 (Llano)-based APU lines (5)
AMD’s Llano APU processor with the A75 FCH [23]
Lynx platform
FCH: Fusion Control Hub
3.2.6 AMD’s Piledriver-based Trinity desktop APU line
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (1)
3.2.6 AMD’s Piledriver-based Trinity desktop APU line
Announced: 6/2012
Launched: 10/2012
The Trinity APU is based on the Piledriver Compute Module, which is a redesign of the ill-fated
Bulldozer Compute Module.
32 nm feature size, 226 mm2, 1.303 billion transistors (almost the same figures as for Llano)
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (2)
The Piledriver Compute Module of Trinity [24]
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (3)
AMD’s Trinity die [29]
[Figure: die plot with 2 Piledriver modules (4 cores) and the GPU marked; 32 nm, 226 mm2, 1.303 billion transistors]
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (4)
Comparing die areas devoted to cores and graphics in AMD’s Llano and Trinity
Llano: quad cores, 32 nm, 228 mm2, 1.450 billion transistors
Trinity: dual modules (quad cores), 32 nm, 226 mm2, 1.303 billion transistors
Source: http://www.tomshardware.com/reviews/a10-4600m-trinity-piledriver,3202-4.html
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (6)
The Comal platform that incorporates the (Piledriver-based) Trinity APU and the
A70M PCH [26]
3.2.7 Smartphones, tablets
3.2.7 Smartphones, tablets
See later, as a separate chapter.
4. Outlook
4. Outlook (1)
Outlook
[Figure: heterogeneous multicores, split into master/slave architectures and add-on architectures, annotated with "multiple CPUs" and "multiple accelerators"]
Figure 4.1: Expected evolution of heterogeneous multicore processors
References
References (1)
[1]: Wright C., Henning P., Bergen B., Roadrunner Tutorial, An Introduction to Roadrunner,
and the Cell Processor, Febr. 7 2008,
http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Roadrunner-tutorial-session-1-web1.pdf
[2]: Blachford N., Cell Architecture Explained, v.02, 2005,
http://www.blachford.info/computer/Cell/Cell2_v2.html
[3]: Ricker T., World's fastest: IBM's Roadrunner supercomputer breaks petaflop barrier using
Cell and Opteron processors, Engadget, June 9 2008,
http://www.engadget.com/2008/06/09/worlds-fastest-ibms-roadrunner-supercomputer-breaks-petaflop/
[4]: NVIDIA CUDA Compute Unified Device Architecture, Programming Guide, Version 1.1,
Nov. 29 2007,
http://moss.csc.ncsu.edu/~mueller/cluster/nvidia/1.1/NVIDIA_CUDA_Programming_Guide_1.1.pdf
[5]: RS – Intel 2009 Desktop Platform Overview, Sept. 2007,
http://pic.xfastest.com/z/INTEL%202009%20%20Overview/2009Overview.ppt
[6]: Smith S.L., Intel Roadmap Overview, IDF 2009, Sept. 22 2009,
http://download.intel.com/pressroom/kits/events/idffall_2009/pdfs/IDF_SSmith_Briefing.pdf
[7]: Smith S.L., 32nm Westmere Family of Processors, 2009,
http://download.intel.com/pressroom/kits/32nm/westmere/32nm_WSM_Press.pdf
[8]: Kahn O., Piazza T., Valentine B.: Technology Insight: Intel Next Generation Microarchitecture
Codename Sandy Bridge, IDF 2010, extreme.pcgameshardware.de/.../281270d1288260884bonusmaterial-pc- games-hardware-12-2010-sf10_spcs001_100.pdf
References (2)
[9]: Intel Sandy Bridge Review, Bit-tech, Jan. 3 2011,
http://www.bit-tech.net/hardware/cpus/2011/01/03/intel-sandy-bridge-review/1
[10]: 2nd Generation Intel Core Processor Family Desktop, Datasheet, Vol.1, Jan. 2011,
http://pdfs.icecat.biz/pdf/28565951-9811.pdf
[11]: George V., Piazza T., Jiang H., Technology Insight: Intel Next Generation Microarchitecture,
Codename Ivy Bridge, IDF 2011, SPCS005
[12]: Athow D., Picture : Ivy Bridge vs Sandy Bridge GPU Die Sizes Compare, ITProPortal,
April 24 2012, http://www.itproportal.com/2012/04/24/picture-ivy-bridge-vs-sandybridge-gpu-die-sizes-compared/
[13]: AMD Completes ATI Acquisition and Creates Processing Powerhouse, Oct. 25 2006,
http://www.amd.com/us/press-releases/Pages/Press_Release_113741.aspx
[14]: AMD Torrenza and Fusion together, Metal Ghost, March 22 2007,
http://www.metalghost.ro/index.php?option=com_content&view=article&id=233:amdtorrenza-and-fusion-together
[15]: Rivas M., AMD 2007 Financial Analyst Day Presentation, Dec. 13 2007
[16]: AMD Financial Analyst Day 2008, Nov. 13 2008,
http://gbcw.wordpress.com/2008/11/13/amd-financial-analyst-day-2008/
[17]: Hruska J., AMD Fusion now pushed back to 2011, Ars Technica, Nov. 14 2008,
http://arstechnica.com/uncategorized/2008/11/amd-fusion-now-pushed-back-to-2011/
References (3)
[18]: Intel cans 45nm “Auburndale” and “Havendale” Fusion CPUs!, Jan. 31 2009,
http://theovalich.wordpress.com/2009/01/31/exclusive-intels-cans-45nm-auburndaleand-havendale-fusion-cpus/
[19]: Wikipedia, Turion, http://en.wikipedia.org/wiki/Griffin_(processor)#Turion_X2_Ultra
[20]: Foley D., AMD’s “LLANO” Fusion APU, Hot Chips 23, Aug. 19 2011,
http://www.hotchips.org/archives/hc23/HC23-papers/HC23.19.9-Desktop-CPUs/HC23.19.930-Llano-Fusion-Foley-AMD.pdf
[21]: AMD A-Series APU, EMEA Press Call, June 7 2011,
http://img.zwame.pt/nemesis11/Amd_A_series/AMD.pdf
[22]: Kirsch N., AMD Llano A-Series APU Sabine Notebook Platform Review, Legit Reviews,
June 13 2011, http://www.legitreviews.com/article/1636/1/
[23]: Chiappetta M., AMD A8-3850 Llano APU and Lynx Platform Preview, Hot Hardware,
June 30 2011, http://hothardware.com/Reviews/AMD-A83850-Llano-APU-and-LynxPlatform-Preview/?page=2
[24]: Walrath J., AMD, Vishera, and Beyond: New Design Philosophy Dictates a Faster Pace,
PC Perspective, July 5 2012, http://www.pcper.com/reviews/Editorial/AMD-Vishera-andBeyond-New-Design-Philosophy-Dictates-Faster-Pace/How-Does-Vishera
[25]: Wasson S., AMD's A10-4600M 'Trinity' APU reviewed, Tech Report, May 16 2012,
http://techreport.com/review/22932/amd-a10-4600m-trinity-apu-reviewed
References (4)
[26]: Paul D., Meet the new AMD APUs Series A-2nd generation “Trinity”, TechNews, May 15 2012,
http://technewspedia.com/meet-the-new-amd-apus-series-a-2-nd-generation-trinity/
[27]: OMAP 5 Mobile Applications Platform, Product Bulletin, Texas Instruments, 2011,
http://www.ti.com/pdfs/wtbu/SWCT010.pdf
[28]: Hibben M., Texas Instruments and the Big Chip Maker Anachronism, Nov. 16 2012,
http://beta.fool.com/markhibben/2012/11/16/texas-instruments-and-big-chip-makeranachronism/16680/
[29]: Shimpi A.L., AMD A10-5800K & A8-5600K Review: Trinity on the Desktop, Part 1,
AnandTech, Sept. 27 2012, http://www.anandtech.com/show/6332/amd-trinity-a105800k-a8-5600k-review-part-1
Thank you for your attention!
3.2.6 AMD’s Piledriver-based Trinity desktop APU line (6)
Trinity’s Unified North Bridge []
http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/54260-amd-trinity-going-mobile-new-apu-4.html
GNB: Graphics North Bridge
RMB: Radeon Memory Bus
http://hothardware.com/Reviews/AMD-Trinity-A104600M-Processor-Review/?page=3
Trinity Unified North Bridge
http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/54260-amd-trinity-going-mobile-new-apu-4.html
The links between each section of the APU follow in the same footsteps as the previous
generation but AMD has refined certain interconnects with the goal of speeding up information
transfers. The AMD Fusion Compute Link is still considered to be a medium bandwidth
connection which manages the complex interaction between the onboard GPU, the CPU’s cache
and the system memory. Unlike in the past, AMD has finally refined this interconnect, giving the
GPU direct access to a coherent memory space while the CPU can now directly access the GPU’s
dedicated framebuffer if needed. This is one of the primary reasons why Trinity’s theoretical
data throughput has jumped from 572 GFLOPS to 736 GFLOPS.
The Radeon Memory Bus on the other hand is the all-important link between the onboard
graphics coprocessor and the primary on-chip memory controller. Rather than acting like a
traffic cop (a la Fusion Compute Link) which tries to direct the flow of information, this memory
bus is all about the GPU having unhindered high bandwidth access to the system’s memory
controllers.
In the previous generations of AMD IGPs, before Llano came around, the Northbridge’s graphics
processor had to jump through a series of hoops before gaining access to onboard memory
which is partially why 128MB of “SidePort” memory was sometimes added. However, the APU’s
single chip, all in one solution allows for the elimination of many potential bottlenecks.
http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/54260-amd-trinity-going-mobile-new-apu-4.html
Trinity
This unit adds virtual address access for discrete graphics, allowing an external GPU to directly
access the same virtual address space as the CPU through page tables. As you can imagine,
this is a key part of the programming model for AMD’s Heterogeneous Systems Architecture
(HSA).
http://www.tomshardware.com/reviews/a10-4600m-trinity-piledriver,3202-4.html