Quad-Core Opteron: architecture and roadmaps
Date: 15.09.2006
A month ago, in mid-August, our readers could read the news about AMD's announcement of the new generation of Opteron server processors and the company's preliminary roadmap for the release of the first Quad-Core Opteron chips. In fact, the first more or less authentic details of the Quad-Core Opteron architecture started appearing much earlier: back at Computex 2006 in June came the very first news that quad-core Opteron processors of stepping F would debut in approximately the first quarter of 2007, built on the 65-nm and probably also the 90-nm process technology. However, even after the official announcement, there was still too little information on the architectural traits of the Quad-Core Opteron chips.
That is why journalists were impatiently looking forward to AMD's Moscow event scheduled for 5 September. This time, Pierre Brunswick, vice president for sales and marketing in Russia and the ex-USSR, Eastern Europe and Turkey, was expected to take part in the press conference together with Giuseppe Amato, technical director of the sales and marketing department for Europe, the Middle East, and Africa. A technical expert is always a welcome guest here, and this time our expectations came true. During the presentation, Mr. Amato shared a lot of interesting and exclusive details on AMD's new chips and initiatives and answered a number of awkward questions, for which we are grateful to him. On the whole, the event was held at a respectable technical level, apart from the generous marketing sprinkled over almost every slide to criticize the major competitor. That is why today we can tell you a lot of new and interesting details, and it is up to you to filter the advertising out of the text.
During the presentation, Mr. Amato touched upon the following key topics:
- AMD's next-generation Opteron processors based on the new platform, in view of further migration to the quad-core architecture;
- Comparison of AMD Opteron and Intel Woodcrest platforms, and the suitability of specific benchmarking techniques and parameters for estimating real performance;
- Advantages of the AMD Virtualization technology;
- Investment protection for end-users.
The report followed that order.
Architectural traits of new-generation AMD Opteron processors
We start with the most interesting part: the makeup of the 4-core AMD processors that carry the working names Santa Rosa and Deerhound. We note straight off that to the question "Will 4-core AMD Opteron processors made on the 90-nm process technology ever be shipped?" Mr. Amato answered unambiguously that end-users would receive 65-nm chips only. Therefore, you can forget the early forecasts about a probable release of 4-core AMD server processors on the 90-nm process technology; to all appearances, chips like these will be used only at early stages for testing purposes.
The main thrust of the part of the report devoted to the architectural features of the future Quad-Core AMD Opteron processors was backward compatibility with the current generation of Socket F chips, as well as novelties that improve the key factor, performance per watt.
According to the presented information, despite the increase in the physical size of the chips and the substantial reorganization of the internal architecture, the new 4-core processors will stay within the TDP range typical of the previous 2-core Opteron chips; that is, the TDP of the new chips is promised to be about 95 W. Along with that, the new processors will support AMD-V, i.e. the AMD Virtualization technology.
Among the key technologies implemented in the new 4-core AMD Opteron processors are:
- Native Quad-Core Design - the "native" quad-core architecture with four cores on a single substrate
- Enhanced AMD PowerNow! - the extended and improved power-saving optimization technology that allows for dynamic reduction of the power consumption by cores – up to 75% in the standby mode
- Direct Connect Architecture – allows effectively eliminating part of the traditional "bottlenecks" of the x86 architecture: direct connection of I/O HyperTransport buses (up to 8 GB/s), real-time communication between processors; integrated memory controller that effectively reduces the latency and affects the performance positively; direct connection to DDR2 memory
- Advanced Process Technology – improved 65-nm process technology that uses SOI (Silicon-on-Insulator); the small leakage currents allow improving the performance-per-watt and reducing heat emission
- 32-byte instruction fetch
- Improved branch prediction mechanism
- Out-of-order command execution
- Dual-thread control of 128-bit SSE instructions
- Up to four double-precision floating-point operations per cycle
- Extensions for processing bit groups (LZCNT/POPCNT)
- Handling SSE extensions (EXTRQ/INSERTQ, MOVNTSD/MOVNTSS)
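To make the bit-manipulation extensions in the list above more concrete, here is a minimal Python sketch of the results that POPCNT (count of set bits) and LZCNT (count of leading zero bits) produce. The function names and the 32-bit default operand width are our own illustrative assumptions; the code models the results only and says nothing about how the hardware computes them.

```python
def popcnt(value: int, width: int = 32) -> int:
    """Return the number of set bits in the low `width` bits of value."""
    mask = (1 << width) - 1
    return bin(value & mask).count("1")


def lzcnt(value: int, width: int = 32) -> int:
    """Return the number of leading zero bits in a `width`-bit operand.

    For a zero operand the result equals the operand width.
    """
    mask = (1 << width) - 1
    return width - (value & mask).bit_length()


if __name__ == "__main__":
    print(popcnt(0b1011))  # three bits are set
    print(lzcnt(1))        # only the lowest of 32 bits is set
```

Without such instructions, both operations are typically done in software with loops or lookup tables, which is why dedicated instructions are interesting for bit-heavy code.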
An additional advantage of the new 4-core processors is a balanced and efficient cache structure: 64 KB of L1 data cache and 64 KB of L1 instruction cache, 512 KB of L2 cache per core, and finally a shared L3 cache of 2 MB (Santa Rosa?) or more (4 MB, Deerhound?) per CPU.
One of the most interesting and vivid slides of the presentation was of course the company's roadmap for further generations of products, showing the specifications not only of the processors but of chipsets and platforms in general.
As you can see, the platform for the new Quad-Core Opteron processors with L3 cache, whose release is planned for next year (to be more precise, the second quarter of 2007), will also offer TCP Offload, Gigabit Ethernet, Serial SCSI, and Serial ATA II with RAID support. The following generation of chips, planned for 2008, will support the Direct Connect Architecture 2.0 (HT 3.0?), a larger cache and a number of other novelties; its platform will also support PCI Express 2, 10 Gigabit Ethernet controllers, etc.
As for the next wave of innovations meant to make the consumer's life easier and improve performance, Mr. Amato dwelled on the specifications of the Torrenza, Trinity, and Raiden technologies. For instance, the Torrenza technology, meant to accelerate data processing, is based on the Direct Connect Computing technology and will be implemented by means of the HTX slot and dedicated hardware accelerators.
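The per-core and shared figures above add up as follows. This is a small back-of-the-envelope sketch: the constants are the sizes quoted in the text, the two L3 sizes are the tentative values the text itself marks with question marks, and the function name is our own.

```python
# Cache sizes quoted in the article, in KB
L1_DATA_KB = 64
L1_INSTR_KB = 64
L2_KB = 512
CORES = 4


def total_cache_kb(l3_mb: int, cores: int = CORES) -> int:
    """Sum raw L1 + L2 capacity over all cores and add the shared L3."""
    per_core_kb = L1_DATA_KB + L1_INSTR_KB + L2_KB
    return cores * per_core_kb + l3_mb * 1024


# Tentative L3 sizes from the text (2 MB and 4 MB); not confirmed figures
for l3_mb in (2, 4):
    total = total_cache_kb(l3_mb)
    print(f"L3 = {l3_mb} MB -> {total} KB total ({total / 1024:.1f} MB)")
```

The sum is raw capacity only; how much of it holds distinct data depends on the inclusion policy between cache levels, which the presentation did not detail.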
The Trinity technology implemented on the hardware level of the chip will be in charge of improved system security, implementation of virtualization and improved controllability.
Finally, the reduction of total cost of ownership (TCO) and extended capabilities of the client equipment, including those gained through virtualization, will be offered by the Raiden technology.
As an additional benefit of the AMD Opteron platform, the presenter cited the fact that the "life" of the current processor socket for server chips, the 1207-pin Socket F, is promised to last up to 2009, that is, until the time when AMD intends to implement an integrated memory controller with FB-DIMM support in Opteron chips.
Whose processors are faster?
The main idea of the part of the report devoted to proper testing sounded roughly like this: to get adequate results when comparing systems, running a single test is not enough; a series of tests is required. For instance, a synthetic test can be used to estimate the load on the memory or the I/O subsystem; nevertheless, such benchmarks are unable to emulate real applications. That is why AMD believes that measuring real performance calls for more focus on real applications. Besides, as Mr. Amato stressed, "artificial acceleration" for a specific architecture is not a rare occurrence in benchmarking suites.
Then the audience was shown a series of tests: first those in which server systems based on the Intel architecture take the lead, and then those in which systems based on the AMD architecture do. In so doing, Mr. Amato stressed that those benchmarking suites were built on real applications.
I am not going to argue with anyone here, but it seems to me that when a presentation puts the emphasis on directly contrasting one's own development with the competitor's solutions, it leaves the impression that specific benchmarking suites may have been pre-selected for a persuasive demonstration of superiority. By no means am I hinting at optimization, but statistics show there are always benchmarking suites where a specific architecture looks stronger and more confident. Those who are curious about the details may read the Third Quarter 2006 SPEC JBB2005 Results, but I'd better confine myself to showing some of the most expressive slides of the presentation on this subject. At the end of the article, you can find a few more links on that.
We also noted that while showing the peak performance of server systems, modern benchmarks do not show the most important part: the currently popular performance/watt indicator, that is, performance per unit of power consumed. Therefore, benchmarks are not the only indicator of performance. Moreover, tests that run fewer than four threads are not a suitable measure for dual-processor systems. Nevertheless, after the slogan "don't be misled by the results of SPECint and SPECjbb", we noticed that a comparison of AMD processors with Xeon Woodcrest series chips of equal clock speeds yields quite competitive results. Mr. Amato's conclusion was as follows: the TPI (True Performance Initiative) approach is good, but some random tests are impractical.
In terms of the Performance/Watt
To compare performance in terms of the power consumed, Mr. Amato showed a series of slides that set systems based on Intel Woodcrest CPUs against systems based on the AMD Opteron 2000 series. I'd like to stress that these were "systems", including all the accompanying chipset components, and not just CPUs taken alone. In that regard, we were reminded that the north bridge components are an integral part of the Opteron CPU architecture, which makes the total TDP of a server system based on AMD chips look preferable.
It was also promised that with the implementation of the new generation of AMD CPU architecture this advantage will become even more enhanced – of course if we don't mind the superiority of systems based on Intel Xeon processors at benchmarks dubbed as the Intel Sponsored Results. I bow to the lofty style of the marketing people at AMD.
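The system-level reasoning described above can be written down as a short sketch. Everything in it is an assumption made for illustration: the class, the field names, and above all the numbers, which are placeholders and not figures from the presentation or from any measurement.

```python
from dataclasses import dataclass


@dataclass
class ServerSystem:
    name: str
    benchmark_score: float  # any throughput-style score, higher is better
    cpu_watts: float        # combined power of all CPU sockets
    platform_watts: float   # chipset, memory and other board components

    @property
    def system_watts(self) -> float:
        """Power of the whole system, not of the CPUs alone."""
        return self.cpu_watts + self.platform_watts

    @property
    def perf_per_watt(self) -> float:
        return self.benchmark_score / self.system_watts


# Placeholder values only (hypothetical systems, not measured data)
a = ServerSystem("system A", benchmark_score=1000.0, cpu_watts=190.0, platform_watts=60.0)
b = ServerSystem("system B", benchmark_score=1100.0, cpu_watts=160.0, platform_watts=140.0)

for s in (a, b):
    print(f"{s.name}: {s.system_watts:.0f} W, {s.perf_per_watt:.2f} points/W")
```

With these placeholder inputs, the system with the higher raw score ends up with the lower performance per watt once platform power is counted, which is the kind of effect a CPU-only comparison would hide.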
In the end, summing it all up, systems based on AMD Opteron were described as "systems giving an optimum reduction in TCO through a strategy of consistent use of common software, steady migration to new generations of chips, energy-efficient architecture and use of DDR-2 memory...", etc. By the way, the slide showing a roadmap for the start of support for new memory types (in particular, AMD plans FB-DIMM support for 2008) clearly indicates that Intel is given the role of the locomotive clearing the way.
The virtualization technology (AMD-V)
The virtualization technology proposed by AMD was also presented in contrast to Intel's development. We can't say this helped us better digest the AMD version of the technology or judge the advantages of either one.
Among the advantages of the AMD-V technology mentioned (together with parallel criticism of the Intel architecture) were security provided by AMD's own hardware implementation of the Device Exclusion Vector (DEV), performance provided by the Direct Connect architecture, Tagged TLBs that reduce the load on the memory channel when a new virtual machine is launched, and Nested Page Tables, whose introduction in 2007 should make switching between virtual machines faster.
That's how it went. Please don't get me wrong: I am not at all opposed to nice, effective slides showing how a system from one camp smashes the competitor's system to pieces. I was just a bit shocked that during the evening I learned no less about Intel architectures than about AMD architectures. With time, journalists grow numb to materials presented as a contrast to the competitor's products, but this time it was really ... aggressive.
In the end, AMD has completed development of the Quad-Core Opteron processors. Quite soon, in the second quarter of 2007, the first 4-core AMD chips manufactured on the 65-nm process technology at Fab36 in Dresden will start coming off the assembly line. The new 4-core processors will be electrically compatible with the generation of 2-core 1207-pin Socket F AMD Opteron chips (bearing the 4-digit marking), which should keep costs to a minimum and protect investments.
Appendix
References: