WHILE MANY other high-end processors gained multiple buses, scalable SMP interconnects and so on a long time ago, alleviating the SMP scaling bottleneck somewhat, Intel still sticks to the old "single clogged FSB" approach, resulting in choppy SMP scaling even when using CPUs with humungous caches, like the 8 MB XeonMP or 9 MB Itanium 2. If there were no Opteron, Intel could probably continue like this forever, but, as we all know, Opteron is here and, despite AMD's sometimes amateurish business attitude, its superb scaling does steal quite a bit of high-end dough from Intel these days.
So, Intel will (in a month or so) start pushing the dual-FSB Blackford chipset for the Bensley platform and, later, Woodcrest processors, while the XeonMP successors will rely on the Caneland platform in 2007, with a monstrous four FSB connections! In both cases, of course, the multi-FSB approach will be matched by a correspondingly higher-bandwidth DRAM system to improve overall system throughput. And all that while (supposedly) Intel teams rack their brains on how to make the future CSI better than the current HyperTransport...
But hold on, my past experience makes this current Intel approach look very familiar - are they unconsciously following something that someone else already did in the last century? Well, let's look again at the Alphas, circa 1998, 20th Century. That year, a new Alpha processor, the 21264 (EV6), was announced. Besides a new, exceptionally fast 4-issue out-of-order engine, it also featured a new bus structure: instead of the shared 128-bit bus of the 21164, each CPU now had its own 64-bit DDR point-to-point path to the north bridge portion of the chipset. That point-to-point path is also known as the EV6 bus, used, as you can guess, in all the Athlon CPUs until Athlon64 came about. So, in the first 21264 generation, we had two chipsets, Tsunami (1998) and Typhoon (1999). Now, look at the diagrams and compare them with Blackford (2006) and Caneland (2007) - interesting?
Now, in 2001, before its untimely murder for corporate reshuffling gains, Alpha's 21364 (EV7) brought the "father of HyperTransport" - the scalable EV7 interconnect - to the world. And yes, those EV7s did scale better in SMP configurations than HT scales today. Simply put, every EV7, besides a local 12.8 GB/s memory path, had four links to the other CPUs (each running at 6.4 GB/s, the bandwidth of a whole typical quad-CPU Itanium 2 box!!) plus one dedicated 3.2 GB/s I/O link, all usable at the same time. Now, I guess CSI aims to do exactly something like this, using the 2001 Alpha approach to beat the 2005 HyperTransport in, say, 2009??
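To put those numbers side by side, here is a rough back-of-the-envelope sketch using only the figures quoted above. It assumes every EV7 path runs at its quoted peak simultaneously, and that a shared FSB is split evenly between the CPUs hanging off it; both are idealisations for illustration, and the function names are made up for this example.

```python
# Per-CPU EV7 bandwidth figures quoted in the text, in GB/s.
MEMORY_GBPS = 12.8  # local memory path
LINK_GBPS = 6.4     # each CPU-to-CPU link
NUM_LINKS = 4       # links to other CPUs
IO_GBPS = 3.2       # dedicated I/O link


def ev7_aggregate_gbps():
    """Sum of all per-CPU paths, assuming each runs at its quoted peak at once."""
    return round(MEMORY_GBPS + NUM_LINKS * LINK_GBPS + IO_GBPS, 1)


def shared_fsb_per_cpu_gbps(fsb_gbps, cpus):
    """Even split of one shared bus across CPUs (an idealised assumption)."""
    return round(fsb_gbps / cpus, 1)


if __name__ == "__main__":
    print("EV7 peak per CPU (GB/s):", ev7_aggregate_gbps())
    print("Quad box on one 6.4 GB/s FSB, per CPU (GB/s):",
          shared_fsb_per_cpu_gbps(6.4, 4))
```

On these assumptions a single EV7 has 41.6 GB/s of peak paths to itself, while each CPU in a quad box sharing one 6.4 GB/s bus gets 1.6 GB/s at best. Real-world throughput would of course be lower on both sides.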
As our Charlie mentioned, why not simply produce one single chip that attaches to a single point-to-point link? Then a single Pentium XE could work in a single-bus desktop, a dual-bus workstation and a quad-bus server. And yes, you could pay a premium for the flavours with, say, a larger cache or a faster FSB or more cores... whether you're a high-end server user or a total high-end games freak. The whole point is, why did they wait this long to figure out that the single shared FSB has to go away, especially on the high end?
The INQuirer