ibm pc — What was the first multiprocessor x86 motherboard?
Preface
The question is a bit unclear (*1) about where exactly the lines are drawn:
- Must it be a single motherboard or do separate assemblies qualify?
- Must it be PC-compatible or does any x86 system qualify?
- Must the board have been available separately (to the general public) or do complete assembled systems qualify?
So answers vary a lot depending on which criteria are applied, with dates ranging from 1979 all the way to 1995. Take your pick.
Three Quick Answers:
- If it’s just about being x86 and SMP, then the earliest 8086 multibus boards will qualify — even way before the PC.
- If it’s about PC compatibility, then the dual 80386 configured Compaq SystemPro will make it.
- In the most strict sense of the above criteria, then the motherboards found by Stephen Kitt are the answer you’re looking for.
The Long Read
The original 8086 was already capable of sharing the bus between multiple CPUs. This was a core feature, needed to operate configurations with an 8089 I/O processor as well as systems with multiple CPUs and multiple I/O processors. It was there from the very beginning. In fact, the very first systems Intel made for Multibus were used in SMP configurations.
Intel can’t really be blamed for IBM using the CPU in a somewhat crippling design.
Multiprocessor support on the IBM PC-based platform, as far as I know, required BIOS support and additional hardware support, and introduced things like the MPS specification and the APIC.
Not really. It just simplifies standard usage.
So which motherboard was the first to support multiple CPUs and SMP?
Well, ignoring pre/non-PC MP systems like Sequent’s 80386-based Symmetry systems (1987), the dual 80386 Compaq SystemPro of 1989 might be a good starting point for dual CPU and PC compatible. That was before the APIC and way before MPS. Though, I’m not really sure if it does fully fit the SMP category, as the second CPU acted as a dedicated I/O processor running driver code; it didn’t take any user tasks. So while memory access was common to both, processing was asymmetric.
Then again, according to a quite interesting benchmarking project (found by Stephen Kitt), Windows NT 3.1 does seem to recognize both CPUs, so more research might be necessary.
The very same machine could be ordered after 1990 with dual 80486 which also introduced APIC (*2). But APIC was restricted to Intel systems as its workings were patented — AMD and other compatible manufacturers introduced openPIC, but with limited success (*3). There was simply no real need for SMP when it came to machines that had to be 100% compatible, while at the same time multi-CPU servers were a special case anyway. This continued with the Pentium (*4). Fully PC-compatible systems stayed rather rare.
The restriction to a ‘motherboard’ makes it a bit problematic, as the SystemPro has the CPUs plus all arbitration logic on a sub-assembly. The motherboard itself is agnostic of the number of CPUs installed. So again, not sure if it counts.
It only took off with the Pentium Pro, after 1995, when Intel pushed dual-socket systems for high-end PCs. Almost every major manufacturer had one or more such systems built close to Intel’s specifications, including the separate voltage converter modules (*5). Compaq’s 273708-001 board may serve as a good example for a workstation: here both sockets and all logic are on the motherboard. On the server side there was a dual PPro card for the ProLiant 5500 Model 6/200 (type 285100-001); the machine could take two of them, making it a quad-CPU system, but that again would not be on the motherboard.
By the time of the Pentium II, AMD licensed APIC from Intel for their K7 design and the rest is history…
A notable mention should go here as well to Intel’s iPSC of 1985. Although not SMP, the system is, with up to 128 80286 CPUs, the first major example of using off-the-shelf x86 components to build a supercomputer, something standard in today’s high-end world. (These x86-based systems are not really PC compatible, even though they are built from standard x86 parts.)
*1 — It’s not always easy to set a history question from today’s POV, is it?
*2 — For the advent of multi-processor systems, Intel’s MPS of 1994 was way more important than the APIC hardware itself. Before that, everyone tried their own methods for communication.
*3 — Interestingly, IBM, Apple and Motorola implemented openPIC in the form of MPIC for PowerPC based systems, like various RS/6000 or Macs.
*4 — At this point Alan Cox might be helpful, as he, AFAIR, brought MPS for the Pentium into Linux.
*5 — One of the hassles when hunting for PPro systems is that these modules have often been taken out, which renders the machine useless. This is quite similar to 68k Macs, where the socketed system ROM was taken out by someone salvaging RAM modules.
Multi-Processor Server Solutions | Supermicro
Resources
Supermicro SYS-240P-TNRT 4P 3rd Gen Xeon Scalable Server Hands-on
By Patrick Kennedy —
Hot on the heels of today’s 3rd Generation Intel Xeon Scalable launch for the platform codenamed “Cooper Lake” or “Cedar Island” we have the Supermicro SYS-240P-TNRT. The SYS-240P-TNRT is a 4-socket server designed to leverage the new capabilities of Intel’s new Xeon platform.
Supermicro SuperMinute: X12 4-Way MP System
Supermicro launches a new high-performance X12 generation 4-socket SuperServer®, optimized for the new 3rd Gen Intel® Xeon® Scalable Processors and the Intel® Optane™ persistent memory 200 series, bringing outstanding performance to mission-critical workloads in enterprise, cloud-scale, and hybrid environments.
X12 4-Way MP Series Datasheet
Supermicro high-performance X12 generation 4-way SuperServer® supporting 3rd generation Intel® Xeon® Scalable processors for mission-critical workloads in enterprise, cloud-scale, and hybrid environments.
Mission Critical Server Solutions
Supermicro’s Comprehensive Server, Storage and Networking product lines Optimized for IT, Data Center, Embedded, HPC and Cloud Computing and Supporting 3rd Gen Intel Xeon Scalable Processors (Ice Lake)
Supermicro Multi Processor (MP) Validated Solutions for SAP HANA
Supermicro’s Multi Processor (MP) product line is a family of servers designed for the most intensive computing and in-memory workloads for today’s demanding real-time databases, data warehouses, CRM and ERP applications, and “Big Data feed into AI” workflows. The MP product family consists of rack-mounted solutions powered by 4 or 8 Intel Xeon Scalable Processors in a single-node architecture, with no latency-inducing cabling.
Supermicro SuperMinute: X11 4-Way MP System
Supermicro introduces its latest generation 2U rackmount 4-socket MP server. The MP line of SuperServers is designed to deliver the highest performance, resource savings, flexibility, scalability and serviceability to power mission-critical enterprise workloads like ERP, large in-memory databases, and real-time analytics by scaling up to the maximum processing and memory afforded in one node.
Supermicro SuperMinute: X11 8-Way MP System
Supermicro’s flagship Scale-Up server, SYS-7089P-TR4T is the most powerful, highest performing, highest memory footprint server designed for the latest generation of Intel® Xeon® Scalable processors.
90TB DWFT For Microsoft® SQL Server® 2016
Supermicro® and Microsoft® have developed the first 8-way DWFT Reference Architecture based on Supermicro’s 7U SYS-7088B-TR4FT 8-way system. It has achieved excellent scores in the DWFT benchmarks, and is certified to host a 90TB data warehouse instance.
Models
SKU | Generation | Form factor | Nodes | CPUs | GPUs | DIMMs | Drives | Network | Other listed attributes |
SYS-240P-TNRT | X12 | 2U | 1 | 4 | 2 | 48 | 24 | 4x 10G | Native, Optional, Optional, Redundant |
SYS-2049U-TR4 | X11 | 2U | 1 | 4 | | 48 | 24 | 4x 1G | Optional, Optional, Native, Redundant |
SYS-2049P-TN8R | X11 | 2U | 1 | 4 | | 48 | 10 | 1x 1G | Native, Native, Redundant |
SYS-7089P-TR4T | X11 | 7U | 1 | 8 | | 96 | 16 | 4x 10G | Optional, Native, Redundant |
SYS-8049U-E1CR4T | X11 | 4U | 1 | 4 | | 48 | 2 | 4x 10G | Optional, Optional, Native, Redundant |
SYS-2048U-RTR4 | X10 | 2U | 1 | 4 | | 48 | 24 | 4x 1G | Optional, Optional, Native, Redundant |
SYS-4048B-TRFT | X10 | 4U | 1 | 4 | | 96 | 24 | 2x 10G | Optional, Native, Redundant |
SYS-8028B-TR3F | X10 | 2U | 1 | 4 | | 32 | 6 | 2x 1G | Optional, Native, Redundant |
SYS-8028B-TR4F | X10 | 2U | 1 | 4 | | 32 | 6 | 2x 1G | Optional, Native, Redundant |
SYS-8048B-TR3F | X10 | 4U | 1 | 4 | | 32 | 5 | 2x 1G | Optional, Native, Redundant |
SYS-8048B-TR4F | X10 | 4U | 1 | 4 | | 32 | 5 | 2x 1G | Optional, Native, Redundant |
SYS-8048B-TRFT | X10 | 4U | 1 | 4 | | 96 | 24 | 2x 10G | Optional, Native, Redundant |
SYS-4048B-TR4FT | X10 | 4U | 1 | 4 | | 96 | 24 | 2x 10G | Optional, Native, Redundant |
SYS-7088B-TR4FT | X10 | 7U | 1 | 8 | | 192 | 12 | 4x 10G | Optional, Native, Redundant |
SYS-8048B-TR4FT | X10 | 4U | 1 | 4 | | 96 | 24 | 2x 10G | Optional, Native, Redundant |
SYS-8028B-C0R3FT | X10 | 2U | 1 | 4 | | 32 | 6 | 2x 10G | Native, Native, Redundant |
SYS-8028B-C0R4FT | X10 | 2U | 1 | 4 | | 32 | 6 | 2x 10G | Native, Native, Redundant |
SYS-8048B-C0R3FT | X10 | 4U | 1 | 4 | | 32 | 5 | 2x 10G | Native, Native, Redundant |
Dual-socket motherboards: do you need them in a gaming PC
Hardware has come a long way: a processor with six or eight cores no longer surprises any buyer. Not every parameter has grown, though: the vast majority of motherboards still have only one processor socket.
Nevertheless, two-socket boards are still on the market, and they are the subject of this article.
What are the advantages of a two-socket board?
— It is more powerful than a single-socket board: a board with two processors handles large data streams faster, so it suits high-powered workstations and PCs. You can also play games on such systems, in your free time. It provides more PCIe and memory bandwidth, and a PC with two processors has more processor cores.
— It has more connectors: these motherboards have additional PCIe and RAM slots, which lets you install more expansion cards and more RAM, up to terabytes.
— It is more reliable than a single-socket board: such boards are designed to withstand heavy loads, so their components are more stable and better protected.
And what are the disadvantages?
— A PC with a two-socket board is more expensive. First, such boards are more technologically advanced and rarer, and therefore cost more. Second, they are bought to hold two processors, which doubles the processor cost compared to a standard PC. Keep in mind that not all processors support dual-processor operation, and those that do are usually more expensive than ordinary ones.
— A PC with a dual-socket board draws more power, so it needs a more powerful power supply.
— A dual-socket PC requires more efficient cooling: with two processors, powerful cooling is needed for reliable operation. Insufficient cooling can take the whole system down, especially if temperatures stay elevated under long, heavy loads.
Do I need a dual processor motherboard for gaming?
It is perfectly possible to use two processors in your gaming build, but is such a build worth the investment? Not always.
The fact is that installing two processors radically increases the cost of a PC — you need to buy the two-socket board itself, two processors (tailored for a two-socket board), more efficient cooling and a more powerful power supply.
For this radical increase in cost, the gamer gets more processor cores. It is widely believed that the more cores a processor has, the better it performs in games. This is partly true: more cores do yield more FPS, even with the same video card, but only within limits. Beyond ten or so cores, the difference looks more like measurement error than a real trend.
Moreover, the increase in FPS is not comparable to the increase in cost: four processor cores are enough for modern AAA games, and additional cores are much less noticeable in use. The difference between 6 and 12 cores can be about ten percent.
Do you need a two-socket card for workloads?
All the advantages of a dual-processor motherboard matter for servers and advanced workstations. Here the potential of two processors is put to better use, because reliability, fault tolerance, high performance and processor speed are important for such systems.
What parameters should be taken into account when choosing?
— Number of slots for RAM. The more of them, the better.
— Socket. The range of processors available for installation will depend on it;
— The presence or absence of a video card. If the motherboard is used to assemble the server, an integrated video card is sufficient. But on boards for workstations there are no built-in video cards.
What board form factors are available?
Dual socket motherboards come in three different form factors:
— ATX: the standard motherboard size;
— EATX (Extended ATX): a larger board for servers and workstations. Its dimensions are 30.5 x 33.0 cm;
— SSI EEB (Server Standards Infrastructure Entry Electronics Bay): this form factor is mainly used for building servers and has dimensions of 30.5 x 33.0 cm. The main power connector has 24 + 8 pins.
What are the features?
Two-socket boards are tailored for servers and workstations.
What distinguishes one from the other is primarily operating system support. Desktop versions of operating systems install normally on a workstation motherboard, while server boards, as a rule, do not support desktop operating systems.
Disk support is another factor to take into account. Dual-socket motherboards have plenty of SATA connectors even in the standard configuration, but M.2 connectors may be in short supply.
The integrated video card will only be on server boards.
How to distinguish a board for a workstation from a server?
The easiest way to distinguish a server board from a board for a workstation is by the block with audio outputs. There are no sound cards on the server boards.
Which 2-socket board should I choose?
Among dual socket motherboards, you can choose from five models. All of them support DDR4 DIMMs.
Two models based on LGA 2011v3 socket and Intel C612 chipset:
— Supermicro X10DAL-i, which supports 8 memory modules with a total capacity of up to 1024 gigabytes. Made in the ATX form factor.
— Asus Z10PE-D16 WS, which supports 16 memory modules with a total capacity of up to 1024 gigabytes. Made in the EEB form factor.
There are three more models based on LGA 3647 socket and Intel C621 chipset:
— Asus WS C621E SAGE in the EEB form factor with support for 12 memory modules up to 768 gigabytes;
— Tyan Tempest HX S7100 also in the SSI EEB form factor with support for 12 memory modules up to 1536 gigabytes;
— Supermicro X11DAi-N in E-ATX form factor with support for 16 memory modules up to 2048 gigabytes.
Two processors are thought to give good results in games thanks to the large number of cores. However, this video shows what a higher core count actually delivers: FPS increases, but not in proportion to the cost of a PC with two multi-core processors.
DIY supercomputer / Sudo Null IT News
Today it is possible to build a supercomputer at home, and that is what this article is about.
The article looks at hardware approaches to building high-performance computing systems. One interesting application is cryptography. For example, thanks to modern technology, cracking MD5 or WPA has become available to anyone. With some effort you can even find on the Internet a way to break the A5/2 algorithm used in GSM (such information gets taken down quickly). Other applications are engineering, financial and medical calculations, and bitcoin mining.
A bit of history
The first written mention of supercomputers can be dated to March 1, 1920, when New York newspapers wrote about machines with the capacity of a hundred mathematicians. These were tabulators, electromechanical computers manufactured by IBM (then called CTR). Later, computers became electronic, and several players emerged in the supercomputer market, such as Cray, HP, IBM and NEC. These computers had vector processors (that is, they operated not on individual numbers but on vectors). Communication between computing nodes relied on the manufacturers’ proprietary technologies. One such technology connects processors in a three-dimensional torus topology. These words hide a very simple meaning: each node is connected to six others.
Further development of supercomputers gave rise to massively parallel systems and clusters. Clusters, as the quintessence of this direction, use roughly the same communication algorithms between computing nodes as supercomputers, only on top of network interfaces, which are the weak point of such systems. Besides a network topology that is non-standard compared to the classical star, such as Fat Tree, multidimensional torus or Dragonfly, they require special switching devices.
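To make the torus remark concrete, here is a minimal sketch of how neighbors can be enumerated in a torus network. The function name and the 4-node side length are illustrative assumptions, not details of any vendor's interconnect. Each node links to the two adjacent nodes along every dimension, with wrap-around at the edges, so a node in an n-dimensional torus has 2n links.

```python
def torus_neighbors(node, dims):
    """Return the neighbors of `node` in a torus with side lengths `dims`."""
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coords = list(node)
            # Step one position along this axis, wrapping around the edge.
            coords[axis] = (coords[axis] + step) % size
            neighbors.add(tuple(coords))
    # A side of length 1 would wrap a node onto itself; drop that case.
    neighbors.discard(tuple(node))
    return sorted(neighbors)


# Three dimensions give 2 * 3 = 6 links per node, four dimensions give 8.
print(len(torus_neighbors((0, 0, 0), (4, 4, 4))))        # 6
print(len(torus_neighbors((0, 0, 0, 0), (4, 4, 4, 4))))  # 8
```

The wrap-around is what distinguishes a torus from a plain grid: corner nodes have as many links as interior ones.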
Concerning the topic we have taken, it is impossible not to mention that today one of the promising directions in the development of supercomputers is the use of coprocessors in the standard computer architecture, which resemble video cards in architecture.
Processor selection
Today, the main manufacturers of processors are Intel and AMD. RISC processors such as the Power 7+, while attractive, are quite exotic and expensive. For example, even a server that is not the newest model costs more than a million.
(By the way, it is possible to assemble an inexpensive and efficient cluster out of Xbox 360 or PS3 consoles: their processors are about the same as Power, and a million buys more than one console.)
With that in mind, we will look at the options for building a high-performance system that are attractive in price. Of course, it must be multiprocessor. Intel offers Xeon processors for such tasks, while AMD offers Opteron processors.
If there is a lot of money
Separately, we would like to mention an extremely expensive but powerful line of Intel Xeon processors for the LGA1567 socket.
The top processor of this series is the E7-8870 with ten 2.4 GHz cores. Its price is $4616. For such CPUs, HP and Supermicro make eight-processor (!) server chassis. Eight 10-core Xeon E7-8870 2.4 GHz processors with HyperThreading handle 8*10*2=160 threads, which the Windows task manager displays as one hundred and sixty processor load graphs in a 10×16 matrix.
To fit eight processors in the case, they are not placed directly on the motherboard but on separate boards that plug into the motherboard. The photo shows four processor boards installed in the motherboard (two processors on each). This is the Supermicro solution. In the HP solution, each processor has its own board. The HP solution costs two to three million, depending on the processors, memory and other components installed. The chassis from Supermicro costs $10,000, which is more attractive. In addition, Supermicro can take four coprocessor expansion cards in PCI-Express x16 slots (with room left for an Infiniband adapter to build a cluster of these machines), while HP takes only two. Thus, Supermicro’s eight-processor platform is more attractive for building a supercomputer. The following photo from the exhibition shows a supercomputer assembly with four GPU boards.
However, this is very expensive.
Which is cheaper
But it is also possible to build a supercomputer on the more affordable AMD Opteron G34, Intel Xeon LGA2011 and LGA1366 processors. I excluded from the calculation processors with a frequency below 2 GHz and, for Intel, those with a bus slower than 6.4 GT/s.
Model | Number of cores | Frequency | Price, $ | Price/core, $ | Price/Core/GHz |
AMD | |||||
6386 SE | 16 | 2.8 | 1392 | 87 | 31 |
6380 | 16 | 2.5 | 1088 | 68 | 27 |
6378 | 16 | 2.4 | 867 | 54 | 23 |
6376 | 16 | 2.3 | 703 | 44 | 19 |
6348 | 12 | 2.8 | 575 | 48 | 17 |
6344 | 12 | 2.6 | 415 | 35 | 13 |
6328 | 8 | 3.2 | 575 | 72 | 22 |
6320 | 8 | 2.8 | 293 | 37 | 13 |
INTEL | |||||
E5-2690 | 8 | 2.9 | 2057 | 257 | 89 |
E5-2680 | 8 | 2.7 | 1723 | 215 | 80 |
E5-2670 | 8 | 2.6 | 1552 | 194 | 75 |
E5-2665 | 8 | 2.4 | 1440 | 180 | 75 |
E5-2660 | 8 | 2.2 | 1329 | 166 | 76 |
E5-2650 | 8 | 2 | 1107 | 138 | 69 |
E5-2687W | 8 | 3.1 | 1885 | 236 | 76 |
E5-4650L | 8 | 2.6 | 3616 | 452 | 174 |
E5-4650 | 8 | 2.7 | 3616 | 452 | 167 |
E5-4640 | 8 | 2.4 | 2725 | 341 | 142 |
E5-4617 | 6 | 2.9 | 1611 | 269 | 93 |
E5-4610 | 6 | 2.4 | 1219 | 203 | 85 |
E5-2640 | 6 | 2.5 | 885 | 148 | 59 |
E5-2630 | 6 | 2.3 | 612 | 102 | 44 |
E5-2667 | 6 | 2.9 | 1552 | 259 | 89 |
X5690 | 6 | 3.46 | 1663 | 277 | 80 |
X5680 | 6 | 3.33 | 1663 | 277 | 83 |
X5675 | 6 | 3.06 | 1440 | 240 | 78 |
X5670 | 6 | 2.93 | 1440 | 240 | 82 |
X5660 | 6 | 2.8 | 1219 | 203 | 73 |
X5650 | 6 | 2.66 | 996 | 166 | 62 |
E5-4607 | 6 | 2.2 | 885 | 148 | 67 |
X5687 | 4 | 3.6 | 1663 | 416 | 115 |
X5677 | 4 | 3.46 | 1663 | 416 | 120 |
X5672 | 4 | 3.2 | 1440 | 360 | 113 |
X5667 | 4 | 3.06 | 1440 | 360 | 118 |
E5-2643 | 4 | 3.3 | 885 | 221 | 67 |
The model with the minimum ratio is highlighted in bold italics, the most powerful AMD and, in my opinion, the closest in performance to Xeon, is underlined.
So my choice of supercomputer processors is Opteron 6386 SE, Opteron 6344, Xeon E5-2687W and Xeon E5-2630.
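The two derived columns of the table can be reproduced with a short calculation. This is a minimal sketch: the function name is illustrative, and rounding to whole dollars is an assumption that happens to match the table rows checked here.

```python
def price_metrics(price_usd, cores, ghz):
    """Return (price per core, price per core per GHz), rounded to whole dollars."""
    per_core = price_usd / cores
    per_core_ghz = per_core / ghz
    return round(per_core), round(per_core_ghz)


# The four processors chosen above: (price in $, cores, frequency in GHz).
chosen = {
    "Opteron 6386 SE": (1392, 16, 2.8),
    "Opteron 6344": (415, 12, 2.6),
    "Xeon E5-2687W": (1885, 8, 3.1),
    "Xeon E5-2630": (612, 6, 2.3),
}

for model, (price, cores, ghz) in chosen.items():
    per_core, per_core_ghz = price_metrics(price, cores, ghz)
    print(f"{model}: ${per_core}/core, ${per_core_ghz}/core/GHz")
```

Price per core per GHz is only a rough yardstick, since it ignores differences in per-clock performance between the two architectures.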
Motherboards
PICMG
Conventional motherboards cannot take more than four two-slot expansion cards. There is another architecture: backplanes such as the BPG8032 PCI Express Backplane.
Such a backplane holds PCI Express expansion cards and one processor board, somewhat similar to the boards installed in the eight-processor Supermicro-based servers discussed above, except that these processor boards comply with the PICMG industry standards. Standards evolve slowly, and such boards often do not support the latest processors. At most, such processor boards are currently made for two Xeon E5-2448L processors, for example the Trenton BXT7059SBC.
Such a system without a GPU will cost at least $5000.
Finished platforms TYAN
For about the same amount, you can buy a ready-made platform for building supercomputers, the TYAN FT72B7015. It takes up to eight GPUs and two LGA1366 Xeons.
«Regular» server motherboards
For LGA2011
Supermicro X9QR7-TF — 4 expansion cards and 4 processors can be installed on this motherboard.
Supermicro X9DRG-QF — This board is specially designed for building high-performance systems.
For Opteron
Supermicro H8QGL-6F — this card allows you to install four processors and three expansion cards
Platform reinforcement with expansion cards
This market is almost completely captured by Nvidia, which produces computing cards in addition to gaming video cards. AMD has a smaller market share, and Intel entered this market relatively recently.
Such coprocessors feature a large amount of on-board RAM, fast double-precision calculations and energy efficiency.
Model | FP32, Tflops | FP64, Tflops | Price, thousand $ | Memory, GB |
Nvidia Tesla K20X | 3.95 | 1.31 | 5.5 | 6 |
AMD FirePro S10000 | 5.91 | 1.48 | 3.6 | 6 |
Intel Xeon Phi 5110P | | 1 | 2.7 | 8 |
Nvidia GTX Titan | 4.5 | 1.3 | 1.1 | 6 |
Nvidia GTX 680 | 3 | 0.13 | 0.5 | |
AMD HD 7970 GHz Edition | 4 | 1 | 0.5 | 3 |
AMD HD 7990 Devil 13 | 2×3.7 | 2×0.92 | 1.6 | 2×3 |
The top solution from Nvidia is the Tesla K20X, based on the Kepler architecture. These are the cards used in Titan, the world’s most powerful supercomputer. Recently, however, Nvidia released the GeForce Titan graphics card. Older models had FP64 performance cut down to 1/24 of FP32 (GTX 680), but for the Titan the manufacturer promises fairly high performance in double-precision calculations. Solutions from AMD are also good, but they are built on a different architecture, which can make it difficult to run calculations optimized for CUDA (Nvidia’s technology).
The solution from Intel — Xeon Phi 5110P is interesting because all the cores in the coprocessor are made on the x86 architecture and no special code optimization is required to start the calculations. But my favorite coprocessor is the relatively inexpensive AMD HD 7970 GHz Edition. Theoretically, this video card will show the maximum performance per price.
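The performance-per-price claim can be checked against the table with a few lines of code. This is a rough sketch under two assumptions: that the price column is in thousands of dollars, and that the dual-GPU HD 7990 counts as 2 × 3.7 Tflops. The Xeon Phi is left out because the table gives no FP32 figure for it.

```python
# (FP32 Tflops, price in thousands of dollars), values taken from the table above.
cards = {
    "Nvidia Tesla K20X": (3.95, 5.5),
    "AMD FirePro S10000": (5.91, 3.6),
    "Nvidia GTX Titan": (4.5, 1.1),
    "Nvidia GTX 680": (3.0, 0.5),
    "AMD HD 7970 GHz Edition": (4.0, 0.5),
    "AMD HD 7990 Devil 13": (2 * 3.7, 1.6),
}


def tflops_per_thousand_usd(card):
    """Peak FP32 Tflops bought by each thousand dollars of card price."""
    tflops, price = cards[card]
    return tflops / price


ranking = sorted(cards, key=tflops_per_thousand_usd, reverse=True)
for card in ranking:
    print(f"{card}: {tflops_per_thousand_usd(card):.2f} Tflops per $1000")
```

These are theoretical peak numbers; real workloads may rank the cards differently.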
Clusterable
To improve system performance, several computers can be combined into a cluster, which distributes the computational load among its member computers.
Regular Gigabit Ethernet is too slow as the network interface between the computers. Infiniband is most often used for this purpose. An Infiniband host adapter is inexpensive relative to the server: on the international eBay auction such adapters sell from $40, and an X4 DDR adapter (20Gb/s), for example, will cost about $100 including delivery to Russia.
At the same time, switching equipment for Infiniband is quite expensive. And, as mentioned above, a classic star is not the best choice of network topology.
However, InfiniBand hosts can be connected directly to each other without a switch. This makes one option quite interesting: a cluster of two computers connected via Infiniband. Such a supercomputer can be assembled at home.
How many graphics cards do you need
In the most powerful supercomputer of our time, Cray Titan, the ratio of processors to “video cards” is 1:1: it has 18688 16-core processors and 18688 Tesla K20X cards.
In Tianhe-1A, a Chinese supercomputer based on Xeons, the ratio is two six-core processors per Nvidia M2050 card (which is weaker than the K20X).
We will take this ratio as optimal for our builds (because it is cheaper), that is, 12 to 16 processor cores per GPU. In the table below, the practically feasible options are marked in bold, and the ones I consider most successful are underlined.
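GPUs | Cores (12 per GPU) | Cores (16 per GPU) | 6-core CPUs (12/GPU) | 6-core CPUs (16/GPU) | 8-core CPUs (12/GPU) | 8-core CPUs (16/GPU) | 12-core CPUs (12/GPU) | 12-core CPUs (16/GPU) | 16-core CPUs (12/GPU) | 16-core CPUs (16/GPU) |
2 | 24 | 32 | 4 | 5 | 3 | 4 | 2 | 3 | 2 | 2 |
3 | 36 | 48 | 6 | 8 | 5 | 6 | 3 | 4 | 2 | 3 |
4 | 48 | 64 | 8 | 11 | 6 | 8 | 4 | 5 | 3 | 4 |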
If a system with an already established ratio of processors / video cards can take “on board” more computing devices, then we will add them to increase the build power.
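The CPU counts in the table follow from the 12 to 16 cores per GPU rule. The sketch below reproduces them; the function name is illustrative, and the rounding rule (to the nearest whole CPU, halves rounded up) is an assumption inferred from the table values rather than something the text states.

```python
import math


def cpus_needed(gpus, cores_per_cpu, cores_per_gpu):
    """Number of CPUs that supplies about `cores_per_gpu` cores for each GPU."""
    exact = gpus * cores_per_gpu / cores_per_cpu
    # Round to the nearest whole CPU, with halves rounded up (assumed rule).
    return math.floor(exact + 0.5)


for gpus in (2, 3, 4):
    row = []
    for cores_per_cpu in (6, 8, 12, 16):
        low = cpus_needed(gpus, cores_per_cpu, 12)
        high = cpus_needed(gpus, cores_per_cpu, 16)
        row.append(f"{cores_per_cpu}-core: {low}-{high}")
    print(f"{gpus} GPUs -> " + ", ".join(row))
```

For a real build the count is further limited by the number of sockets on the chosen motherboard.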
So, how much is
The options below are supercomputer builds priced without RAM, hard drives and software. All of them use the AMD HD 7970 GHz Edition video adapter. It can be replaced with another card if the task requires it (for example, a Xeon Phi). Where the system allows, one of the AMD HD 7970 GHz Edition cards has been replaced with a three-slot AMD HD 7990 Devil 13.
Theoretically, the performance will be about 12 Tflops.
Option 2 on TYAN S8232 motherboard, cluster
This board does not support the Opteron 63xx, so the 62xx is used. In this option, two computers are clustered via Infiniband x4 DDR with two cables. Theoretically, the connection speed in this case is limited by the speed of PCIe x8, that is, 32Gb/s. Two power supplies are used. Instructions for making them work together can be found on the Internet.
Component | Model | Quantity | Price, $ | Amount, $ |
Motherboard | TYAN S8232 | 1 | 790 | 790 |
Processor | AMD Opteron 6282SE | 2 | 1000 | 2000 |
CPU cooler | Noctua NH-U12DO A3 | 2 | 60 | 120 |
Case | Antec Twelve Hundred Black | 1 | 200 | 200 |
Power supply | FSP AURUM PRO 1200W | 2 | 200 | 400 |
Graphics accelerator | AMD HD 7970 GHz Edition | 2 | 500 | 1000 |
Graphics accelerator | AX7990 6GBD5-A2DHJ | 1 | 1000 | 1000 |
Infiniband adapter | X4 DDR Infiniband | 1 | 140 | 140 |
Infiniband cable | X4 DDR Infiniband | 1 | 30 | 30 |
Total (for one block) | | | | 5680 |
A cluster needs two such blocks, at a total cost of $11360. Its power consumption at full load will be about 3000 W. Theoretically, the performance will be up to 31 Tflops.
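The totals for this option can be verified by summing the bill of materials. This is a minimal sketch; the unit price of $500 for the HD 7970 GHz Edition is an assumption taken from the Option 3 table, and it is consistent with the stated total for one block.

```python
# One block of Option 2 as (component, quantity, unit price in $).
# The 500 for the HD 7970 GHz Edition is assumed from the Option 3 table.
OPTION_2_BLOCK = [
    ("TYAN S8232 motherboard", 1, 790),
    ("AMD Opteron 6282SE", 2, 1000),
    ("Noctua NH-U12DO A3 cooler", 2, 60),
    ("Antec Twelve Hundred Black case", 1, 200),
    ("FSP AURUM PRO 1200W power supply", 2, 200),
    ("AMD HD 7970 GHz Edition", 2, 500),
    ("AX7990 6GBD5-A2DHJ", 1, 1000),
    ("X4 DDR Infiniband adapter", 1, 140),
    ("X4 DDR Infiniband cable", 1, 30),
]


def block_total(items):
    """Sum quantity times unit price over all components."""
    return sum(quantity * price for _, quantity, price in items)


one_block = block_total(OPTION_2_BLOCK)
cluster = 2 * one_block  # the cluster consists of two identical blocks
print(one_block, cluster)  # 5680 11360
```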
Option 3 on Tyan platform FT72B7015
This option differs in that with eight GPUs there are only two CPUs. Accordingly, its performance in real tasks will depend on the ability of the program to be highly parallelized.
Component | Model | Quantity | Price, $ | Amount, $ |
Chassis (3000W) | Tyan FT72B7015 | 1 | 4900 | 4900 |
Processor | Xeon X5680 | 2 | 1300 | 2600 |
CPU cooler | SuperMicro SNK-P0040AP4 | 2 | 40 | 80 |
Graphics accelerator | AMD HD 7970 GHz Edition | 8 | 500 | 4000 |
Total | | | | 11580 |
Theoretically, the performance will be up to 32 Tflops.
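The theoretical figures quoted for Options 2 and 3 can be approximated by adding up the peak FP32 throughput of the graphics cards. This is a rough sketch under stated assumptions: the per-card values come from the coprocessor table earlier in the article, the AX7990 is treated as an HD 7990 Devil 13 (2 × 3.7 Tflops), and the CPUs' own contribution is ignored.

```python
HD_7970_GHZ_TFLOPS = 4.0          # FP32 peak from the coprocessor table
HD_7990_DEVIL13_TFLOPS = 2 * 3.7  # dual-GPU card, assumed for the AX7990


def peak_fp32(cards):
    """Sum peak FP32 Tflops over (count, Tflops per card) pairs."""
    return sum(count * tflops for count, tflops in cards)


# Option 2: two blocks, each with two HD 7970 GHz Edition cards and one HD 7990.
option_2_block = peak_fp32([(2, HD_7970_GHZ_TFLOPS), (1, HD_7990_DEVIL13_TFLOPS)])
option_2_cluster = 2 * option_2_block

# Option 3: eight HD 7970 GHz Edition cards in one chassis.
option_3 = peak_fp32([(8, HD_7970_GHZ_TFLOPS)])

print(f"Option 2 cluster: {option_2_cluster:.1f} Tflops")  # about 31
print(f"Option 3: {option_3:.1f} Tflops")                  # 32
```

Peak figures like these are an upper bound; sustained performance depends on how well the workload scales across the cards.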