Tag Archives: Supermicro

AMD Instinct MI100 : 11.5 TFLOPS In A Single Card!

AMD just announced the Instinct MI100 – the world’s fastest HPC GPU accelerator, delivering 11.5 TFLOPS in a single card!

 

AMD Instinct MI100 : 11.5 TFLOPS In A Single Card!

Powered by the new CDNA architecture, the AMD Instinct MI100 is the world’s fastest HPC GPU, and the first to break the 10 TFLOPS FP64 barrier!

Compared to the last-generation AMD accelerators, the AMD Instinct MI100 delivers almost 3.5X faster performance for HPC applications (FP32 matrix), and nearly a 7X boost in throughput for AI applications (FP16).

  • up to 11.5 TFLOPS of FP64 performance for HPC
  • up to 46.1 TFLOPS of FP32 Matrix performance for AI and machine learning
  • up to 184.6 TFLOPS of FP16 performance for AI training
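As a rough sanity check, those generational speed-up claims line up with the commonly cited peak figures for the previous-generation Radeon Instinct MI50 (13.3 TFLOPS FP32, 26.5 TFLOPS FP16). Treat the MI50 baselines here as assumptions on our part, not AMD's official comparison:

```python
# Rough sanity check of AMD's generational speed-up claims.
# MI100 figures come from the spec list above; the MI50 baseline
# figures (13.3 TFLOPS FP32, 26.5 TFLOPS FP16) are assumed.
mi100 = {"fp32_matrix": 46.1, "fp16": 184.6}  # TFLOPS
mi50 = {"fp32": 13.3, "fp16": 26.5}           # TFLOPS (assumed baseline)

fp32_speedup = mi100["fp32_matrix"] / mi50["fp32"]
fp16_speedup = mi100["fp16"] / mi50["fp16"]

print(f"FP32 matrix speed-up: {fp32_speedup:.2f}X")  # ~3.47X, i.e. "almost 3.5X"
print(f"FP16 speed-up:        {fp16_speedup:.2f}X")  # ~6.97X, i.e. "nearly 7X"
```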

2nd Gen AMD Infinity Fabric

It also leverages the 2nd Gen AMD Infinity Fabric technology to deliver twice the peer-to-peer I/O bandwidth of PCI Express 4.0. Thanks to its three Infinity Fabric links, it offers up to 340 GB/s of aggregate bandwidth per card.

In a server, MI100 GPUs can be configured as two fully-connected quad GPU hives, each providing up to 552 GB/s of P2P IO bandwidth.
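Assuming "fully-connected" means a full mesh between the four GPUs in each hive, the arithmetic behind that 552 GB/s figure works out like this (a sketch on our part, not an official AMD breakdown):

```python
from math import comb

# Sketch of the quad-GPU "hive" topology arithmetic, assuming a
# fully connected mesh between the four GPUs in each hive.
gpus_per_hive = 4
links = comb(gpus_per_hive, 2)       # full mesh: 6 point-to-point links
hive_p2p_bw = 552                    # GB/s aggregate, per the article
per_link_bw = hive_p2p_bw / links    # implied bandwidth per Infinity Fabric link

print(links, per_link_bw)            # 6 links at 92.0 GB/s each
```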

Ultra-Fast HBM2 Memory

The AMD Instinct MI100 comes with 32 GB of HBM2 memory that delivers up to 1.23 TB/s of memory bandwidth to support large datasets.
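For the curious, that 1.23 TB/s figure falls out of the memory interface width and memory clock listed in the specifications table, assuming a standard double-data-rate HBM2 interface:

```python
# Back-of-the-envelope peak bandwidth from the spec-table figures,
# assuming a double-data-rate interface (standard for HBM2).
bus_width_bits = 4096      # memory interface width
memory_clock_ghz = 1.2     # memory clock
transfers_per_clock = 2    # DDR: two transfers per clock cycle

bandwidth_gbps = bus_width_bits / 8 * memory_clock_ghz * transfers_per_clock
print(f"{bandwidth_gbps:.1f} GB/s")  # 1228.8 GB/s, i.e. ~1.23 TB/s
```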

PCI Express 4.0 Interface

The AMD Instinct MI100 supports PCI Express 4.0, allowing for up to 64 GB/s of peak bandwidth between CPU and GPU, when paired with 2nd Gen AMD EPYC processors.

AMD Instinct MI100 : Specifications

Specifications AMD Instinct MI100
Fab Process 7 nm
Compute Units 120
Stream Processors 7,680
Peak BFLOAT16 92.3 TFLOPS
Peak INT4 | INT8 184.6 TOPS
Peak FP16 184.6 TFLOPS
Peak FP32 46.1 TFLOPS
Peak FMA32 23.1 TFLOPS
Peak FP64 | FMA64 11.5 TFLOPS
Memory 32 GB HBM2
Memory Interface 4,096 bits
Memory Clock 1.2 GHz
Memory Bandwidth 1.2 TB/s
Reliability Full Chip ECC, RAS Support
Scalability 3 x Infinity Fabric Links
OS Support Linux 64-bit
Bus Interface PCIe Gen 3 / Gen 4
Board Form Factor Full Height, Dual Slot
Board Length 10.5 inches
Cooling Passively Cooled
Max Board Power 300 W TDP
Warranty 3-Years Limited

 

AMD Instinct MI100 : Availability

The AMD Instinct MI100 will be available in systems by the end of 2020 from OEM/ODM partners like Dell, Gigabyte, Hewlett Packard Enterprise (HPE), and Supermicro.

 


 

Support Tech ARP!

If you like our work, you can help support us by visiting our sponsors, participating in the Tech ARP Forums, or even donating to our fund. Any help you can render is greatly appreciated!


The First AMD Radeon Instinct Servers Revealed!

When AMD launched Radeon Instinct at the 2016 AMD Tech Summit in Sonoma earlier this month, they showed off several servers that will be powered by the new Radeon Instinct accelerators. These Radeon Instinct servers can deliver up to 3 petaflops (3,000 TFLOPS) of FP16 compute performance.

Most of the performance boost comes from the combination of the new Vega GPU architecture, which allows for 2X packed FP16 math ops; and the new AMD MIOpen deep learning library.

After the launch event, we were given the opportunity to look inside two of these servers – the Supermicro 1028GQ-TRT and the Inventec K888 G3. Both of these servers will ship with multiple Radeon Instinct MI25 Vega with NCU accelerators, allowing them to deliver up to 100 TFLOPS of FP16 compute performance.

We also had a look at the Falconwitch PS1816 server, which can host a whopping sixteen Radeon Instinct MI25 Vega with NCU accelerators to deliver 400 teraflops of FP16 compute performance!

 

The Supermicro 1028GQ-TRT

This is the server Ben Sander used to demonstrate the training capability of the Radeon Instinct MI25 accelerator in the 2016 AMD Tech Summit.

The Supermicro 1028GQ-TRT is a 1U server that fits up to 3 Radeon Instinct MI25 Vega with NCU accelerators. That allows it to deliver up to 75 teraflops of FP16 compute performance.

Multiple servers can be combined to increase compute performance. In his demo, Ben Sander used two of these Supermicro servers to obtain 150 teraflops of computing performance.

 

The Inventec K888 G3

The Inventec K888 G3 is a 2U, 2-processor server that fits up to 4 Radeon Instinct MI25 Vega with NCU accelerators. This allows it to deliver up to 100 teraflops of FP16 compute performance.

In this example, the Inventec K888 is powered by four FirePro S9300 X2 cards instead. Each of these FirePro S9300 X2 cards delivers slightly more FP16 compute performance than the Radeon Instinct MI25 Vega with NCU accelerator.

 

The Falconwitch PS1816

The Falconwitch PS1816 is a 2U, 24-bay server that boasts a total of 288 PCIe lanes. This allows it to support up to sixteen Radeon Instinct MI25 Vega with NCU accelerators to deliver 400 teraflops of FP16 compute performance.

If that’s not enough, there is an Inventec Radeon Instinct 42U rack that features six of these Falconwitch PS1816 servers, plus an additional four Radeon Instinct MI25 Vega with NCU accelerators. That is a total of 120 Radeon Instinct MI25 accelerators, delivering 3,000 teraflops, or 3 petaflops, of FP16 compute performance! This is, quite literally, the mother of all Radeon Instinct servers!
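The per-server and per-rack figures above are straightforward multiples of the MI25's roughly 25 TFLOPS of FP16 compute. Here is a quick sketch, assuming that round per-GPU figure:

```python
# FP16 arithmetic behind the server figures above, assuming ~25 TFLOPS
# of FP16 per Radeon Instinct MI25 (the figure implied throughout).
TFLOPS_PER_MI25 = 25

supermicro = 3 * TFLOPS_PER_MI25    # Supermicro 1028GQ-TRT: 3 GPUs
inventec = 4 * TFLOPS_PER_MI25      # Inventec K888 G3: 4 GPUs
falconwitch = 16 * TFLOPS_PER_MI25  # Falconwitch PS1816: 16 GPUs
rack = 120 * TFLOPS_PER_MI25        # Inventec 42U rack: 120 GPUs

print(supermicro, inventec, falconwitch, rack)  # 75 100 400 3000 (TFLOPS)
```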

 

Raja Koduri Introducing The First Radeon Instinct Servers

For those who missed our complete coverage of Radeon Instinct, here is the video of Radeon Technologies Group Senior Vice President and Chief Architect, Raja Koduri, introducing the first Radeon Instinct servers.

For more information on the Radeon Instinct accelerators, and MIOpen deep learning library, please take a look at our article – The Complete AMD Radeon Instinct Tech Briefing!

 
