AMD just announced the Instinct MI100 – the world’s fastest HPC GPU accelerator, delivering 11.5 TFLOPS in a single card!
AMD Instinct MI100 : 11.5 TFLOPS In A Single Card!
Powered by the new CDNA architecture, the AMD Instinct MI100 is the world’s fastest HPC GPU, and the first to break the 10 TFLOPS FP64 barrier!
Compared to the last-generation AMD accelerators, the AMD Instinct MI100 offers HPC applications almost 3.5X faster performance (FP32 matrix), and AI applications nearly 7X boost in throughput (FP16).
up to 11.5 TFLOPS of FP64 performance for HPC
up to 46.1 TFLOPS of FP32 Matrix performance for AI and machine learning
up to 184.6 TFLOPS of FP16 performance for AI training
2nd Gen AMD Infinity Fabric
It also leverages 2nd Gen AMD Infinity Fabric technology to deliver twice the peer-to-peer IO bandwidth of PCI Express 4.0. Thanks to its three Infinity Fabric links, it offers up to 340 GB/s of aggregate bandwidth per card.
In a server, MI100 GPUs can be configured as two fully-connected quad GPU hives, each providing up to 552 GB/s of P2P IO bandwidth.
Ultra-Fast HBM2 Memory
The AMD Instinct MI100 comes with 32 GB of HBM2 memory that delivers up to 1.23 TB/s of memory bandwidth to support large datasets.
PCI Express 4.0 Interface
The AMD Instinct MI100 supports PCI Express 4.0, allowing for up to 64 GB/s of peak bidirectional bandwidth between the CPU and GPU, when paired with 2nd Gen AMD EPYC processors.
In addition to the gaming-centric RDNA architecture, AMD just introduced a new CDNA architecture that is optimised for compute workloads.
Here are some key tech highlights of the new AMD CDNA architecture!
AMD CDNA Architecture : What Is It?
Unlike the fixed-function graphics accelerators of the past, GPUs are now fully-programmable accelerators using what’s called the GPGPU (General Purpose GPU) Architecture.
GPGPU allowed the industry to leverage the GPU’s tremendous processing power for machine learning and scientific computing purposes.
Instead of continuing with a single architecture for both graphics and compute, AMD has decided to introduce two architectures :
AMD RDNA : optimised for gaming to maximise frames per second
AMD CDNA : optimised for compute workloads to maximise FLOPS (floating-point operations per second).
Designed to accelerate compute workloads, AMD CDNA augments scalar and vector processing with new Matrix Core Engines, and adds Infinity Fabric technology for scale-up capability.
This allows the first CDNA-based accelerator – the AMD Instinct MI100 – to break the 10 TFLOPS (FP64) barrier.
The GPU is connected to its host processor using a PCI Express 4.0 interface, which delivers up to 32 GB/s of bandwidth in each direction.
AMD CDNA Architecture : Compute Units
The command processor and scheduling logic receives API-level commands and translates them into compute tasks.
These compute tasks are dispatched to the compute arrays by the four Asynchronous Compute Engines (ACEs), each of which maintains an independent stream of commands to the compute units.
The MI100’s 120 compute units (CUs) are derived from the earlier GCN architecture, and organised into four compute engines that execute wavefronts containing 64 work-items.
The CUs are, however, enhanced with new Matrix Core Engines that are optimised for matrix data processing.
Here is the block diagram of the AMD Instinct MI100 accelerator, showing how its main blocks are all tied together with the on-die Infinity Fabric.
Unlike the RDNA architecture, CDNA removes all of the fixed-function graphics hardware for tasks like rasterisation, tessellation, graphics caches, blending and even the display engine.
However, CDNA retains dedicated logic for HEVC, H.264 and VP9 decoding, which is sometimes used by compute workloads that operate on multimedia data.
The new Matrix Core Engines add a new family of wavefront-level instructions – the Matrix Fused Multiply-Add (MFMA). The MFMA instructions perform mixed-precision arithmetic and operate on KxN matrices using four different types of input data :
INT8 – 8-bit integers
FP16 – 16-bit half-precision
bf16 – 16-bit brain FP
FP32 – 32-bit single-precision
The new Matrix Core Engines have several advantages over the traditional vector pipelines in GCN :
the execution unit reduces the number of register file reads, since many input values are reused in a matrix multiplication
narrower datatypes create opportunities for workloads that do not require full FP32 precision, e.g. machine learning – saving energy.
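The effect of an MFMA instruction can be sketched in NumPy: lower-precision inputs (here FP16) are multiplied and accumulated into an FP32 result, mirroring the D = A × B + C operation the Matrix Core Engines perform. The 16x16 tile size below is purely illustrative – the real MFMA KxN shapes vary by datatype and are not stated in this article.

```python
import numpy as np

def mfma_sketch(a_fp16, b_fp16, c_fp32):
    """Mixed-precision fused multiply-add on matrix tiles:
    D = A x B + C, with FP16 inputs accumulated in FP32."""
    # Promote inputs to FP32 before multiplying, since the hardware
    # accumulates at higher precision than its inputs
    return np.matmul(a_fp16.astype(np.float32),
                     b_fp16.astype(np.float32)) + c_fp32

# Illustrative 16x16 tile (assumed shape, not the exact MFMA geometry)
rng = np.random.default_rng(0)
a = rng.standard_normal((16, 16)).astype(np.float16)
b = rng.standard_normal((16, 16)).astype(np.float16)
c = np.zeros((16, 16), dtype=np.float32)

d = mfma_sketch(a, b, c)
print(d.dtype, d.shape)
```

This is also why the register-file savings mentioned above arise: every element of A is reused across a whole row of outputs, so it only needs to be read once per tile.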
AMD CDNA Architecture : L2 Cache + Memory
Most scientific and machine learning data sets are gigabytes or even terabytes in size. Therefore, L2 cache and memory performance are critical.
In CDNA, the L2 cache is shared across the entire chip, and physically partitioned into multiple slices.
The MI100, specifically, has an 8 MB L2 cache that is 16-way set-associative and made up of 32 slices. Each slice can sustain 128 bytes per clock, for an aggregate bandwidth of over 6 TB/s across the GPU.
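The 6 TB/s figure follows directly from the slice count: 32 slices × 128 bytes per clock, multiplied by the GPU clock. The ~1.5 GHz clock used below is the MI100's published boost clock, an assumption on our part since this article does not state it.

```python
slices = 32
bytes_per_clock = 128      # per L2 slice
clock_hz = 1.502e9         # MI100 boost clock (assumed, ~1.5 GHz)

# Aggregate L2 bandwidth across all slices, in bytes per second
aggregate = slices * bytes_per_clock * clock_hz
print(f"{aggregate / 1e12:.2f} TB/s")   # just over 6 TB/s
```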
The CDNA memory controller can drive 4-high or 8-high stacks of HBM2 memory at 2.4 GT/s, for a maximum throughput of 1.23 TB/s.
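The 1.23 TB/s number is simply the transfer rate times the bus width. Assuming the standard HBM2 configuration of four stacks, each with a 1024-bit interface (an assumption – the stack count is not stated here, though it matches the 32 GB capacity):

```python
stacks = 4
bits_per_stack = 1024    # standard HBM2 interface width per stack (assumption)
transfer_rate = 2.4e9    # 2.4 GT/s per pin

# Total memory bandwidth in bytes per second
bandwidth = stacks * bits_per_stack * transfer_rate / 8
print(f"{bandwidth / 1e12:.2f} TB/s")   # ~1.23 TB/s
```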
The memory contents are also protected by hardware ECC.
AMD CDNA Architecture : Communication + Scaling
CDNA is also designed for scaling up, using the high-speed Infinity Fabric technology to connect multiple GPUs.
AMD Infinity Fabric links are 16 bits wide and operate at 23 GT/s, with three links in CDNA to allow for full connectivity in a quad-GPU configuration.
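These link figures line up with the bandwidth numbers quoted earlier. Each 16-bit link at 23 GT/s moves 46 GB/s per direction, or 92 GB/s bidirectional; a fully-connected quad hive contains six links, giving the 552 GB/s figure, and three links plus the 64 GB/s PCIe 4.0 interface account for the 340 GB/s per-card aggregate (the PCIe contribution is our reading of how AMD arrives at that number):

```python
link_width_bits = 16
transfer_rate = 23e9                                   # 23 GT/s

per_dir = link_width_bits * transfer_rate / 8 / 1e9    # 46 GB/s, one direction
per_link = per_dir * 2                                 # 92 GB/s, bidirectional

quad_hive = 6 * per_link      # 6 links fully connect 4 GPUs -> 552 GB/s
per_card = 3 * per_link + 64  # 3 IF links + PCIe 4.0 -> 340 GB/s (our reading)
print(per_link, quad_hive, per_card)
```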
While the last-generation Radeon Instinct MI50 GPU used only a ring topology, the new fully-connected Infinity Fabric topology boosts performance for common communication patterns like all-reduce and scatter / gather.
Unlike PCI Express, Infinity Fabric links support coherent GPU memory, which lets multiple GPUs share an address space and tightly work on a single task.
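All-reduce is the pattern that benefits most from full connectivity: every GPU ends up holding the sum of every GPU's buffer. A minimal single-process sketch of the operation itself, with each list standing in for one GPU's partial results (the topology and transport are abstracted away here):

```python
def all_reduce(buffers):
    """Sum the i-th element across all 'GPUs', then broadcast the
    totals back, so every buffer ends up identical."""
    totals = [sum(vals) for vals in zip(*buffers)]
    return [list(totals) for _ in buffers]

# Four GPUs in a quad hive, each holding a partial result
gpu_buffers = [[1, 2], [3, 4], [5, 6], [7, 8]]
reduced = all_reduce(gpu_buffers)
print(reduced)  # every GPU now holds [16, 20]
```

In a fully-connected hive, each exchange in this pattern is a single hop, whereas a ring topology forces data to traverse intermediate GPUs.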
Dell Technologies just introduced Dell EMC Ready solutions for both AI and virtualised HPC workloads on VMware vSphere 7!
Join us for the tech briefing on both new Dell EMC computing solutions for VMware, and find out how they can simplify your advanced computing needs!
Simplified Advanced Computing With Dell EMC Ready Solutions
Let’s start with the Dell Technologies briefing on the two new Dell EMC Ready solutions for both AI and virtualised HPC workloads.
Based on VMware Cloud Foundation, they are designed to make AI easier to deploy and consume, with new features from VMware vSphere 7, including Bitfusion.
Dell EMC Ready Solutions for AI : GPU-as-a-Service (GaaS)
GPUs in individual workstations or servers are often under-utilised at less than 15% of capacity. The new Dell EMC Ready Solutions for AI : GPU-as-a-Service fixes that and maximises your investment with virtual GPU pools.
The newest design includes the latest VMware vSphere 7 with Bitfusion, making it possible to virtualise GPUs on-premise. Factory-installed by Dell, VMware vSphere 7 with Bitfusion will let developers and data scientists pool IT resources and share them across datacenters.
Dell EMC Ready Solutions for AI : GPU-as-a-Service also uses the latest VMware Cloud Foundation with VMware vSphere 7 support for Kubernetes and containerised applications to run AI workloads anywhere. Containers make it easier to bring cloud-native applications into production, with the ability to move workloads.
Dell EMC Ready Solutions for Virtualised HPC
Most HPC workloads run on dedicated systems that require specialised skills to deploy and manage. Dell EMC Ready Solutions for Virtualised HPC can include VMware Cloud Foundation with VMware vSphere 7 featuring Bitfusion.
That should make it simpler and more economical to use VMware environments for HPC and AI applications in computational chemistry, bioinformatics and computer-aided engineering. IT teams can quickly provision hardware as needed and speed up initial deployment and configuration, saving time with simpler centralised management and security.
For very large HPC implementations, Dell EMC Ready Solutions for vHPC can include VMware vSphere Scale-Out Edition for additional cost savings.
Dell EMC OpenManage for Dell EMC Ready Solutions
The new Dell EMC Ready Solutions for AI and Virtualised HPC ship with the Dell EMC OpenManage systems management software, which helps administrators improve system uptime, keep data insights flowing and prepare for AI operations.
New Dell EMC OpenManage improvements include :
OpenManage Integration for VMware vCenter, supporting vSphere Lifecycle Manager, automates software, driver and firmware updates holistically to save time and simplify operations.
The enhanced OpenManage Mobile app gives administrators the ability to view power and thermal policies, perform emergency power reduction and monitor internal storage from anywhere in the world.
Leveraging their new Zen 2 microarchitecture and 7 nm process technology, AMD just introduced their 2nd Gen EPYC processors.
Designed to challenge Intel Xeon in the enterprise, cloud and HPC markets, the 2nd Gen EPYC processors promise to deliver “record-setting performance“, while reducing TCO (Total Cost of Ownership) by up to 50%.
Here is everything you need to know about the new 2nd Gen EPYC processors… summarised!
The Official 2nd Gen EPYC Product Presentation Summary
Let’s start with a quick 7.5 minute summary of the 2nd Gen EPYC product presentations by Dr. Lisa Su, Mark Papermaster and Forrest Norrod!
Now, let’s take a look at its key features and specifications!
AMD Infinity Architecture Explained
The AMD Infinity Architecture is a fancy name for their new modular chiplet-based design. It allows them to combine up to eight processor dies with a single I/O die on the same package, faster and at lower cost.
The processor dies are fabricated with the industry-leading 7 nm process technology for the best performance at the lowest power consumption and thermal output.
The I/O die, on the other hand, can be fabricated on the much cheaper 14 nm process technology, with a much higher yield.
2nd Gen EPYC Is Built On 7nm
The 2nd Gen EPYC processor cores are fabricated on the 7nm process technology. This allows AMD to fit more transistors into a smaller space.
By doubling the transistor density, coupled with microarchitectural optimisations, the 2nd Gen EPYC delivers 4X the floating point performance of the 1st Gen EPYC processors.
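One common way to account for that 4X figure: Rome doubled the core count per socket (32 to 64) and doubled the floating-point datapath width per core (128-bit to 256-bit in Zen 2), so peak FLOPS quadruple at the same clock. The model below is a rough sketch under those assumptions – the datapath widths are general Zen / Zen 2 knowledge, not stated in this article:

```python
def peak_flops(cores, simd_bits, clock_hz, fp_bits=64):
    """Rough peak FP64 FLOPS: lanes per pipe x 2 FMA pipes x 2 ops per FMA."""
    lanes = simd_bits // fp_bits
    return cores * lanes * 2 * 2 * clock_hz

clock = 2.25e9  # same nominal clock for both, to isolate the architecture change
gen1 = peak_flops(32, 128, clock)   # 1st Gen EPYC (Naples, assumed 128-bit FP)
gen2 = peak_flops(64, 256, clock)   # 2nd Gen EPYC (Rome, assumed 256-bit FP)
print(gen2 / gen1)  # 4.0
```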
The smaller process also increases energy efficiency, reducing both power consumption and heat output. According to AMD, the 2nd Gen EPYC will use half as much power as the 1st Gen EPYC at the same performance level.
AMD claims they will offer up to 90% better integer performance, and up to 79% better floating-point performance, than the competing Intel Xeon Platinum 8280 processor.
On top of significantly better performance per socket, they also come with hardware memory encryption, and a dedicated security processor.
Baked-In Security On Multiple Levels
The 2nd Gen EPYC processors come with multiple levels of built-in security features to harden them against cyberattacks.
They have a secure root of trust designed to validate the initial BIOS boot without corruption.
In virtualised environments, you can use it to cryptographically check that your entire software stack is booted without corruption.
They have memory encryption engines built into their memory channels to hardware-encrypt data in the memory, preventing cold boot attacks.
In the 2nd Gen EPYC, every virtual machine is now encrypted with one of up to 509 unique encryption keys known only to the processor.
This protects your data even if a malicious VM finds its way into your virtual machine memory, or if a compromised hypervisor gains access into a guest VM.
2nd Gen EPYC Is PCI Express Gen 4 Ready!
Like the 3rd Gen Ryzen processors, the 2nd Gen EPYC is PCI Express Gen 4 ready.
PCIe 4.0 doubles the bandwidth over PCIe 3.0, and every EPYC processor has 128 lanes to tie together HPC clusters, or connect to GPU accelerators and NVMe drives.
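Per lane, PCIe 4.0 runs at 16 GT/s with 128b/130b encoding, so an x16 link delivers roughly 32 GB/s in each direction – exactly double PCIe 3.0's 8 GT/s. A quick check of the arithmetic:

```python
def pcie_bw_gbps(lanes, gtps, encoding=128 / 130):
    """Peak one-direction bandwidth in GB/s for a PCIe link,
    after 128b/130b line-encoding overhead."""
    return lanes * gtps * encoding / 8

gen3 = pcie_bw_gbps(16, 8)    # PCIe 3.0 x16, per direction
gen4 = pcie_bw_gbps(16, 16)   # PCIe 4.0 x16, per direction
print(round(gen3, 2), round(gen4, 2), round(gen4 / gen3, 1))
```

(PCIe 3.0 also uses 128b/130b encoding, which is why the ratio comes out to exactly 2x.)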
2nd Gen EPYC Model, Specifications + Price Summary
For your convenience, we summarised the specifications and prices of the 2nd Gen EPYC models!
[ 2nd Gen EPYC model table : core / thread counts span 64 / 128 (3 models), 48 / 96 (2 models), 32 / 64 (4 models), 24 / 48 (3 models), 16 / 32 (3 models), 12 / 24 (1 model) and 8 / 16 (3 models) ]
2nd Gen EPYC Is Already Changing The Industry
AMD appears to have shipped the 2nd Gen EPYC processors early to Google, which deployed them in production servers for its internal datacenter infrastructure.
Google also plans to use the 2nd Gen EPYC processors in new general-purpose machines that are part of the Google Cloud Compute Engine.
Twitter has also announced that they are already using the 2nd Gen EPYC processors to reduce their datacenter TCO (total cost of ownership) by 25%.
In partnership with Intel, Dell Technologies announced the launch of five Dell AI Experience Zones across the APJ region!
Here is a quick primer on the new Dell AI Experience Zones, and what they mean for organisations in the APJ region!
The APJ Region – Ripe For Artificial Intelligence
According to the Dell Technologies Digital Transformation Index, Artificial Intelligence (AI) will be amongst the top spending priorities for business leaders in APJ.
Half of those surveyed plan to invest in AI in the next one to three years, as part of their digital transformation strategy. However, 95% of companies face a lack of in-house expertise in AI.
This is where the five new Dell AI Experience Zones come in…
The Dell AI Experience Zones
The new AI Experience Zones are designed to offer both customers and partners a comprehensive look at the latest AI technologies and solutions.
Built into the existing Dell Technologies Customer Solution Centres, they will showcase how the Dell EMC High-Performance Computing (HPC) and AI ecosystem can help them address business challenges and seize opportunities.
All five AI Experience Zones are equipped with technology demonstrations built around the latest Dell EMC PowerEdge servers. Powered by the latest Intel Xeon Scalable processors, they are paired with advanced, open-source AI software like OpenVINO, as well as Dell EMC networking and storage technologies.
Customers and partners who choose to leverage the new AI Experience Zones will receive help in kickstarting their AI initiatives, from design and AI expert engagements, to masterclass training, installation and maintenance.
“The timely adoption of AI will create new opportunities that will deliver concrete business advantages across all industries and business functions,” says Chris Kelly, vice president, Infrastructure Solutions Group, Dell Technologies, APJ.
“Companies looking to thrive in a data-driven era need to understand that investments in AI are no longer optional – they are business critical. Whilst complex in nature, it is imperative that companies quickly start moving from theoretical AI strategies to practical deployments to stay ahead of the curve.”
Dell AI Experience Zones In APJ
The five new AI Experience Zones that Dell Technologies and Intel announced are located within the Dell Technologies Customer Solution Centres in these cities :
NVIDIA CEO Jensen Huang (recently anointed Fortune’s 2017 Businessperson of the Year) made a surprise reveal at the NIPS conference – the NVIDIA TITAN V. This is the first desktop graphics card to be built on the latest NVIDIA Volta microarchitecture, and the first to use HBM2 memory.
In this article, we will share with you everything we know about the NVIDIA TITAN V, and how it compares against its TITANic predecessors. We will also share with you what we think could be a future NVIDIA TITAN Vp graphics card!
Updated @ 2017-12-10 : Added a section on gaming with the NVIDIA TITAN V.
Originally posted @ 2017-12-09
NVIDIA Volta isn’t exactly new. Back in GTC 2017, NVIDIA revealed NVIDIA Volta, the NVIDIA GV100 GPU and the first NVIDIA Volta-powered product – the NVIDIA Tesla V100. Jensen even highlighted the Tesla V100 in his Computex 2017 keynote, more than 6 months ago!
Yet there has been no desktop GPU built around NVIDIA Volta. NVIDIA continued to churn out new graphics cards built around the Pascal architecture – GeForce GTX 1080 Ti and GeForce GTX 1070 Ti. That changed with the NVIDIA TITAN V.
The NVIDIA GV100 is the first NVIDIA Volta-based GPU, and the largest they have ever built. Even using the latest 12 nm FFN (FinFET NVIDIA) process, it is still a massive chip at 815 mm²! Compare that to the GP100 (610 mm² @ 16 nm FinFET) and GK110 (552 mm² @ 28 nm).
That’s because the GV100 is built using a whopping 21.1 billion transistors. In addition to 5376 CUDA cores and 336 Texture Units, it boasts 672 Tensor cores and 6 MB of L2 cache. All those transistors require a whole lot more power – to the tune of 300 W.
The NVIDIA TITAN V
That’s V for Volta… not the Roman numeral V or V for Vendetta. Powered by the NVIDIA GV100 GPU, the TITAN V has 5120 CUDA cores, 320 Texture Units, 640 Tensor cores, and a 4.5 MB L2 cache. It is paired with 12 GB of HBM2 memory (3 x 4GB stacks) running at 850 MHz.
The blowout picture of the NVIDIA TITAN V reveals even more details :
It has 3 DisplayPorts and one HDMI port.
It has 6-pin + 8-pin PCIe power inputs.
It has 16 power phases, and what appears to be the Founders Edition copper heatsink and vapour chamber cooler, with a gold-coloured shroud.
There is no SLI connector, only what appears to be an NVLink connector.
Here are more pictures of the NVIDIA TITAN V, courtesy of NVIDIA.
Can You Game On The NVIDIA TITAN V? New!
Right after Jensen announced the TITAN V, the inevitable question was raised on the Internet – can it run Crysis / PUBG?
The NVIDIA TITAN V is the most powerful GPU for the desktop PC, but that does not mean you can actually use it to play games. NVIDIA notably did not mention anything about gaming, only that the TITAN V is “ideal for developers who want to use their PCs to do work in AI, deep learning and high performance computing.”
In fact, the TITAN V is not listed in their GeForce Gaming section. The most powerful graphics card in the GeForce Gaming section remains the TITAN Xp.
Then again, the TITAN V uses the same NVIDIA Game Ready Driver as GeForce gaming cards, starting with version 388.59. Even so, it is possible that some or many games may not run well or properly on the TITAN V.
Of course, all this is speculative in nature. All that remains to crack this mystery is for someone to buy the TITAN V and use it to play some games!
If you like our work, you can help support our work by visiting our sponsors, participating in the Tech ARP Forums, or even donating to our fund. Any help you can render is greatly appreciated!
The NVIDIA TITAN V Specification Comparison
Let’s take a look at the known specifications of the NVIDIA TITAN V, compared to the TITAN Xp (launched earlier this year), and the TITAN X (launched late last year). We also inserted the specifications of a hypothetical NVIDIA TITAN Vp, based on a full GV100.
Specifications | Future TITAN Vp? | NVIDIA TITAN V | NVIDIA TITAN Xp | NVIDIA TITAN X
Process Technology | 12 nm FinFET+ | 12 nm FinFET+ | 16 nm FinFET | 16 nm FinFET
L2 Cache Size | 6 MB | 4.5 MB | - | -
GPU Core Clock | - | - | - | -
GPU Boost Clock | - | - | - | -
Multi GPU Capability | NVLink | NVLink | - | -
The NVIDIA TITAN Vp?
In case you are wondering, the TITAN Vp does not exist. It is merely a hypothetical future model that we think NVIDIA may introduce mid-cycle, like the NVIDIA TITAN Xp.
Our TITAN Vp is based on the full capabilities of the NVIDIA GV100 GPU. That means it will have 5376 CUDA cores with 336 Texture Units, 672 Tensor cores and 6 MB of L2 cache. It will also have a higher TDP of 300 watts.
The Official NVIDIA TITAN V Press Release
December 9, 2017 — NVIDIA today introduced TITAN V, the world’s most powerful GPU for the PC, driven by the world’s most advanced GPU architecture, NVIDIA Volta.
Announced by NVIDIA founder and CEO Jensen Huang at the annual NIPS conference, TITAN V excels at computational processing for scientific simulation. Its 21.1 billion transistors deliver 110 teraflops of raw horsepower, 9x that of its predecessor, and extreme energy efficiency.
“Our vision for Volta was to push the outer limits of high performance computing and AI. We broke new ground with its new processor architecture, instructions, numerical formats, memory architecture and processor links,” said Huang. “With TITAN V, we are putting Volta into the hands of researchers and scientists all over the world. I can’t wait to see their breakthrough discoveries.”
NVIDIA Supercomputing GPU Architecture, Now for the PC
TITAN V’s Volta architecture features a major redesign of the streaming multiprocessor that is at the center of the GPU. It doubles the energy efficiency of the previous generation Pascal design, enabling dramatic boosts in performance in the same power envelope.
New Tensor Cores designed specifically for deep learning deliver up to 9x higher peak teraflops. With independent parallel integer and floating-point data paths, Volta is also much more efficient on workloads with a mix of computation and addressing calculations. Its new combined L1 data cache and shared memory unit significantly improve performance while also simplifying programming.
Fabricated on a new TSMC 12-nanometer FFN high-performance manufacturing process customised for NVIDIA, TITAN V also incorporates Volta’s highly tuned 12GB HBM2 memory subsystem for advanced memory bandwidth utilisation.
Free AI Software on NVIDIA GPU Cloud
TITAN V’s incredible power is ideal for developers who want to use their PCs to do work in AI, deep learning and high performance computing.
Users of TITAN V can gain immediate access to the latest GPU-optimised AI, deep learning and HPC software by signing up at no charge for an NVIDIA GPU Cloud account. This container registry includes NVIDIA-optimised deep learning frameworks, third-party managed HPC applications, NVIDIA HPC visualisation tools and the NVIDIA TensorRT inferencing optimiser.
June 16, 2017 — NVIDIA is among six technology companies to receive funding from the U.S. Department of Energy’s Exascale Computing Project (ECP) to accelerate the development of next-generation supercomputers.
The Exascale Computing Project
The ECP mission is to facilitate the delivery of at least two exascale computing systems, with an aim to deliver at least one by 2021. Such systems would be approximately 50x more powerful than Titan, the nation’s fastest supercomputer in use today, located at Oak Ridge National Laboratory.
The goal of the ECP PathForward programme is to find solutions that maximise the energy efficiency and overall performance of future large-scale supercomputers critical to areas such as national security, manufacturing, industrial competitiveness, and energy research.
In addition to performance, the DOE has ambitious goals for improving power efficiency, to achieve exascale performance using only 20-30 megawatts. By comparison, an exascale system built with CPUs alone could consume hundreds of megawatts.
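Those DOE targets translate into a steep efficiency requirement: one exaflop in a 20 MW envelope works out to 50 GFLOPS per watt. And the "approximately 50x" comparison against Titan checks out if measured against Titan's sustained LINPACK performance of about 17.6 petaflops (our assumed reference point):

```python
exaflop = 1e18
titan_sustained = 17.59e15    # Titan's LINPACK Rmax (assumed reference point)

# How many Titans fit in one exaflop
print(round(exaflop / titan_sustained))       # ~57x, i.e. roughly 50x

# Efficiency needed to hit an exaflop at the low end of the 20-30 MW goal
power_budget_w = 20e6
print(exaflop / power_budget_w / 1e9)         # 50.0 GFLOPS per watt
```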
NVIDIA In The Exascale Computing Project
NVIDIA has been researching and developing faster, more efficient GPUs for high performance computing for more than a decade. This is its sixth DOE research and development subcontract, which will help accelerate its efforts to develop highly efficient throughput computing technologies to ensure U.S. leadership in HPC.
NVIDIA’s R&D will focus on critical areas including energy-efficient GPU architectures and resilience. Its findings may be incorporated into future generation GPU architectures after Volta (which will be used in the DOE’s upcoming flagship Summit and Sierra supercomputers, scheduled to go online in 2018).
The DOE has placed a high priority on supercomputer research. Its PathForward technical requirements state, “The U.S. faces serious and urgent economic, environmental, and national security challenges based on energy, climate, and growing security threats. High performance computing is a requirement for addressing such challenges, and the need for the development of capable exascale computers has become critical for solving these problems.”
To facilitate and test its technology, NVIDIA research teams will collaborate closely with six national DOE laboratories: Argonne National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Oak Ridge National Laboratory, and Sandia National Laboratories.
KUALA LUMPUR, April 12, 2017 – Mellanox Technologies, Ltd. (NASDAQ: MLNX) today unveiled its expansion plans for Malaysia. The announcement, which is in line with the country’s ambitions of becoming the leading Big Data Analytics (BDA) solutions hub in South East Asia, reiterated Mellanox’s commitment to Malaysia through its strategic investment roadmap.
Mellanox Technologies Expands Presence In Malaysia
“Malaysia’s investment in Big Data, data centers and the Cloud is impressive,” said Charlie Foo, Vice President and General Manager, Asia Pacific Japan, Mellanox Technologies. “With a year-over-year growth of more than 20 percent in the last five years, the field of digital data management is maturing rapidly. Mellanox’s investment in Malaysia looks to complement Malaysia’s advancing digital economy by providing intelligent 10, 25, 40, 50 and 100Gb/s interconnect solutions that serve today’s and future needs in Malaysia. This will enable organizations to be less concerned about today’s technological demands while concentrating on running their business, resulting in unparalleled operating efficiency for these organizations.”
Mellanox’s investment into Malaysia’s digital economy comes at a time when the country is ramping up its efforts to see its ICT roadmap to fruition. The country’s ICT custodian, Malaysia Digital Economy Corporation (MDEC), noted that MSC Malaysia — a national initiative designed to attract world-class technology companies to the country — reported U.S. $3.88 billion in export sales in 2015, representing an 18 percent increase over 2014.
Today, the MSC Malaysia footprint has expanded to include 42 locations across the country, hosting more than 3,800 companies from more than 40 countries and employing more than 150,000 high-income knowledge workers, 85 percent of whom are Malaysians. This has propelled Malaysia to a top-three ranking in AT Kearney’s Global Services Location Index since 2005, with only China and India ahead.
Mellanox’s Open Ethernet switch family delivers the highest performance and port density with a complete chassis and fabric management solution, enabling converged data centers to operate at any scale while reducing operational costs and infrastructure complexity.
Mellanox InfiniBand solutions have already been chosen to accelerate large High Performance Computing (HPC) customers in Malaysia. HPC customers use super computers and parallel processing techniques for solving complex computational problems and performing research activities through computer modeling, simulation and analysis. These HPC customers span various industries including education, bioscience, governments, finance, media and entertainment, oil and gas, pharmaceutical and manufacturing.
The company is actively seeking partnerships and collaboration opportunities to support customers from different industries, primarily within Big Data, data centers and the Cloud.
Kuala Lumpur, 4 July 2016 – Dell has announced advancements to its high performance computing (HPC) portfolio, including the availability of new Dell HPC Systems and technology partner collaborations for early access to innovative HPC technologies.
“While traditional HPC has been critical to research programs that enable scientific and societal advancement, Dell is mainstreaming these capabilities to support enterprises of all sizes as they seek a competitive advantage in an ever increasing digital world,” said William Tan, head of Enterprise Solutions, Dell Malaysia. “As a clear leader in HPC, Dell now offers customers highly flexible, precision built HPC systems for multiple vertical industries based upon years of experience powering the world’s most advanced academic and research institutions. With Dell HPC Systems, our customers can deploy HPC systems more quickly and cost effectively and accelerate their speed of innovation to deliver both breakthroughs and business results.”
Dell HPC Systems Portfolio Simplifies Powerful, Traditional HPC System for Enterprises of All Sizes
Available in Malaysia and globally, the Dell HPC Systems portfolio is a family of HPC and data analytics solutions that combine the flexibility of customised HPC systems with the speed, simplicity and reliability of pre-configured systems. Dell engineers and domain experts designed and tuned the new systems for specific science, manufacturing and analytics workloads with fully tested and validated building block systems, backed by a single point of hardware support and additional service options across the solution lifecycle.
With simplified configuration and ordering, organisations can more quickly select and deploy updated Dell HPC Systems at any scale today. As an Intel Scalable System Framework configuration, these systems, available today, include the latest Intel Xeon processor families, support for Intel Omni-Path Architecture (Intel OPA) fabric, and software in the Dell HPC Lustre Storage and Dell HPC NFS Storage solutions:
Dell HPC System for Life Sciences – Designed to meet the needs of life sciences organisations, this enables bioinformatics and genomics centers to deliver results and identify treatments in clinically relevant timeframes while maintaining compliance and protecting confidential data.
Dell HPC System for Manufacturing – Enables manufacturing and engineering customers to run complex design simulations, including structural analysis and computational fluid dynamics.
Dell HPC System for Research – Enables research centers to quickly develop HPC systems that match the unique needs of a wide variety of workloads, involving complex scientific analysis.
Dell Leads HPC Technology Advancements with Industry Partners to Help Accelerate Customer Innovation Cycles
Dell has instituted a customer early access program for early development and testing in preparation for Dell’s next server offering in the HPC solutions portfolio, the Dell PowerEdge C6320p server, which will be available in the second half of 2016 with the Intel Xeon Phi processor (formerly code-named Knights Landing). The PowerEdge C6320p’s unique server engineering and design will enable customers to:
Gain insights faster with a modular building block design, engineered to deliver faster insights for data-intensive computations and scale-up parallel processing.
Accelerate performance in dense and highly parallel HPC environments with 72 cores that are specifically optimised for parallel computing.
Simplify and automate systems management with the integrated Dell Remote Access Controller 8 (iDRAC8) with Lifecycle Controller. Customers can deploy, monitor and update PowerEdge C6320p servers faster and ensure higher levels of service and availability.
The Texas Advanced Computing Center (TACC) at The University of Texas at Austin has partnered with Dell and Intel to deploy an upgrade to its Stampede supercomputing cluster with Intel Xeon Phi processors and Intel OPA via Dell’s early access program.
Stampede, one of the main clusters for the Extreme Science and Engineering Discovery Environment (XSEDE), is a multi-use cyberinfrastructure resource offering large memory, large data transfer, and GPU capabilities for data-intensive, accelerated or visualisation computing for thousands of projects ranging from cancer cure research to severe weather modeling.
This month, the U.S. National Science Foundation awarded US$30 million to TACC to acquire and deploy Stampede 2 as a strategic national resource to provide HPC capabilities for thousands of researchers in the U.S. The new Dell HPC System is expected to deliver a peak performance of up to 18 petaflops, more than twice the system performance of the current Stampede system. Three and a half years since its installation, Stampede ranks as the 12th most powerful supercomputer in the world, according to the June 2016 TOP500 list.
Additionally, Dell continues to bring HPC capabilities to mainstream enterprises through a series of evolving solutions and services designed to deliver a range of HPC-as-a-Service capabilities, giving HPC sites a choice of local or remote management services with deployment on-premise, off-premise or a hybrid of the two.