Tag Archives: Supercomputer

How NVIDIA A800 Bypasses US Chip Ban On China!

Find out how NVIDIA created the new A800 GPU to bypass the US ban on sale of advanced chips to China!

 

NVIDIA Offers A800 GPU To Bypass US Ban On China!

Two months after it was banned by the US government from selling high-performance AI chips to China, NVIDIA introduced a new A800 GPU designed to bypass those restrictions.

The new NVIDIA A800 is based on the same Ampere microarchitecture as the A100, which was used as the performance baseline by the US government.

Despite its numerically larger model number (the lucky number 8 was probably picked to appeal to Chinese customers), the A800 is a detuned part, with slightly reduced performance to meet export control limitations.

The NVIDIA A800 GPU, which went into production in Q3 2022, is an alternative product to the NVIDIA A100 GPU for customers in China.

The A800 meets the U.S. government’s clear test for reduced export control and cannot be programmed to exceed it.

NVIDIA is probably hoping that the slightly slower NVIDIA A800 GPU will allow it to continue supplying China with A100-level chips that are used to power supercomputers and high-performance datacenters for artificial intelligence applications.

As I will show you in the next section, there won’t be a truly significant performance difference between the A800 and the A100, except in very high-end applications. So NVIDIA customers who want or need the A100 should have no issue opting for the A800 instead.

However, this can only be a stopgap fix, as NVIDIA is stuck selling detuned A100-level chips to China unless and until the US government changes its mind.

Read more : AMD, NVIDIA Banned From Selling AI Chips To China!

 

How Fast Is The NVIDIA A800 GPU?

The US government considers the NVIDIA A100 as the performance baseline for its export control restrictions on China.

Any chip equal to or faster than that Ampere-based chip, which launched on May 14, 2020, cannot be sold or exported to China. But as they say, the devil is in the details.

The US government didn’t specify just how much slower chips must be to qualify for export to China. So NVIDIA could technically comply by slightly detuning the A100, while offering almost the same level of performance.

And that was what NVIDIA did with the A800 – it is basically the A100 with a 33% slower NVLink interconnect (400 GB/s, down from 600 GB/s). NVIDIA also limited the maximum number of GPUs supported in a single server to eight.

That only slightly reduces the performance of A800 servers compared to A100 servers, while offering the same raw GPU compute performance. Most users will not notice the difference.

The only significant impediment is at the very high end – Chinese companies are now restricted to a maximum of eight GPUs per server, instead of up to sixteen.
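To see why most workloads won’t care, here is a back-of-the-envelope sketch in Python. Only the 600 GB/s and 400 GB/s NVLink figures come from the specifications below; the gradient size and compute time are hypothetical placeholders, so treat this as an illustration of the principle, not a benchmark.

```python
# Back-of-the-envelope: how much does cutting NVLink from 600 GB/s
# to 400 GB/s slow down a training step that overlaps compute with
# gradient synchronisation? Workload numbers are hypothetical.

A100_NVLINK_GBPS = 600   # from the A100 specifications below
A800_NVLINK_GBPS = 400   # from the A800 specifications below

GRADIENT_GB = 2.0        # hypothetical gradient payload per step
COMPUTE_SEC = 0.050      # hypothetical pure-compute time per step

def step_time(nvlink_gbps: float) -> float:
    """Crude model: a step takes max(compute, communication)
    when gradient sync is fully overlapped with compute."""
    comm_sec = GRADIENT_GB / nvlink_gbps
    return max(COMPUTE_SEC, comm_sec)

a100 = step_time(A100_NVLINK_GBPS)
a800 = step_time(A800_NVLINK_GBPS)

print(f"A100 step: {a100 * 1000:.1f} ms, A800 step: {a800 * 1000:.1f} ms")
print(f"Slowdown: {(a800 / a100 - 1) * 100:.1f}%")
```

With these placeholder numbers, both steps are compute-bound (50 ms of compute versus 3–5 ms of communication), so the slowdown prints as 0.0%. The 33% bandwidth cut only bites when communication dominates the step time – which is exactly the very high-end, many-GPU scenario described above.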

To show you what I mean, I dug into the A800 specifications, and compared them to the A100 below:

NVIDIA A100 vs A800 : 80GB PCIe Version

Specifications | A100 80GB PCIe | A800 80GB PCIe
FP64 | 9.7 TFLOPS | 9.7 TFLOPS
FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS
FP32 | 19.5 TFLOPS | 19.5 TFLOPS
Tensor Float 32 | 156 TFLOPS | 156 TFLOPS
BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS
FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS
INT8 Tensor Core | 624 TOPS | 624 TOPS
GPU Memory | 80 GB HBM2e | 80 GB HBM2e
GPU Memory Bandwidth | 1,935 GB/s | 1,935 GB/s
TDP | 300 W | 300 W
Multi-Instance GPU | Up to 7 MIGs @ 10 GB | Up to 7 MIGs @ 10 GB
Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s
Server Options | 1-8 GPUs | 1-8 GPUs

NVIDIA A100 vs A800 : 80GB SXM Version

Specifications | A100 80GB SXM | A800 80GB SXM
FP64 | 9.7 TFLOPS | 9.7 TFLOPS
FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS
FP32 | 19.5 TFLOPS | 19.5 TFLOPS
Tensor Float 32 | 156 TFLOPS | 156 TFLOPS
BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS
FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS
INT8 Tensor Core | 624 TOPS | 624 TOPS
GPU Memory | 80 GB HBM2e | 80 GB HBM2e
GPU Memory Bandwidth | 2,039 GB/s | 2,039 GB/s
TDP | 400 W | 400 W
Multi-Instance GPU | Up to 7 MIGs @ 10 GB | Up to 7 MIGs @ 10 GB
Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s
Server Options | 4 / 8 / 16 GPUs | 4 / 8 GPUs

NVIDIA A100 vs A800 : 40GB PCIe Version

Specifications | A100 40GB PCIe | A800 40GB PCIe
FP64 | 9.7 TFLOPS | 9.7 TFLOPS
FP64 Tensor Core | 19.5 TFLOPS | 19.5 TFLOPS
FP32 | 19.5 TFLOPS | 19.5 TFLOPS
Tensor Float 32 | 156 TFLOPS | 156 TFLOPS
BFLOAT16 Tensor Core | 312 TFLOPS | 312 TFLOPS
FP16 Tensor Core | 312 TFLOPS | 312 TFLOPS
INT8 Tensor Core | 624 TOPS | 624 TOPS
GPU Memory | 40 GB HBM2 | 40 GB HBM2
GPU Memory Bandwidth | 1,555 GB/s | 1,555 GB/s
TDP | 250 W | 250 W
Multi-Instance GPU | Up to 7 MIGs @ 5 GB | Up to 7 MIGs @ 5 GB
Interconnect | NVLink : 600 GB/s, PCIe Gen4 : 64 GB/s | NVLink : 400 GB/s, PCIe Gen4 : 64 GB/s
Server Options | 1-8 GPUs | 1-8 GPUs

 

Please Support My Work!

Support my work through a bank transfer / PayPal / credit card!

Name : Adrian Wong
Bank Transfer : CIMB 7064555917 (Swift Code : CIBBMYKL)
Credit Card / Paypal : https://paypal.me/techarp

Dr. Adrian Wong has been writing about tech and science since 1997, even publishing a book with Prentice Hall called Breaking Through The BIOS Barrier (ISBN 978-0131455368) while in medical school.

He continues to devote countless hours every day writing about tech, medicine and science, in his pursuit of facts in a post-truth world.

 


 


AMD EPYC : Four Supercomputers In Top 50, Ten In Top 500!

AMD is on a roll, announcing more supercomputing wins for their 2nd Gen EPYC processors, including four supercomputers in the top 50, and ten in the top 500!

 

2nd Gen AMD EPYC : A Quick Primer

The 2nd Gen AMD EPYC family of server processors is based on the AMD Zen 2 microarchitecture and fabricated on the latest 7 nm process technology.

According to AMD, they offer up to 90% better integer performance and up to 79% better floating-point performance than the competing Intel Xeon Platinum 8280 processor.

Here is a quick 7.5 minute summary of the 2nd Gen EPYC product presentations by Dr. Lisa Su, Mark Papermaster and Forrest Norrod!

 

AMD EPYC : Four Supercomputers In Top 50, Ten In Top 500!

Thanks to the greatly improved performance of its 2nd Gen EPYC processors, AMD now powers four supercomputers in the top 50 list :

Top 50 Rank | Supercomputer | System | Processor
7 | Selene | NVIDIA DGX A100 SuperPOD | AMD EPYC 7742
30 | Belenos | Atos BullSequana XH2000 | AMD EPYC 7H12
34 | Joliot-Curie | Atos BullSequana XH2000 | AMD EPYC 7H12
48 | Mahti | Atos BullSequana XH2000 | AMD EPYC 7H12

On top of those four, another six AMD EPYC-powered supercomputers appear in the Top 500 ranking.

In addition to powering supercomputers, AMD EPYC 7742 processors will soon power Gigabyte servers selected by CERN to handle data from their Large Hadron Collider (LHC).

 

3rd Gen AMD EPYC Supercomputers

AMD also announced that two universities will deploy Dell EMC PowerEdge servers powered by the upcoming 3rd Gen AMD EPYC processors.

Indiana University

Indiana University will deploy Jetstream 2 – an eight-petaflop distributed cloud computing system, powered by the upcoming 3rd Gen AMD EPYC processors.

Jetstream 2 will be used by researchers in a variety of fields like AI, social sciences and COVID-19 research.

Purdue University

Purdue University will deploy Anvil – a supercomputer powered by the upcoming 3rd Gen AMD EPYC processors, for use in a wide range of computational and data-intensive research.

AMD EPYC will also power Purdue University’s community cluster “Bell”, scheduled for deployment in the fall.

 



AMD Datacenter Leadership In 2020 & Beyond!

AMD Senior VP and General Manager Forrest Norrod just shared AMD’s datacenter leadership with EPYC and Radeon Instinct, and AMD’s datacenter roadmap beyond 2020!

 

Forrest Norrod : Senior VP + GM, AMD Datacenter + Embedded Solutions Business Group

Forrest Norrod is senior vice president and general manager of the Datacenter and Embedded Solutions Business Group at AMD.

He is responsible for managing all aspects of strategy, business management, engineering and sales for AMD datacenter and embedded products.

Norrod has more than 25 years of technology industry experience across a number of engineering and business management roles at both the chip and system level.

 

AMD Datacenter Leadership In 2020 & Beyond!

During AMD Financial Analyst Day 2020, Forrest Norrod shared AMD’s datacenter leadership with EPYC and Radeon Instinct, and AMD’s datacenter roadmap in this presentation.

Here are the key points from Forrest Norrod’s presentation :

  • AMD won the contract to power the recently announced El Capitan supercomputer at Lawrence Livermore National Laboratory with EPYC processors and Radeon Instinct GPUs.
  • Expected to come online in 2023, El Capitan should deliver more than 2 exaFLOPS of double-precision performance, making it more powerful than today’s 200 fastest supercomputers combined.

  • AMD is continuing to gain traction with its 2nd Generation AMD EPYC processors in enterprise, cloud and HPC markets based on delivering performance leadership and TCO advantages across the most important enterprise and cloud workloads.
  • AMD EPYC is enabling Nokia to double the performance of their 5G Cloud Packet Core.
  • In 2020 AMD expects more than 150 AMD EPYC processor-powered cloud instances and 140 server platforms to be available.

  • AMD is introducing new technologies including AMD CDNA architecture, 3rd Generation Infinity Architecture and the ROCm 4.0 software platform, all of which will support the AMD-powered Frontier and El Capitan supercomputers.
  • AMD plans to ship the 3rd Gen AMD EPYC “Milan” processor in late 2020, and it will provide 100% coverage of enterprise requirements – whether for the cloud, HPC or enterprise IT.
  • Milan will remain on the 7 nm process, but the next-generation Genoa core (Zen 4) will use the 5 nm process technology.

  • The AMD CDNA architecture will allow for better scalability, with accelerators fully interconnected with 2nd Gen Infinity Architecture.
  • But the next-generation AMD CDNA 2 architecture will allow for Unified Data, with CPU + GPU coherency with 3rd Gen Infinity Architecture – allowing for easier programming and improved performance.

 



El Capitan Supercomputer : AMD Selected As Node Supplier!

It’s official – AMD has been selected as the node supplier for the El Capitan supercomputer, which is projected to be the world’s most powerful supercomputer when it is fully deployed!

 

El Capitan Supercomputer : A Quick Primer!

El Capitan is a supercomputer funded by the Advanced Simulation and Computing (ASC) program at the National Nuclear Security Administration (NNSA) of the Department of Energy.

When it is fully deployed in 2023, it will perform complex and increasingly predictive modelling and simulation for the NNSA’s Life Extension Programs (LEPs), which address nuclear weapon aging and emergent threat issues.

This will allow the United States to keep its nuclear stockpile safe, secure and reliable, in the absence of underground nuclear testing.

“This unprecedented computing capability, powered by advanced CPU and GPU technology from AMD, will sustain America’s position on the global stage in high-performance computing and provide an observable example of the commitment of the country to maintaining an unparalleled nuclear deterrent,” said LLNL Director Bill Goldstein.

“Today’s news provides a prime example of how government and industry can work together for the benefit of the entire nation.”

Besides supporting the nuclear stockpile, El Capitan will perform secondary US national security missions, including nuclear nonproliferation and counterterrorism.

NNSA laboratories – Lawrence Livermore, Los Alamos and Sandia national laboratories – are building machine learning and AI into computational techniques and analysis that will benefit NNSA’s primary missions and unclassified projects such as climate modelling and cancer research for DOE.

To that end, it will use a combination of CPUs and GPUs to exceed 2 exaFLOPS in performance – that’s two quintillion floating point operations per second. That will make it the world’s most powerful supercomputer!
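To make “two quintillion” concrete, here is a quick unit-conversion sketch in Python; the 100 GFLOPS laptop figure is an illustrative assumption, not something from the announcement.

```python
# 2 exaFLOPS = 2 x 10^18 floating point operations per second.
EL_CAPITAN_FLOPS = 2e18

# Hypothetical laptop sustaining 100 GFLOPS (1e11 FLOPS).
LAPTOP_FLOPS = 1e11

# How long would that laptop need to match ONE second of El Capitan?
seconds = EL_CAPITAN_FLOPS / LAPTOP_FLOPS
print(f"{seconds:.0f} seconds = {seconds / 86_400:.0f} days")
# -> 20000000 seconds, roughly 231 days of non-stop computation.
```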

 

El Capitan Supercomputer : AMD Selected As Node Supplier!

El Capitan will be powered by the next-generation AMD EPYC processors, codenamed Genoa and featuring the upcoming AMD Zen 4 processor cores, as well as the next-generation AMD Radeon Instinct GPUs based on a new compute-optimised architecture.

The nodes will run on the AMD Radeon Open Compute (ROCm) heterogeneous computing platform, with most of their floating point computing power delivered by the Radeon Instinct GPUs.

Not only will the El Capitan nodes offer significantly greater per-node performance than any current system, they will also offer dramatically better energy efficiency.

El Capitan will also integrate advanced features that have not yet been widely deployed, including :

  • HPE Cray Slingshot interconnect network, which will enable large calculations across many nodes
  • new HPE optics technologies to deliver higher data transmission rates with better power efficiency and reliability
  • new Cray Shasta software platform, with a new container-based architecture

“El Capitan will drive unprecedented advancements in HPC and AI, powered by the next-generation AMD EPYC CPUs and Radeon Instinct GPUs,” said Forrest Norrod, senior vice president and general manager, Datacenter and Embedded Systems Group, AMD.

“Building on our strong foundation in high-performance computing and adding transformative coherency capabilities, AMD is enabling the NNSA Tri-Lab community — LLNL, Los Alamos and Sandia national laboratories — to achieve their mission-critical objectives and contribute new AI advancements to the industry.”

“We are extremely proud to continue our exascale work with HPE and NNSA and look forward to the delivery of the most powerful supercomputer in the world, expected in early 2023.”

 



Frontier Supercomputer From AMD + Cray Is World’s Fastest!

AMD and Cray just unveiled the Frontier supercomputer, which will deliver exascale performance! Here is a primer on what will be the world’s fastest supercomputer!

 

The Frontier Supercomputer – Designed By Cray, Powered By AMD

AMD announced that it is joining Cray, the U.S. Department of Energy and Oak Ridge National Laboratory to develop the Frontier supercomputer. It will be the fastest in the world, delivering exascale performance.

Developed at a cost of over US$600 million, the Frontier supercomputer will deliver over 1.5 exaflops of processing power when it comes online in 2021!

AMD Contributions To The Frontier Supercomputer

AMD is not just a provider of hardware – the CPUs and GPUs – for the Frontier supercomputer. Its contributions include :

  • Experience in High Performance Computing (HPC) and Artificial Intelligence (AI)
  • Custom AMD EPYC CPU
  • Purpose-built Radeon Instinct GPU
  • High Bandwidth Memory (HBM)
  • Tightly integrated 4:1 GPU-to-CPU ratio (see the sketch after this list)
  • Custom, high-speed coherent Infinity Fabric connection
  • Enhanced, open ROCm programming environment supporting AMD CPUs and GPUs
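As a rough illustration of that node layout, here is a minimal Python sketch. The class name and structure are purely illustrative assumptions – AMD has not published the node design in this form.

```python
# Minimal sketch of the advertised Frontier node composition:
# one custom EPYC CPU coherently linked to four Radeon Instinct
# GPUs over Infinity Fabric. Illustrative only - not AMD's design.
from dataclasses import dataclass

@dataclass
class FrontierNodeSketch:
    cpus: int = 1                  # custom AMD EPYC CPU
    gpus: int = 4                  # purpose-built Radeon Instinct GPUs
    interconnect: str = "coherent Infinity Fabric"

    def gpu_to_cpu_ratio(self) -> str:
        return f"{self.gpus // self.cpus}:1"

node = FrontierNodeSketch()
print(node.gpu_to_cpu_ratio())     # -> "4:1", as in the list above
```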

 

Frontier Supercomputer And The Future Of Exascale Computing

With the development of the Frontier supercomputer, AMD and Cray will usher in a new era of exascale computing. It will lay the foundation for high-performance Artificial Intelligence (AI), analytics and simulation.

The use of this supercomputer by the U.S. Department of Energy will further push the limits of scientific discovery for the U.S. and the world.



Fujitsu Supercomputer For RIKEN Uses 24 NVIDIA DGX-1s

SINGAPORE, 7 March 2017 – Fujitsu announced today that it is using 24 NVIDIA DGX-1 AI systems to help build a Fujitsu supercomputer for RIKEN, Japan’s largest comprehensive research institution, for deep learning research.

The largest customer installation of DGX-1 systems to date, the Fujitsu supercomputer will accelerate the application of AI to solve complex challenges in healthcare, manufacturing and public safety.

“DGX-1 is like a time-machine for AI researchers,” said Jen-Hsun Huang, founder and CEO of NVIDIA. “Enterprises, research centres and universities worldwide are adopting DGX-1 to ride the wave of deep learning — the technology breakthrough at the centre of the AI revolution.”

The RIKEN Center for Advanced Intelligence Project will use the new Fujitsu supercomputer, scheduled to go online next month, to accelerate AI research in several areas, including medicine, manufacturing, healthcare and disaster preparedness.

“We believe that the NVIDIA DGX-1-based system will accelerate real-world implementation of the latest AI technologies, as well as research into next-generation AI algorithms,” said Arimichi Kunisawa, head of the Technical Computing Solution Unit at Fujitsu Limited. “Fujitsu is leveraging its extensive experience in high-performance computing development and AI research to support R&D that utilises this system, contributing to the creation of a future in which AI is used to find solutions to a variety of social issues.”

 

The New Fujitsu Supercomputer Runs On 24 NVIDIA DGX-1s

Conventional HPC architectures are proving too costly and inefficient for meeting the needs of AI researchers. That’s why companies like Fujitsu and customers such as RIKEN are looking for GPU-based solutions that reduce cost and power consumption while increasing performance.

Each DGX-1 combines the power of eight NVIDIA Tesla P100 GPUs with an integrated software stack optimised for deep learning frameworks, delivering the performance of 250 conventional x86 servers.


The system features a number of technological innovations unique to the DGX-1, including:

  • Containerised deep learning frameworks, optimised by NVIDIA for maximum GPU-accelerated deep learning training
  • Greater performance and multi-GPU scaling with NVIDIA NVLink, accelerating time to discovery
  • An integrated software and hardware architecture optimised for deep learning

The supercomputer will also use 32 Fujitsu PRIMERGY servers, which, combined with the DGX-1 systems, will boost its total theoretical processing performance to 4 petaflops when running half-precision floating-point calculations.
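That 4-petaflops figure is easy to sanity-check against the DGX-1’s rated half-precision peak of 170 teraflops (quoted in NVIDIA’s DGX-1 specifications further down this page):

```python
# Sanity check: 24 DGX-1 systems x 170 TFLOPS FP16 peak each.
DGX1_FP16_TFLOPS = 170   # NVIDIA's rated FP16 peak per DGX-1
NUM_DGX1 = 24

total_pflops = NUM_DGX1 * DGX1_FP16_TFLOPS / 1000
print(f"{total_pflops:.2f} PFLOPS")   # -> 4.08 PFLOPS
# The 24 DGX-1s alone account for roughly 4 petaflops at half
# precision; the 32 PRIMERGY servers add a comparatively small
# remainder to the system's theoretical total.
```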


SMU Deploys NVIDIA DGX-1 Supercomputer For AI Research

Singapore, 30 November 2016 – NVIDIA today announced that Singapore Management University (SMU) is the first organisation in Singapore and Southeast Asia to deploy an NVIDIA DGX-1 deep learning supercomputer.

Deployed at the SMU Living Analytics Research Center (LARC), the supercomputer will further research on applying artificial intelligence (AI) for Singapore’s Smart Nation project. Established in 2011, LARC aims to innovate technologies and software platforms that are relevant to Singapore’s Smart Nation efforts. LARC is supported and funded by the National Research Foundation (NRF).

 

NVIDIA DGX-1

The NVIDIA DGX-1 is the world’s first deep learning supercomputer to meet the computing demands of AI. It enables researchers and data scientists to easily harness the power of GPU-accelerated computing to create a new class of computers that learn, see and perceive the world as humans do.

Providing throughput equivalent to 250 conventional servers in a single box, the supercomputer delivers the highest levels of computing power to drive next-generation AI applications, allowing researchers to dramatically reduce the time to train larger, more sophisticated deep neural networks.

Built on NVIDIA Tesla P100 GPUs that use the latest Pascal GPU architecture, the DGX-1 supercomputer will enable SMU to conduct a range of AI research projects for Smart Nation. One of the featured projects is a food AI application to achieve smart food consumption and healthy lifestyle, which requires the analysis of a large number of food photos.


“This project involves the processing of large amounts of unstructured and visual data. Food photo recognition is not possible without the DGX-1 solution, which applies cutting-edge deep learning technologies and yields excellent recognition accuracy,” said Professor Steven Hoi, School of Information Systems, SMU.

The first phase of the food AI project is able to recognise 100 of the most popular local dishes in Singapore. The next phase is to expand the current food database to about 1,000 popular food dishes in Singapore. In addition to the recognition of food photos, the team will also analyse food data in supermarkets to help with the recommendation of healthy food options. Once developed, the food AI solution will be made available to developers through an API for them to build smart food consumption solutions.

“SMU has been an NVIDIA GPU Research Center using Tesla GPUs for several years. The NVIDIA DGX-1 will give SMU researchers the performance and deep learning capabilities needed to work on their Smart Nation projects, which will further advance Singapore’s aspirations,” said Raymond Teh, vice president of sales and marketing for Asia Pacific, NVIDIA.

 


4,500 NVIDIA Tesla GPU Upgrade For Piz Daint Supercomputer

Singapore, April 6, 2016—NVIDIA today announced that Pascal architecture-based NVIDIA Tesla GPU accelerators will power an upgraded version of Europe’s fastest supercomputer, the Piz Daint system at the Swiss National Supercomputing Center (CSCS) in Lugano, Switzerland. The upgrade is expected to more than double Piz Daint’s speed, with most of the system’s performance expected to come from its Tesla GPUs.

Piz Daint, named after a mountain in the Swiss Alps, currently delivers 7.8 petaflops of compute performance, or 7.8 quadrillion mathematical calculations per second. That puts it at No. 7 in the latest TOP500 list of the world’s fastest supercomputers. CSCS plans to upgrade the system later this year with 4,500 Pascal-based GPUs.
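A quick plausibility check on “more than double”: spreading the required gain across the 4,500 new GPUs shows how modest the per-GPU burden is. The Tesla P100 FP64 peak used below is the commonly quoted figure for the SXM2 part, an assumption since the article does not name the exact model.

```python
# Piz Daint today: 7.8 PFLOPS. More than doubling means adding
# at least another 7.8 PFLOPS across the 4,500 new Pascal GPUs.
CURRENT_PFLOPS = 7.8
NEW_GPUS = 4500

needed = CURRENT_PFLOPS * 1000 / NEW_GPUS     # TFLOPS per GPU
print(f"{needed:.2f} TFLOPS per GPU needed")  # -> 1.73
# The commonly quoted FP64 peak of a Tesla P100 (SXM2) is about
# 5.3 TFLOPS, so "more than double" is a comfortable claim even
# before counting the rest of the system.
```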

 

Piz Daint Supercomputer Upgrade

Pascal is the most advanced GPU architecture ever built, delivering unmatched performance and efficiency to power the most computationally demanding applications. Pascal-based Tesla GPUs will allow researchers to solve larger, more complex problems that are currently out of reach in cosmology, materials science, seismology, climatology and a host of other fields.

Pascal GPUs feature a number of breakthrough technologies, including second-generation High Bandwidth Memory (HBM2) that delivers three times higher bandwidth than the previous generation architecture, and 16nm FinFET technology for unprecedented energy efficiency. For scientists with near infinite computing needs, Pascal GPUs deliver a giant leap in application performance and time to discovery for their scientific research.

The upgrade will enable CSCS scientists to do simulations, data analysis and visualisations faster and more efficiently. Piz Daint will be used to analyse data from the Large Hadron Collider at CERN, the world’s largest particle accelerator. The upgrade will also accelerate research on the Human Brain Project’s High Performance Analytics and Computing Platform, which currently uses Piz Daint. The project’s goal is to build neuromorphic computing systems that use the same principles of computation and cognitive architectures as the brain. The upgrade will also facilitate CSCS research in geophysics, cosmology and materials science.


“We are taking advantage of NVIDIA GPUs to significantly accelerate simulations in such diverse areas as cosmology, materials science, seismology and climatology,” said Thomas Schulthess, professor of computational physics at ETH Zurich and director of the Swiss National Supercomputing Center. “Tesla accelerators represent a leap forward in computing, allowing our researchers to solve larger, more complex problems that are currently out of reach in a host of fields.”

“CSCS scientists are using Piz Daint to tackle some of the most important computational challenges of our day, like modeling the human brain and uncovering new insights into the origins of the universe,” said Ian Buck, vice president of Accelerated Computing at NVIDIA. “Tesla GPUs deliver a massive leap in application performance, allowing CSCS to push the limits of scientific discovery.”

 


NVIDIA DGX-1 Deep Learning Supercomputer Launched

April 6, 2016 — NVIDIA today unveiled the NVIDIA DGX-1, the world’s first deep learning supercomputer to meet the unlimited computing demands of artificial intelligence.

The NVIDIA DGX-1 is the first system designed specifically for deep learning — it comes fully integrated with hardware, deep learning software and development tools for quick, easy deployment. It is a turnkey system that contains a new generation of GPU accelerators, delivering the equivalent throughput of 250 x86 servers.

The NVIDIA DGX-1 deep learning system enables researchers and data scientists to easily harness the power of GPU-accelerated computing to create a new class of intelligent machines that learn, see and perceive the world as humans do. It delivers unprecedented levels of computing power to drive next-generation AI applications, allowing researchers to dramatically reduce the time to train larger, more sophisticated deep neural networks.

NVIDIA designed the DGX-1 for a new computing model to power the AI revolution that is sweeping across science, enterprises and increasingly all aspects of daily life. Powerful deep neural networks are driving a new kind of software created with massive amounts of data, which require considerably higher levels of computational performance.

“Artificial intelligence is the most far-reaching technological advancement in our lifetime,” said Jen-Hsun Huang, CEO and co-founder of NVIDIA. “It changes every industry, every company, everything. It will open up markets to benefit everyone. Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions. The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable.”

 

Powered by Five Breakthroughs

The NVIDIA DGX-1 deep learning system is built on NVIDIA Tesla P100 GPUs, based on the new NVIDIA Pascal GPU architecture. It provides the throughput of 250 CPU-based servers, networking, cables and racks — all in a single box.

The DGX-1 features four other breakthrough technologies that maximise performance and ease of use. These include the NVIDIA NVLink high-speed interconnect for maximum application scalability; 16nm FinFET fabrication technology for unprecedented energy efficiency; Chip on Wafer on Substrate with HBM2 for big data workloads; and new half-precision instructions to deliver more than 21 teraflops of peak performance for deep learning.
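Those per-GPU and per-system numbers are consistent with each other; here is a one-line check, using the commonly quoted ~21.2 TFLOPS FP16 peak of a single Tesla P100 (the article itself says “more than 21 teraflops”):

```python
# Eight Tesla P100s x ~21.2 TFLOPS FP16 peak per GPU.
P100_FP16_TFLOPS = 21.2   # commonly quoted P100 FP16 peak
GPUS_PER_DGX1 = 8

print(f"{GPUS_PER_DGX1 * P100_FP16_TFLOPS:.0f} TFLOPS")  # -> 170
# Matches the "up to 170 teraflops" figure in the DGX-1
# specifications listed further below.
```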

Together, these major technological advancements enable DGX-1 systems equipped with Tesla P100 GPUs to deliver over 12x faster training than four-way NVIDIA Maxwell architecture-based solutions from just one year ago.


The Pascal architecture has strong support from the artificial intelligence ecosystem.

“NVIDIA GPU is accelerating progress in AI. As neural nets become larger and larger, we not only need faster GPUs with larger and faster memory, but also much faster GPU-to-GPU communication, as well as hardware that can take advantage of reduced-precision arithmetic. This is precisely what Pascal delivers,” said Yann LeCun, director of AI Research at Facebook.

Andrew Ng, chief scientist at Baidu, said: “AI computers are like space rockets: The bigger the better. Pascal’s throughput and interconnect will make the biggest rocket we’ve seen yet.”

“Microsoft is developing super deep neural networks that are more than 1,000 layers,” said Xuedong Huang, chief speech scientist at Microsoft Research. “NVIDIA Tesla P100’s impressive horsepower will enable Microsoft’s CNTK to accelerate AI breakthroughs.”

 

Comprehensive Deep Learning Software Suite

The NVIDIA DGX-1 system includes a complete suite of optimised deep learning software that allows researchers and data scientists to quickly and easily train deep neural networks. The DGX-1 software includes the NVIDIA Deep Learning GPU Training System (DIGITS), a complete, interactive system for designing deep neural networks (DNNs).

It also includes the newly released NVIDIA CUDA Deep Neural Network library (cuDNN) version 5, a GPU-accelerated library of primitives for designing DNNs, as well as optimised versions of several widely used deep learning frameworks — Caffe, Theano and Torch. The DGX-1 additionally provides access to cloud management tools, software updates and a repository for containerised applications.

 

NVIDIA DGX-1 Specifications

  • Up to 170 teraflops of half-precision (FP16) peak performance
  • Eight Tesla P100 GPU accelerators, 16GB memory per GPU
  • NVLink Hybrid Mesh Cube
  • 7TB SSD DL Cache
  • Dual 10GbE, Quad InfiniBand 100Gb networking
  • 3U – 3200W

Optional support services for the NVIDIA DGX-1 improve productivity and reduce downtime for production systems. Hardware and software support provides access to NVIDIA deep learning expertise, and includes cloud management services, software upgrades and updates, and priority resolution of critical issues.

 

NVIDIA DGX-1 Availability

General availability for the NVIDIA DGX-1 deep learning system begins in June in the United States, and in the third quarter in other regions, direct from NVIDIA and select systems integrators.


NVIDIA DRIVE PX 2 : First AI Supercomputer For Cars

Special by Danny Shapiro, NVIDIA

Take a supercomputer. Give it wheels. The result: a robot that can take you anywhere you want to go. No wonder self-driving cars were the hot topic at CES last week, and the talk of the Detroit Auto Show this week.

Building this new generation of super-smart cars requires some serious intelligence. That’s why we introduced NVIDIA DRIVE PX 2, our artificial intelligence supercomputer for the car. We’re taking the GPU technology at the heart of a revolution that’s giving computers superhuman powers of perception and putting it in your driveway.

Here’s why your next car might be your first supercomputer:

 

NVIDIA DRIVE PX 2 : First AI Supercomputer For Cars

Only next-generation AI has the adaptability, and the power, to understand what cars encounter on the road.

There aren’t enough engineers in Silicon Valley to hand-code software that can account for everything that happens when you drive. To deal with all the stuff a car sees on the road – and thanks to modern sensors, they see more and more – you need deep learning, a form of artificial intelligence. Last year, GPU-powered deep learning systems exceeded human levels of perception for the first time.

 

Our GPUs Make AI Practical

GPUs are built for parallel computing. So they’re ideal for deep neural networks – complex mathematical models that mimic the brain. DNNs are trained by feeding massive amounts of data into powerful computers. Parallel computing is the only practical way to digest this info rapidly. And DNNs are ideal for driving, because the more data you give them, the smarter they get.
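A toy example of why DNNs map so well onto parallel hardware: a single dense layer applied to a whole batch of inputs is just one large matrix multiplication, and every output element is an independent dot product. This NumPy sketch is purely illustrative.

```python
import numpy as np

# One dense layer applied to a whole batch of inputs at once.
# Every output value is an independent dot product - exactly the
# kind of massively parallel work a GPU is built for.
rng = np.random.default_rng(0)
batch   = rng.standard_normal((256, 1024))   # 256 input vectors
weights = rng.standard_normal((1024, 512))   # layer parameters

activations = np.maximum(batch @ weights, 0)  # matmul + ReLU
print(activations.shape)                      # -> (256, 512)
```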

 

NVIDIA DRIVE PX 2 Brings AI to the Road

NVIDIA DRIVE PX 2 can perform 24 trillion deep learning operations per second, and it has the processing power of 150 MacBook Pros. It lets developers replace the trunk full of GPU-based workstations in their vehicles with a supercomputer the size of a lunchbox.
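Taken together, those two headline numbers imply a per-laptop figure; here is the straight division (assuming NVIDIA’s comparison is a simple ratio, which the article does not spell out):

```python
# 24 trillion deep learning ops per second vs 150 MacBook Pros.
DRIVE_PX2_DL_OPS_PER_SEC = 24e12
MACBOOK_PROS = 150

per_laptop = DRIVE_PX2_DL_OPS_PER_SEC / MACBOOK_PROS
print(f"{per_laptop / 1e9:.0f} billion DL ops/s per MacBook Pro")
# -> 160 billion ops/s, i.e. each MacBook Pro is credited with
#    roughly 0.16 TOPS of deep learning throughput.
```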

 

We Built DRIVE PX 2 to be a Scalable Platform for Car Companies

We designed DRIVE PX 2 to handle everything from advanced driver assistance systems to fully self-driving vehicles. It can be configured as a single-processor, air-cooled system for driver assistance, up to a four-processor, liquid-cooled system for autonomous driving. Whatever the case, it’s based on one scalable architecture – the same one that powers the world’s most advanced supercomputers.

 

DRIVE PX 2 Is An Open Platform

NVIDIA DRIVE PX 2 is built with the same open, programmable GPU architecture that’s driving an AI revolution. Audi, BMW, Ford, Mercedes and ZMP (makers of the RoboTaxi) are already using our AI platform for their autonomous car R&D.

More than 50 automakers, Tier 1 suppliers, software companies and startups are using our open, programmable NVIDIA DRIVE PX platform to develop deep neural networks.

 

Car Companies Can Make Their Cars Safer Every Day

GPUs have already accelerated the training of deep neural networks by 20 to 30 times. What used to take months to train now takes just days. This lets us create a brain for autonomous vehicles that is always alert, and can achieve superhuman levels of situational awareness. The more data these cars scoop up and share with one another, the smarter they all get.

 

AI-Equipped Cars Are Coming Soon


Earlier this month, Volvo announced it selected NVIDIA DRIVE PX 2 to power its fleet of autonomous cars. They’re outfitting their award-winning XC90 SUV with it – and will let drivers put these cars into autonomous driving mode on public roads around its hometown of Gothenburg, Sweden.

 

Everyone’s Investing in Automotive Supercomputing

GM has invested $500 million with Lyft on self-driving technologies. Toyota recently earmarked $1 billion for AI research. Just yesterday, the U.S. government put forth a $4 billion investment plan in support of autonomous driving technologies and the infrastructure to enable it.

This is just the start. Our goal is to make this technology available across all vehicle types and segments. Putting supercomputers on wheels is going to reduce the number of accidents, injuries and fatalities. It’s going to make new capabilities – and new kinds of transportation – possible.
