
NVIDIA TensorRT 7 with Real-Time Conversational AI!

NVIDIA just launched TensorRT 7, introducing the capability for Real-Time Conversational AI!

Here is a primer on NVIDIA TensorRT 7, and its new real-time conversational AI capability!

 

NVIDIA TensorRT 7 with Real-Time Conversational AI

NVIDIA TensorRT 7 is their seventh-generation inference software development kit. It introduces the capability for real-time conversational AI, opening the door for human-to-AI interactions.

TensorRT 7 features a new deep learning compiler designed to automatically optimise and accelerate the increasingly complex recurrent and transformer-based neural networks needed for AI speech applications.

This boosts the performance of conversational AI components by more than 10X compared to running them on CPUs, driving latency below the 300 millisecond (0.3 second) threshold considered necessary for real-time interactions.

 

TensorRT 7 Targets Recurrent Neural Networks

TensorRT 7 is designed to speed up AI models that make predictions on time-series and sequence data using recurrent loop structures, known as recurrent neural networks (RNNs).
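To see why these models are hard to accelerate, here is a minimal sketch of the recurrent loop at the heart of an RNN, written in plain NumPy. All names and sizes here are illustrative, not taken from TensorRT:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: new hidden state from the current input and previous state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run the recurrent loop over an entire input sequence."""
    h = h0
    for x_t in xs:  # each step depends on the previous one, so the loop is inherently sequential
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    return h

# Toy sequence: 5 time steps, 3 input features, 4 hidden units
rng = np.random.default_rng(0)
xs = rng.standard_normal((5, 3))
h = rnn_forward(xs, np.zeros(4),
                rng.standard_normal((3, 4)),   # input-to-hidden weights
                rng.standard_normal((4, 4)),   # hidden-to-hidden (recurrent) weights
                np.zeros(4))
print(h.shape)  # (4,)
```

The sequential dependency between time steps is exactly what makes RNN inference hard to parallelise on GPUs, and what a compiler like the one in TensorRT 7 has to optimise around.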

RNNs are used not only for conversational AI speech networks; they also help with arrival time planning for cars and satellites, prediction of events in electronic medical records, financial asset forecasting and fraud detection.

The use of RNNs has hitherto been limited to a few companies with the talent and manpower to hand-optimise the code to meet real-time performance requirements.

With TensorRT 7’s new deep learning compiler, developers now have the ability to automatically optimise these neural networks to deliver the best possible performance and lowest latencies.

The new compiler also optimises transformer-based models like BERT for natural language processing.

 

TensorRT 7 Availability

NVIDIA TensorRT 7 will be made available in the coming days, free of charge, to members of the NVIDIA Developer program for development and deployment.

The latest versions of plug-ins, parsers and samples are also available as open source from the TensorRT GitHub repository.

 


 

Support Tech ARP!

If you like our work, you can support us by visiting our sponsors, participating in the Tech ARP Forums, or even donating to our fund. Any help you can render is greatly appreciated!


NVIDIA Wins MLPerf Inference Benchmarks For DC + Edge!

The MLPerf Inference 0.5 benchmark results were officially released today, with NVIDIA declaring that it aced them for both datacenter and edge computing workloads.

Find out how well NVIDIA did, and why it matters!

 

The MLPerf Inference Benchmarks

MLPerf Inference 0.5 is the industry’s first independent suite of five AI inference benchmarks.

Applied across a range of form factors and four inference scenarios, the new MLPerf Inference Benchmarks test the performance of established AI applications like image classification, object detection and translation.

 

NVIDIA Wins MLPerf Inference Benchmarks For Datacenter + Edge

Thanks to the programmability of its computing platforms to cater to diverse AI workloads, NVIDIA was the only company to submit results for all five MLPerf Inference Benchmarks.

According to NVIDIA, their Turing GPUs topped all five benchmarks for both datacenter scenarios (server and offline) among commercially-available processors.

Meanwhile, their Jetson Xavier scored highest among commercially-available edge and mobile SoCs under both edge-focused scenarios – single stream and multi-stream.

The NVIDIA Jetson Xavier NX, announced today, is a low-power version of the Xavier SoC that won the MLPerf Inference 0.5 benchmarks.

All of NVIDIA’s MLPerf Inference Benchmark results were achieved using NVIDIA TensorRT 6 deep learning inference software.

 


 



NVIDIA : Now Everyone Can Use NVIDIA GPU Cloud!

On 4 December 2017, NVIDIA announced that AI researchers using NVIDIA desktop GPUs can now tap into NVIDIA GPU Cloud (NGC). By extending NVIDIA GPU Cloud support to NVIDIA TITAN, they have opened up NGC to hundreds of thousands of new users.

 

Now Everyone Can Use NVIDIA GPU Cloud!

The expanded NGC capabilities add new software and other key updates to the NGC container registry, providing AI researchers with a broader and more powerful set of tools.

Anyone using NVIDIA TITAN graphics cards can sign up immediately for a no-charge NGC account and gain full access to a comprehensive catalog of GPU-optimized deep learning and HPC software and tools. Other supported computing platforms include NVIDIA DGX-1, DGX Station and NVIDIA Volta-enabled instances on Amazon EC2.

Software available through NGC’s rapidly expanding container registry includes NVIDIA-optimized deep learning frameworks such as TensorFlow and PyTorch, third-party managed HPC applications, NVIDIA HPC visualization tools, and NVIDIA’s programmable inference accelerator, TensorRT 3.0.
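In practice, these NGC containers are fetched with ordinary Docker commands after logging in to the registry with an NGC API key. A rough sketch of the workflow, noting that the image tag below is an assumed example (NGC tags follow a year.month scheme) and may not match any specific release:

```shell
# Log in to the NGC container registry (nvcr.io) using your NGC API key as the password
docker login nvcr.io

# Pull an NVIDIA-optimised framework container; the 17.12 tag is an assumed example
docker pull nvcr.io/nvidia/tensorflow:17.12

# Run it with GPU access; nvidia-docker was the GPU runtime wrapper of this era
nvidia-docker run -it --rm nvcr.io/nvidia/tensorflow:17.12
```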

 

New NGC Container, Updates & Features

In addition to making NVIDIA TensorRT available on NGC’s container registry, NVIDIA announced the following NGC updates:

  • Open Neural Network Exchange (ONNX) support for TensorRT
  • Immediate support and availability for the first release of MXNet 1.0
  • Availability of Baidu’s PaddlePaddle AI framework

ONNX is an open format originally created by Facebook and Microsoft through which developers can exchange models across different frameworks. In the TensorRT development container, NVIDIA created a converter to deploy ONNX models to the TensorRT inference engine. This makes it easier for application developers to deploy low-latency, high-throughput models to TensorRT.
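As a rough sketch of that workflow, the TensorRT Python API can parse an ONNX model into an optimised inference engine along these lines. The model path is a placeholder, the builder settings are illustrative, and the exact API has changed across TensorRT releases, so treat this as an outline rather than a definitive recipe:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()           # empty network, populated from the ONNX graph
parser = trt.OnnxParser(network, TRT_LOGGER)

# "model.onnx" is a placeholder for a model exported from another framework
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

builder.max_batch_size = 1
engine = builder.build_cuda_engine(network)  # optimised TensorRT inference engine
```

The point of the converter is that the source framework no longer matters: anything that can export ONNX can be deployed through the same TensorRT path.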

Together, these additions give developers a one-stop shop for software that supports a full spectrum of AI computing needs — from research and application development to training and deployment.


 
