NVIDIA just launched TensorRT 7, introducing the capability for real-time conversational AI! Here is a primer on NVIDIA TensorRT 7 and its new real-time conversational AI capability!
NVIDIA TensorRT 7 with Real-Time Conversational AI
NVIDIA TensorRT 7 is their seventh-generation inference software development kit. It introduces the capability for real-time conversational AI, opening the door for human-to-AI interactions.
TensorRT 7 features a new deep learning compiler designed to automatically optimise and accelerate the increasingly complex recurrent and transformer-based neural networks needed for AI speech applications.
This boosts the performance of conversational AI components by more than 10X compared to running them on CPUs, driving latency below the 300 millisecond (0.3 second) threshold considered necessary for real-time interactions.
TensorRT 7 Targets Recurrent Neural Networks
TensorRT 7 is designed to speed up AI models that make predictions on time-series and sequence data using recurrent loop structures, better known as recurrent neural networks (RNNs).
RNNs are used not only for conversational AI speech networks; they also help with arrival time planning for cars and satellites, predicting events in electronic medical records, financial asset forecasting and fraud detection.
The use of RNNs has hitherto been limited to a few companies with the talent and manpower to hand-optimise the code to meet real-time performance requirements.
With TensorRT 7’s new deep learning compiler, developers now have the ability to automatically optimise these neural networks to deliver the best possible performance and lowest latencies.
The new compiler also optimises transformer-based models like BERT for natural language processing.
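To give a sense of the workflow, here is a minimal sketch of how a developer might compile an ONNX model into an optimised TensorRT 7 engine using the Python API. The bert_model.onnx file name is a hypothetical stand-in for any transformer or RNN model:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# TensorRT 7's ONNX parser requires an explicit-batch network definition.
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network(EXPLICIT_BATCH) as network, \
     trt.OnnxParser(network, TRT_LOGGER) as parser:

    # bert_model.onnx is a hypothetical file name, for illustration only.
    with open("bert_model.onnx", "rb") as model_file:
        if not parser.parse(model_file.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30      # 1 GB of scratch memory
    config.set_flag(trt.BuilderFlag.FP16)    # FP16, if the GPU supports it

    # The compiler optimises the network here; no hand-tuning required.
    engine = builder.build_engine(network, config)
```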
TensorRT 7 Availability
NVIDIA TensorRT 7 will be made available in the coming days, free to members of the NVIDIA Developer program, for both development and deployment.
The latest versions of plug-ins, parsers and samples are also available as open source from the TensorRT GitHub repository.
Just before we flew to Computex 2017, we attended the AWS Masterclass on Artificial Intelligence. It offered us an in-depth look at AI concepts like machine learning, deep learning and neural networks. We also saw how Amazon Web Services (AWS) uses all that to create easy-to-use tools for developers to create their own AI applications at low cost and virtually no capital outlay.
The AWS Masterclass on Artificial Intelligence
AWS Malaysia flew in Olivier Klein, the AWS Asia Pacific Solutions Architect, to conduct the AWS Masterclass. During the two-hour session, he demonstrated how easily the various AWS services and tools allow virtually anyone to create their own AI applications at low cost and with virtually no capital outlay.
The topic of artificial intelligence is wide-ranging, covering everything from basic AI concepts all the way to demonstrations of how to use AWS services like Amazon Polly and Amazon Rekognition to easily and quickly create AI applications. We present to you – the complete AWS Masterclass on Artificial Intelligence!
The AWS Masterclass on AI is actually made up of 5 main topics. Here is a summary of those topics:
AWS Cloud and An Introduction to Artificial Intelligence, Machine Learning, Deep Learning (15 minutes)
An overview of Amazon Web Services and the latest innovations in the data analytics, machine learning, deep learning and AI space.

The Road to Artificial Intelligence (20 minutes)
Demystifying AI concepts and related terminologies, as well as the underlying technologies. We dive deeper into the concepts of machine learning and deep learning models, such as neural networks, and how they lead to artificial intelligence.

Connecting Things and Sensing the Real World (30 minutes)
As part of building an AI that aligns with our physical world, we need to understand how the Internet-of-Things (IoT) space helps create natural interaction channels. We walk through real-world examples and demonstrations, including voice interactions through Amazon Lex, Amazon Polly and the Alexa Voice Service, as well as visual recognition with services such as Amazon Rekognition. We also bridge this with real-time data sensed from the physical world via AWS IoT.

Retrospective and Real-Time Data Analytics (30 minutes)
Every AI must continuously "learn" and be "trained" through past performance and feedback data. Retrospective and real-time data analytics are crucial to building intelligent models. We dive into some of the new trends and concepts our customers are using to perform fast and cost-effective analytics on AWS.
In the next two pages, we will dissect the video and share with you the key points from each segment of this AWS Masterclass.
The AWS Masterclass on AI Key Points (Part 1)
Here is an exhaustive list of key takeaway points from the AWS Masterclass on Artificial Intelligence, with their individual timestamps in the video:
Introduction To AWS Cloud
AWS has 16 regions around the world (0:51), with two or more availability zones per region (1:37), and 76 edge locations (1:56) to accelerate end-user connectivity to AWS services.
AWS offers 90+ cloud services (3:45), all of which use the On-Demand Model (4:38) – you pay only for what you use, whether that’s a GB of storage or transfer, or execution time for a computational process.
You don’t even need to plan for your requirements or inform AWS how much capacity you need (5:05). Just use what you need, and pay for what you use.
AWS has a practice of passing their cost savings to their customers (5:59), cutting prices 61 times since 2006.
AWS keeps adding new services over the years (6:19), with over a thousand new services introduced in 2016 (7:03).
Introduction to Artificial Intelligence, Machine Learning, Deep Learning
Artificial intelligence is based on unsupervised machine learning (7:45), specifically deep learning models.
Insurance companies like AON use it for actuarial calculations (7:59), and services like Netflix use it to generate recommendations (8:04).
A lot of AI models have been built specifically around natural language understanding, and using vision to interact with customers, as well as predicting and understanding customer behaviour (9:23).
Here is a quick look at what the AWS services management console looks like (9:58).
This is how you launch 10 compute instances (virtual servers) in AWS (11:40).
The ability to access multiple instances quickly is very useful for AI training (12:40), because it gives the user access to large amounts of computational power, which can be quickly terminated (13:10).
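As an illustration of that launch-and-terminate pattern, here is a minimal sketch using the boto3 Python SDK. The AMI ID, region and instance type are placeholders, not values from the talk:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-1")  # placeholder region

# Launch 10 identical GPU instances in a single call.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical deep learning AMI
    InstanceType="p2.xlarge",         # placeholder GPU instance type
    MinCount=10,
    MaxCount=10,
)
instance_ids = [i["InstanceId"] for i in response["Instances"]]

# ... run the training workload, then terminate to stop incurring charges.
ec2.terminate_instances(InstanceIds=instance_ids)
```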
Machine learning, and artificial intelligence in general, is not new to Amazon.com, the parent company of AWS (14:14).
Amazon.com uses a lot of AI models (14:34) for recommendations and demand forecasting.
The visual search feature in the Amazon app uses visual recognition and AI models to identify a picture you take (15:33).
Olivier introduces Amazon Go (16:07), a prototype grocery store in Seattle.
The Road to Artificial Intelligence
The first component of any artificial intelligence is the “ability to sense the real world” (18:46), connecting everything together.
Cheaper bandwidth (19:26) now allows more devices to be connected to the cloud, allowing more data to be collected for the purpose of training AI models.
Cloud computing platforms like AWS allow the storage and processing of all that sensor data in real time (19:53).
All of that information can be used in deep learning models (20:14) to create an artificial intelligence that understands, in a natural way, what we are doing, and what we want or need.
Olivier shows how machine learning can quickly solve a Rubik’s cube (20:47), which has 43 quintillion unique combinations.
You can even build a Raspberry Pi-powered machine (24:33) that can solve a Rubik’s cube puzzle in 0.9 seconds.
Some of these deep learning models are available on Amazon AI (25:11), which is a combination of different services (25:44).
Olivier shows what it means to “train a deep learning model” (28:19) using a neural network (29:15).
Deep learning is computationally-intensive (30:39), but once it derives a model that works well, the predictive aspect is not computationally-intensive (30:52).
A pre-trained AI model can be loaded into a low-powered device (31:02), allowing it to perform AI functions without requiring large amounts of bandwidth or computational power.
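As an illustration of that split, here is a minimal sketch of loading a pre-trained model and running a single prediction. It uses ONNX Runtime as a stand-in inference engine (our choice, not something named in the talk), and the model file and input shape are hypothetical:

```python
import numpy as np
import onnxruntime as ort

# Load a pre-trained model once; no training framework or GPU is required.
session = ort.InferenceSession("pretrained_model.onnx")  # hypothetical file

# Run a single prediction on a dummy input matching the model's input shape.
input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)  # the prediction itself is cheap to compute
```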
Olivier demonstrates the YOLO (You Only Look Once) project, which pre-trained an AI model with pictures of objects (31:58), allowing it to detect objects in any video.
The identification of objects is the baseline for autonomous driving systems (34:19), as used by Tu Simple.
Tu Simple also used a similar model to train a drone to detect and follow a person (35:28).
The AWS Masterclass on AI Key Points (Part 2)
Connecting Things and Sensing the Real World
Cloud services like AWS IoT (37:35) allow you to securely connect billions of IoT (Internet of Things) devices.
Olivier prefers to think of IoT as Intelligent Orchestrated Technology (37:52).
Olivier demonstrates how the combination of multiple data sources (maps, vehicle GPS, real-time weather reports) in Bangkok can be used to predict traffic as well as road conditions to create optimal routes (39:07), reducing traffic congestion by 30%.
The PetaBencana service in Jakarta uses picture recognition and IoT sensors to identify flooded roads (42:21) for better emergency response and disaster management.
Olivier demonstrates how easy it is to connect IoT devices to the AWS IoT service (43:46), and use them to sense and interact with the environment.
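For a flavour of what that looks like in code, here is a minimal sketch using the AWS IoT Device SDK for Python (our choice of SDK; the endpoint, certificate paths and topic name are all placeholders):

```python
import json
import time
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

client = AWSIoTMQTTClient("demo-sensor")  # hypothetical client ID
client.configureEndpoint("example.iot.us-east-1.amazonaws.com", 8883)  # placeholder
client.configureCredentials("rootCA.pem", "private.key", "certificate.pem")
client.connect()

# Publish ten simulated temperature readings to a hypothetical topic.
for _ in range(10):
    payload = json.dumps({"temperature": 23.5, "timestamp": int(time.time())})
    client.publish("sensors/demo/temperature", payload, 1)  # QoS 1
    time.sleep(5)

client.disconnect()
```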
Olivier shows how the capabilities of the Amazon Echo can be extended by creating an Alexa Skill using the AWS Lambda function (59:07).
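A skill's backend can be as simple as a Lambda handler that answers Alexa's JSON requests. Here is a minimal sketch using the raw request/response format (real skills typically use the ASK SDK; the speech text is ours):

```python
# A minimal AWS Lambda handler for a hypothetical Alexa Skill.
def lambda_handler(event, context):
    request_type = event["request"]["type"]

    if request_type == "LaunchRequest":
        speech = "Welcome to the demo skill."
    elif request_type == "IntentRequest":
        intent_name = event["request"]["intent"]["name"]
        speech = "You invoked the " + intent_name + " intent."
    else:
        speech = "Goodbye."

    # Alexa expects this response envelope back from the Lambda function.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```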
Developers can create and publish Alexa Skills for sale in the Amazon marketplace (1:03:30).
Amazon Polly (1:04:10) renders life-like speech, while the Amazon Lex conversational engine (1:04:17) has natural language understanding and automatic speech recognition. Amazon Rekognition (1:04:29) performs image analysis.
Amazon Polly (1:04:50) turns text into life-like speech using deep learning to change the pitch and intonation according to the context. Olivier demonstrates Amazon Polly’s capabilities at 1:06:25.
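Calling Polly from code takes little more than one API call plus file handling. A minimal sketch with boto3 (the voice and text are arbitrary choices of ours):

```python
import boto3

polly = boto3.client("polly")

# Synthesise a short sentence into an MP3 file.
response = polly.synthesize_speech(
    Text="Hello from Amazon Polly!",
    OutputFormat="mp3",
    VoiceId="Joanna",  # one of Polly's built-in voices
)

with open("hello.mp3", "wb") as audio_file:
    audio_file.write(response["AudioStream"].read())
```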
Amazon Lex (1:11:06) is a web service that allows you to build conversational interfaces using natural language understanding (NLU) and automatic speech recognition (ASR) models like Alexa.
Amazon Lex does not just support spoken natural language understanding; it also recognises text (1:12:09), which makes it useful for chatbots.
Olivier demonstrates those text recognition capabilities in a chatbot demo (1:13:50) of a customer applying for a credit card through Facebook.
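Here is a minimal sketch of sending text to a Lex bot with boto3, in the spirit of that chatbot demo. The bot name, alias and user ID are hypothetical:

```python
import boto3

lex = boto3.client("lex-runtime")

response = lex.post_text(
    botName="CreditCardBot",  # hypothetical bot
    botAlias="prod",          # hypothetical alias
    userId="demo-user-001",
    inputText="I would like to apply for a credit card",
)

# Lex returns the matched intent and its next conversational prompt.
print(response["intentName"], "->", response["message"])
```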
Amazon Rekognition (1:21:37) is an image recognition and analysis service, which uses deep learning to identify objects in pictures.
Amazon Rekognition can even detect facial landmarks and sentiments (1:22:41), as well as image quality and other attributes.
You can actually try Amazon Rekognition out (1:23:24) by uploading photos at CodeFor.Cloud/image.
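For developers, the same capabilities are a couple of boto3 calls away. A minimal sketch, assuming a local photo.jpg file:

```python
import boto3

rekognition = boto3.client("rekognition")

with open("photo.jpg", "rb") as image_file:  # hypothetical image
    image_bytes = image_file.read()

# Detect objects and scenes ("labels") in the image.
labels = rekognition.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=10)
for label in labels["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))

# Detect faces with full attributes, including emotions and image quality.
faces = rekognition.detect_faces(Image={"Bytes": image_bytes}, Attributes=["ALL"])
for face in faces["FaceDetails"]:
    print(face["Emotions"][0]["Type"])
```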
Retrospective and Real-Time Data Analytics
AI is a combination of three types of data analytics (1:28:10): retrospective analysis and reporting, real-time processing, and predictions that enable smart apps.
Cloud computing is extremely useful for machine learning (1:29:57) because it allows you to decouple storage and compute requirements for much lower costs.
Amazon Athena (1:31:56) allows you to query data stored in Amazon S3, without creating a compute instance to do it. You pay only for the terabytes of data scanned by each query.
Best of all, you will get the same fast results even if your data set grows (1:32:31), because Amazon Athena will automatically parallelise your queries across your data set internally.
Olivier demonstrates (1:33:14) how Amazon Athena can be used to run queries on data stored in Amazon S3, as well as generate reports using Amazon QuickSight.
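Here is a minimal sketch of that serverless query pattern with boto3. The database, table and S3 bucket names are hypothetical:

```python
import time
import boto3

athena = boto3.client("athena")

# Start a query directly against data stored in S3; no cluster to provision.
query = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    QueryExecutionContext={"Database": "weblogs"},  # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = query["QueryExecutionId"]

# Poll until the query finishes, then fetch the result rows.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    print(rows)
```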
When it comes to data analytics, cloud computing allows you to quickly bring massive computing power to bear, achieving much faster results without additional cost (1:41:40).
The insurance company AON used this ability (1:42:44) to reduce an actuarial simulation that would normally take 10 days, to just 10 minutes.
Amazon Kinesis and Amazon Kinesis Analytics (1:45:10) allow the processing of real-time data.
A company called Dash is using this capability to analyse OBD (on-board diagnostics) data in real time (1:47:23) to help improve fuel efficiency and predict potential breakdowns. It also notifies emergency services in case of a crash.
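Feeding such a stream is straightforward. A minimal sketch of pushing simulated OBD readings into Kinesis with boto3 (the stream name and fields are hypothetical, not Dash's actual schema):

```python
import json
import random
import time
import boto3

kinesis = boto3.client("kinesis")

# Push simulated on-board diagnostics readings into a hypothetical stream.
for _ in range(10):
    reading = {
        "vehicle_id": "demo-car-42",
        "speed_kmh": random.randint(0, 120),
        "fuel_rate_l_per_100km": round(random.uniform(4.0, 12.0), 1),
        "timestamp": int(time.time()),
    }
    kinesis.put_record(
        StreamName="vehicle-telemetry",
        Data=json.dumps(reading),
        PartitionKey=reading["vehicle_id"],  # keeps one vehicle's records ordered
    )
```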