In the previous part of this series, we saw that NVIDIA (NVDA) is expanding its AI solutions into vertical industries. The company is also broadening the range of its AI solutions. NVIDIA’s GPUs (graphics processing units) are used largely for deep learning training, and their use in inference is gradually increasing.
During the fiscal 2019 second-quarter earnings call, Jensen Huang, NVIDIA’s CEO, explained that inference is a complex technology. Inference involves optimizing a trained neural network so it can generate predictions from new data efficiently. He stated that there are various types of neural networks, such as the CNN (convolutional neural network), the autoencoder, the RNN (recurrent neural network), and the LSTM (long short-term memory) unit. Compiling output from these different neural networks is a complex computational problem.
NVIDIA’s new TensorRT 4
In the second quarter of fiscal 2019, NVIDIA launched TensorRT 4, the fourth generation of its neural network optimizing compiler, which goes beyond image and video optimization to voice optimization. TensorRT 4 optimizes voice recognition, natural language understanding, recommendation systems, and translation, enabling it to handle a larger portion of deep learning inference workloads.
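One concrete way inference optimizers such as TensorRT speed up trained networks is by reducing numerical precision, for example converting 32-bit floating-point weights to 8-bit integers. The sketch below is a minimal, illustrative Python version of symmetric INT8 quantization; the function names and sample weights are our own assumptions, not NVIDIA's API.

```python
# Illustrative sketch of symmetric INT8 weight quantization, a
# precision-reduction technique used by inference optimizers such as
# TensorRT. Function names and sample values are hypothetical.

def quantize_int8(weights):
    """Map float weights to signed 8-bit integers plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0  # largest value maps to +/-127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT8 representation."""
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.05, 0.4064]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Every quantized value fits in a signed byte, and the round trip
# preserves each weight to within half a quantization step.
assert all(-128 <= v <= 127 for v in q)
assert max(abs(a - b) for a, b in zip(weights, approx)) <= scale / 2
```

Storing and multiplying 8-bit integers instead of 32-bit floats shrinks memory traffic and lets the hardware process more values per cycle, which is part of why dedicated inference accelerators outrun general-purpose CPUs.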
Early in fiscal 2019, NVIDIA successfully deployed its Tesla P4 inference accelerator in some hyperscale data centers. NVIDIA also integrated TensorRT into Google’s (GOOG) TensorFlow deep learning framework, and Google made the Tesla P4 GPU available on the Google Cloud Platform.
The Tesla P4 GPU is based on the Pascal architecture and delivers inference performance a few hundred times that of a CPU (central processing unit). Huang stated that NVIDIA’s next-generation Turing GPU delivers ten times the inference performance of Pascal.
NVIDIA is exploring the possibilities of AI, which could drive strong growth in the coming years. Next, we’ll discuss NVIDIA’s Automotive segment, another major market for AI.