
Reading the Third-Party Reviews for NVIDIA’s Tesla V100 GPU


Oct. 3 2017, Updated 9:05 a.m. ET

NVIDIA’s Tesla V100 GPU at a glance

NVIDIA’s (NVDA) Pascal-based Tesla P100 GPU (graphics processing unit) powers most DLT (deep learning training) workloads around the world. Now, the company is targeting inferencing, in which a trained model applies what it has learned to act on real-world data.

NVIDIA has launched Volta, its next-generation GPU architecture, on the Tesla platform used in data center applications. The Tesla V100 GPU is priced at $150,000 and built on TSMC’s (TSM) 12 nm (nanometer) process, featuring 5,120 CUDA cores, HBM2 (high-bandwidth memory), NVLink 2.0, and tensor cores for deep learning.


V100’s inferencing capability

Until now, almost all inferencing has happened on CPUs (central processing units). NVIDIA is now offering TensorRT 3, an inference optimizer and runtime that delivers up to 100 times faster inferencing on the V100. The optimizer supports the industry’s most widely used AI (artificial intelligence) frameworks, including Google’s (GOOG) TensorFlow and Facebook’s (FB) Caffe2.
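The optimize-then-deploy flow that inference tools of this kind follow can be sketched in outline. The function names below are hypothetical placeholders for illustration only, not the actual TensorRT API:

```python
# Hypothetical sketch of an inference-optimizer workflow.
# All names here are illustrative placeholders, not real TensorRT calls.

def optimize_for_inference(trained_model, precision="fp16"):
    """Stand-in for the offline step: fuse layers and pick fast
    kernels for the target GPU, producing a deployable engine."""
    return {"model": trained_model, "precision": precision, "fused": True}

def run_inference(engine, batch):
    """Stand-in for the runtime step: execute the optimized engine
    on a batch of inputs. A real runtime would launch GPU kernels;
    here we just return one placeholder result per input."""
    return [f"prediction for {item}" for item in batch]

# Train once, optimize once, then serve many requests.
engine = optimize_for_inference("resnet50_trained", precision="fp16")
outputs = run_inference(engine, ["image_0", "image_1"])
print(len(outputs))  # 2
```

The key design point the sketch illustrates is the split between a one-time offline optimization pass and a lightweight runtime, which is what lets the deployed engine run faster than the original training framework.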

GPUs, however, are not widely viewed as an ideal solution for all inferencing. Addressing this concern, NVIDIA’s CFO (chief financial officer) Colette Kress acknowledged that GPUs may not suit every type of inferencing but said the company believes its GPUs are ideal for the more complex types.

Pascal versus Volta

Wccftech.com compared the Volta GPU’s performance with that of its predecessor, Pascal, and found that NVIDIA has delivered a significant generational leap in performance with Volta. Below are a few details:

  • Wccftech noted that the V100 delivers 120 TFLOPS (tera floating-point operations per second) of deep learning (tensor) performance, compared with the P100’s roughly 10 TFLOPS of standard FP32 performance. Volta provides 12 times the DLT power and six times the inferencing power of the P100.
  • NVIDIA has also increased memory bandwidth from 720 GB/s (gigabytes per second) to 900 GB/s.
  • NVIDIA has increased L1 cache by almost eight times, from 1.3 MB (megabytes) to 10 MB.
  • The NVLink 2.0 interconnect almost doubles internal bandwidth, from 160 GB/s to 300 GB/s.
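The generational gains implied by the figures above reduce to simple ratios, which can be checked directly. The numbers below are the ones quoted in the list; the calculation itself is just division:

```python
# Generational Pascal -> Volta ratios implied by the quoted specs.
# Each entry is (Pascal value, Volta value) in the stated unit.
specs = {
    "peak TFLOPS (FP32 vs. tensor ops)": (10, 120),
    "memory bandwidth (GB/s)": (720, 900),
    "L1 cache (MB)": (1.3, 10),
    "NVLink bandwidth (GB/s)": (160, 300),
}

for name, (pascal, volta) in specs.items():
    print(f"{name}: {volta / pascal:.2f}x")
```

The output shows, for example, a 1.25x bandwidth gain (900/720) and a 1.88x NVLink gain (300/160), consistent with the “almost doubles” description above.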

Wccftech.com reported that the Tesla V100 was tested against the Tesla P100 on Geekbench 4 single-core compute tests, which showed the V100 to be 132% faster than the P100 (i.e., its score was 2.32 times the P100’s).

Volta-based DGX-1 system

NVIDIA has also launched its Volta-based DGX-1 supercomputer, featuring eight Tesla V100 GPUs, for a total of 40,960 CUDA cores and 5,120 tensor cores with 128 GB of HBM2 memory. The system also includes two of Intel’s (INTC) 20-core, 40-thread Xeon E5-2698 v4 processors with a 2.2 GHz (gigahertz) clock speed.

When the Volta-based DGX-1 was tested against the Pascal-based DGX-1 on Geekbench 4, it was found that the addition of tensor cores in Volta boosted FP16 computing performance from 170 TFLOPS to 960 TFLOPS.
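The system-level gain from those FP16 figures works out to roughly a 5.6x speedup, a quick calculation worth making explicit:

```python
# FP16 speedup implied by the DGX-1 Geekbench figures quoted above.
pascal_dgx1_fp16_tflops = 170  # Pascal-based DGX-1
volta_dgx1_fp16_tflops = 960   # Volta-based DGX-1 (with tensor cores)

speedup = volta_dgx1_fp16_tflops / pascal_dgx1_fp16_tflops
print(f"{speedup:.1f}x")  # 5.6x
```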



© Copyright 2021 Market Realist. Market Realist is a registered trademark. All Rights Reserved. People may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.