When it first launched last year, the original Tesla V100 shipped with 16GB of HBM2 memory. Now, a little less than a year into its lifetime, NVIDIA is announcing that their workhorse server accelerator is getting a memory capacity bump to 32GB, effective immediately.
For the last couple of years now, NVIDIA has been relying on 4GB (4-Hi) HBM2 memory stacks for their Tesla P100 and Tesla V100 products, as these were the first HBM2 stacks to be ready in reasonable commercial volumes. Now that Samsung and SK Hynix have a better grip on HBM2 manufacturing, 8GB (8-Hi) HBM2 stacks are far more readily available and reliable. As a result, the conditions are right for NVIDIA to finally give their Tesla cards a long-awaited memory upgrade.
This upgrade will be across the entire Tesla V100 family – all SXM2 and PCIe SKUs are getting their memory doubled. The Tesla V100’s specifications otherwise remain identical, with the same GPU and memory clocks along with the same TDPs. We’re also told that the mechanical specifications are identical, meaning the taller 8-Hi stacks shouldn’t introduce any cooling problems stemming from a change in memory stack height.
NVIDIA Tesla/Titan Family Specification Comparison

| | Tesla V100 (SXM2) | Tesla V100 (PCIe) | Titan V (PCIe) | Tesla P100 (SXM2) |
|---|---|---|---|---|
| CUDA Cores | 5120 | 5120 | 5120 | 3584 |
| Tensor Cores | 640 | 640 | 640 | N/A |
| Core Clock | ? | ? | 1200MHz | 1328MHz |
| Boost Clock | 1455MHz | 1370MHz | 1455MHz | 1480MHz |
| Memory Clock | 1.75Gbps HBM2 | 1.75Gbps HBM2 | 1.7Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 4096-bit | 4096-bit | 3072-bit | 4096-bit |
| Memory Bandwidth | 900GB/sec | 900GB/sec | 653GB/sec | 720GB/sec |
| VRAM | 32GB | 32GB | 12GB | 16GB |
| L2 Cache | 6MB | 6MB | 4.5MB | 4MB |
| Half Precision | 30 TFLOPS | 28 TFLOPS | 27.6 TFLOPS | 21.2 TFLOPS |
| Single Precision | 15 TFLOPS | 14 TFLOPS | 13.8 TFLOPS | 10.6 TFLOPS |
| Double Precision | 7.5 TFLOPS | 7 TFLOPS | 6.9 TFLOPS | 5.3 TFLOPS |
| Tensor Performance (Deep Learning) | 120 TFLOPS | 112 TFLOPS | 110 TFLOPS | N/A |
| GPU | GV100 | GV100 | GV100 | GP100 |
| Transistor Count | 21B | 21B | 21.1B | 15.3B |
| TDP | 300W | 250W | 250W | 300W |
| Form Factor | Mezzanine (SXM2) | PCIe | PCIe | Mezzanine (SXM2) |
| Cooling | Passive | Passive | Active | Passive |
| Manufacturing Process | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 16nm FinFET |
| Architecture | Volta | Volta | Volta | Pascal |
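As a quick sanity check on the numbers above – this is just back-of-the-envelope arithmetic, not anything from NVIDIA – the capacity and bandwidth figures fall straight out of the HBM2 configuration: GV100’s 4096-bit bus comes from four 1024-bit HBM2 stacks, so swapping 4GB (4-Hi) stacks for 8GB (8-Hi) stacks takes total capacity from 4 × 4GB = 16GB to 4 × 8GB = 32GB, while bandwidth is simply the per-pin data rate multiplied by the bus width.

```python
# Back-of-the-envelope HBM2 arithmetic for the cards in the table above.

def capacity_gb(stacks: int, gb_per_stack: int) -> int:
    # Total VRAM is just the number of stacks times per-stack capacity.
    return stacks * gb_per_stack

def bandwidth_gb_per_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    # Per-pin data rate (Gbps) times bus width (bits), divided by 8 bits per byte.
    return data_rate_gbps * bus_width_bits / 8

print(capacity_gb(4, 4), capacity_gb(4, 8))   # 16 -> 32 GB with 8-Hi stacks
print(bandwidth_gb_per_s(1.75, 4096))         # 896.0, quoted as ~900GB/sec (Tesla V100)
print(bandwidth_gb_per_s(1.7, 3072))          # 652.8 GB/sec (Titan V, three stacks)
print(bandwidth_gb_per_s(1.4, 4096))          # 716.8, quoted as ~720GB/sec (Tesla P100)
```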
It should be noted that this upgrade is a wholesale replacement of the existing 16GB versions in NVIDIA’s product stack; NVIDIA won’t be retaining the 16GB versions now that the 32GB versions are out. All new cards sold by NVIDIA going forward will be 32GB cards, and OEMs will make the same transition in Q2 as their 16GB stocks are depleted and replaced with 32GB cards.
NVIDIA’s Tesla V100-equipped systems will also be getting the same upgrade. Both the DGX-1 server and DGX Station will now ship with the 32GB cards. And while OEMs are a bit farther behind, ultimately I expect they’ll make the same move since the 16GB cards are going away anyhow. Meanwhile, NVIDIA isn’t officially talking about pricing here for either the new V100 cards or the updated DGX systems. However, as the new parts are essentially drop-in replacements within their respective ongoing lines, there aren’t any signs right now that pricing is changing.
As far as workloads go, since the Tesla V100’s specifications aren’t changing outside of memory capacity, the benefit of this upgrade is almost entirely rooted in dataset size. Workloads with datasets that couldn’t previously fit entirely within a Tesla V100 card should see a sizable boost, as will any applications that can benefit from more local memory for caching purposes. However, in purely compute-bound scenarios, the new cards shouldn’t perform any differently than the outgoing models.
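To put the dataset-size argument in concrete terms, here’s a minimal, purely illustrative sketch – the sample counts and the ~4GB overhead figure are hypothetical, not NVIDIA numbers – of the sort of footprint estimate where the extra 16GB matters: a dense float32 working set that spills out of a 16GB card but fits within 32GB.

```python
# Hypothetical working-set estimate: does a dense float32 dataset fit in GPU memory?
BYTES_PER_FLOAT32 = 4

def working_set_gb(num_samples: int, features_per_sample: int) -> float:
    # Raw size of a dense float32 array, in gigabytes.
    return num_samples * features_per_sample * BYTES_PER_FLOAT32 / 1e9

# Illustrative example: 1.2M samples of 4096 float32 features (~19.7GB),
# plus an assumed ~4GB for model weights, activations, and workspace.
total_gb = working_set_gb(1_200_000, 4096) + 4.0

for capacity_gb in (16, 32):
    verdict = "fits within" if total_gb <= capacity_gb else "spills out of"
    print(f"~{total_gb:.1f}GB working set {verdict} a {capacity_gb}GB Tesla V100")
```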
Meanwhile, as for NVIDIA's workstation-oriented Titan V, nothing is being announced at this time. The 12GB card could easily be bumped to 24GB using 8-Hi stacks; however, it's not clear whether NVIDIA is in any hurry to do so. In practice it will almost certainly come down to whether NVIDIA wants to keep GV100 GPU production to a single line for efficiency reasons, or whether they'll operate two lines as part of a broader product segmentation strategy.
Lastly, this upgrade means that NVIDIA has finally caught up with arch-rival AMD in total memory capacity. AMD has been shipping the 32GB, GDDR5-based FirePro S9170 for the better part of the last three years now, a mark that NVIDIA couldn’t match until 8-Hi HBM2 was ready. The Tesla V100 has always been significantly more powerful than the S9170 regardless, but this finally brings NVIDIA to parity in the one area where they were trailing AMD in the server space.