The name of the base system (platform): | Nvidia Volta |
Developers: | Nvidia |
Last Release Date: | November 2017 |
Technology: | Processors, Data Centers - Data Center Technologies |
NVIDIA Tesla GPUs are massively parallel accelerators based on the NVIDIA CUDA parallel computing platform. Tesla GPUs are specifically designed for cost-effective, high-performance computing, computational science, and supercomputing, delivering much faster speeds for a wide range of scientific and commercial applications compared to a CPU-based system.
CUDA is NVIDIA's parallel computing platform and programming model, which significantly accelerates resource-intensive calculations by running them on GPUs. The CUDA programming model, downloaded more than 1.7 million times and supported by over 220 leading engineering, scientific and commercial applications, is the most common way to use GPU acceleration in application development.
2017: Nvidia Tesla V100
The Tesla V100 is a data center GPU designed to accelerate artificial intelligence, HPC and graphics. Based on the Nvidia Volta GPU architecture, the Tesla V100 offers the performance of 100 CPUs in a single GPU, according to Nvidia, giving scientists, researchers and engineers the ability to tackle previously unsolved problems.
Training artificial intelligence algorithms
Scientists are taking on increasingly complex tasks, ranging from speech recognition and training virtual assistants to detecting road markings and teaching self-driving cars to drive. Solving such problems requires training exponentially more complex neural network models in a short time.
Equipped with 640 Tensor cores, the Tesla V100 is the first accelerator to break the barrier of 100 teraoperations per second (TOPS) in deep learning tasks. The second generation of NVIDIA NVLink™ technology connects several V100 graphics accelerators at up to 300 GB/s, making it possible to build the most powerful computing servers. Models that took weeks to train on previous-generation systems can now be trained in just a few days. With training time reduced this sharply, artificial intelligence can be applied to entirely new problems.
Inference
Companies have begun to use artificial intelligence to give us access to up-to-date information, services and products. However, meeting user demand is a difficult task. For example, the largest companies with hyperscale infrastructure estimate that they would have to double the capacity of their data centers if every user used their speech recognition services for only three minutes a day.
The Tesla V100 accelerator is designed to provide maximum performance in existing hyperscale data centers. One server equipped with Tesla V100 GPUs and drawing 13 kW of power provides the same performance in inference tasks as 30 CPU servers. Such a jump in performance and energy efficiency supports wider adoption of AI-based services.
High-performance computing
HPC is a fundamental pillar of modern science. From weather forecasting and the creation of new drugs to finding energy sources, scientists constantly use large computing systems to model our world and predict events in it. Artificial intelligence expands the capabilities of HPC, allowing scientists to analyze large amounts of data and extract useful information where simulations alone cannot provide a complete picture of what is happening.
The Tesla V100 graphics accelerator is designed to enable the fusion of HPC and artificial intelligence. This is a solution for HPC systems that will perform well both in computing for simulations and processing data for extracting useful information from them. By combining CUDA and Tensor cores in one architecture, a server equipped with Tesla V100 graphics accelerators can replace hundreds of traditional CPU servers, performing traditional HPC and artificial intelligence tasks. Now every scientist can afford a supercomputer to help solve the most difficult problems.
Nvidia Tesla V100 specifications
2016: Nvidia Tesla P100
On June 20, 2016, Nvidia introduced a graphics accelerator for scalable data centers, the Nvidia Tesla P100. The Nvidia Tesla Accelerated Computing Platform helps build a class of servers that perform at the level of several hundred classic CPU servers[1].
Data centers, which are vast network infrastructures with numerous interconnected CPU servers, handle a huge number of transactions, but their capacity is not enough for scientific applications and artificial intelligence tasks, which require more efficient and faster server nodes. The Tesla P100 accelerator, based on the Nvidia Pascal architecture and five advanced technologies, provides high performance and cost-effectiveness for the most demanding applications, according to the company.
"Artificial intelligence and cognition require a completely new approach and a new level of computing. Nvidia GPUs, together with OpenPower technology, are already accelerating Watson's learning of new skills. The combination of IBM's Power architecture and Nvidia's Pascal architecture with the NVLink interface will further accelerate the study of cognitive processes, speeding up the development of artificial intelligence," said Dr. John Kelly III, Senior Vice President, Cognitive Solutions and IBM Research.
The Tesla P100 is Nvidia's first accelerator to deliver 5 teraflops of double-precision and 10 teraflops of single-precision performance. According to Nvidia, the Pascal-based Tesla P100 trains neural networks 12 times faster than solutions based on the Nvidia Maxwell architecture.
The Pascal processor has 15.3 billion transistors and is built on a 16 nm FinFET process. It is designed to provide the performance and energy efficiency required by workloads with virtually unlimited computing requirements.
Deep learning presentation (2016)
Nvidia has announced a number of updates to its GPU development platform, the Nvidia SDK. The updates include Nvidia CUDA 8. This version of Nvidia's parallel computing platform gives developers direct access to Pascal's new capabilities, including unified memory and NVLink. In addition, the release includes the nvGRAPH graph analytics library, which can be used for path planning, information security and logistics analysis, extending GPU-accelerated computing to Big Data analytics.
Nvidia Tesla P100 graphics accelerators on the Pascal platform will first appear as part of the Nvidia DGX-1 deep learning system in June 2016. The processor is expected to appear in servers in early 2017.
2014: Nvidia Tesla K80
In November 2014, NVIDIA introduced a new addition to the NVIDIA Tesla accelerated computing platform: the Tesla K80 dual-GPU accelerator, designed for a wide range of applications, including machine learning, data analysis, scientific computing and high-performance computing (HPC).
The Tesla K80 dual-GPU accelerator is the flagship of the Tesla accelerated computing platform, a platform for analyzing information and accelerating scientific research. The platform combines GPU accelerators, the widely used CUDA parallel programming model, and an extensive ecosystem of application developers, application vendors and data center solution providers.
The Tesla K80 dual-GPU accelerator has almost twice the performance and twice the memory bandwidth of its predecessor, the Tesla K40. According to NVIDIA, the new accelerator is ten times faster than the most powerful CPU of its time, outperforming CPUs and competing accelerators in hundreds of computationally heavy applications for data analysis and scientific computing.
Users can unlock the potential of a wide range of applications with a new version of NVIDIA GPU Boost technology, which adjusts clock frequencies dynamically to improve the performance of each specific application.
The Tesla K80 dual-GPU accelerator is designed for computational tasks in areas such as astrophysics, genomics, quantum chemistry and data analysis. It is also optimized for advanced deep learning tasks, one of the fastest-growing areas of machine learning.
At launch, the Tesla K80 surpassed all other accelerators in computation speed: up to 8.74 teraflops for single-precision floating point calculations and 2.91 teraflops for double precision. The Tesla K80 is ten times faster than the fastest CPUs in leading science and engineering applications such as AMBER, GROMACS, Quantum Espresso and LSMS.
Key features of the Tesla K80 dual-GPU accelerator:
- Two GPUs on board - twice the throughput in applications that take advantage of multiple GPUs.
- 24 GB of ultra-fast GDDR5 memory (12 GB per GPU) - double that of the Tesla K40, allowing data sets twice as large to be processed.
- 480 GB/s memory bandwidth - increased bandwidth allows scientists to process petabytes of information twice as fast as with the Tesla K10. Optimized for energy exploration, video and image processing, and data analysis.
- 4992 parallel CUDA® cores - speed up applications by up to 10 times compared with a CPU.
- NVIDIA GPU Boost technology - dynamically changes GPU frequencies depending on the application's workload for maximum performance.
- Dynamic Parallelism - allows GPU threads to dynamically spawn new threads for fast and easy processing of adaptive and dynamic data structures.
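The K80 figures quoted above (8.74 and 2.91 teraflops peak, 480 GB/s memory bandwidth) can be combined into a rough estimate of how many floating-point operations a kernel must perform per byte of memory traffic before it is limited by compute rather than by memory bandwidth. The Python sketch below is illustrative only: the function name `ridge_point` is hypothetical, and the calculation assumes that peak compute and peak bandwidth can be reached at the same time, which real applications rarely achieve.

```python
# Peak figures for the Tesla K80 as quoted in this article (whole board, both GPUs).
K80_PEAK_GFLOPS = {"single": 8740.0, "double": 2910.0}
K80_BANDWIDTH_GBS = 480.0  # memory bandwidth in GB/s


def ridge_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Return the FLOPs-per-byte ratio at which peak compute and peak bandwidth balance.

    Kernels with a lower ratio are bandwidth-bound under this simple model;
    kernels with a higher ratio are compute-bound.
    """
    if bandwidth_gbs <= 0:
        raise ValueError("bandwidth must be positive")
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    for precision, peak in K80_PEAK_GFLOPS.items():
        ratio = ridge_point(peak, K80_BANDWIDTH_GBS)
        print(f"{precision} precision: {ratio:.1f} FLOP/byte")
```

Under these assumptions the balance point is roughly 18 FLOP/byte in single precision and 6 FLOP/byte in double precision, which is one way to see why the list above stresses memory bandwidth as much as core count.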
2013
Nvidia Tesla K20X
Double-precision performance
- 1.31 Tflops on the Tesla K20X
- Higher double-precision performance than consumer solutions
Faster PCIe communication
- The only NVIDIA product with two DMA engines for bidirectional communication over PCIe
High performance in technical applications when working with large data sets
- More on-board memory (6 GB on the K20X and 8 GB on the Tesla K10)
Faster communication with InfiniBand using NVIDIA GPUDirect
- Special Linux patch, InfiniBand driver and CUDA driver
High-performance CUDA driver for Windows
- The TCC driver reduces CUDA kernel overhead and supports Windows Remote Desktop and Windows Services
Tesla GPU accelerators make it possible to use GPUs and CPUs together in a single server node or blade system.
How to choose a Tesla graphics card
Features | Tesla K20X | Tesla K20 | Tesla K10 | Tesla M2090 | Tesla M2075 |
Number and type of GPUs | 1 Kepler GK110 | 1 Kepler GK110 | 2 Kepler GK104s | 1 Fermi GPU | 1 Fermi GPU |
GPU computing applications | Seismic data processing, computational fluid dynamics, computer modeling, financial computing, computational chemistry and physics, data analysis, satellite imaging, weather modeling | Seismic data processing, computational fluid dynamics, computer modeling, financial computing, computational chemistry and physics, data analysis, satellite imaging, weather modeling | Seismic data processing, signal and image processing, video analytics | Seismic data processing, computational fluid dynamics, computer modeling, financial computing, computational chemistry and physics, data analysis, seismic constructions, weather modeling | Seismic data processing, computational fluid dynamics, computer modeling, financial computing, computational chemistry and physics, data analysis, seismic constructions, weather modeling |
Peak double-precision floating point performance | 1.31 Tflops | 1.17 Tflops | 190 Gflops (95 Gflops per GPU) | 665 Gflops | 515 Gflops |
Peak single-precision floating point performance | 3.95 Tflops | 3.52 Tflops | 4577 Gflops (2288 Gflops per GPU) | 1331 Gflops | 1030 Gflops |
Memory bandwidth (ECC off) | 250 GB/s | 208 GB/s | 320 GB/s (160 GB/s per GPU) | 177 GB/s | 150 GB/s |
Memory size (GDDR5) | 6 GB | 5 GB | 8 GB (4 GB per GPU) | 6 GB | 6 GB |
CUDA cores | 2688 | 2496 | 3072 (1536 per GPU) | 512 | 448 |
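The peak figures in the table are related to the core counts by simple arithmetic. The Python sketch below derives two quantities from the table values only: the core clock implied by the single-precision peak, and the ratio of single-precision to double-precision throughput. It rests on one assumption that is not stated in the table, namely that each CUDA core completes one fused multiply-add (counted as 2 floating-point operations) per cycle at peak. The function names are hypothetical.

```python
# Per-GPU values copied from the comparison table:
# (peak single-precision Gflops, peak double-precision Gflops, CUDA cores).
# For the dual-GPU Tesla K10 the per-GPU figures are used.
TABLE = {
    "Tesla K20X": (3950.0, 1310.0, 2688),
    "Tesla K20": (3520.0, 1170.0, 2496),
    "Tesla K10 (per GPU)": (2288.0, 95.0, 1536),
    "Tesla M2090": (1331.0, 665.0, 512),
    "Tesla M2075": (1030.0, 515.0, 448),
}

# Assumption (not from the table): one fused multiply-add per core per cycle.
FLOPS_PER_CORE_PER_CYCLE = 2


def implied_clock_mhz(sp_gflops: float, cores: int) -> float:
    """Clock in MHz that would produce the given single-precision peak."""
    return sp_gflops * 1000.0 / (FLOPS_PER_CORE_PER_CYCLE * cores)


def sp_to_dp_ratio(sp_gflops: float, dp_gflops: float) -> float:
    """How many times faster single precision is than double precision at peak."""
    return sp_gflops / dp_gflops


if __name__ == "__main__":
    for name, (sp, dp, cores) in TABLE.items():
        clock = implied_clock_mhz(sp, cores)
        ratio = sp_to_dp_ratio(sp, dp)
        print(f"{name}: ~{clock:.0f} MHz implied, SP:DP ~ {ratio:.1f}:1")
```

Under that assumption the implied clocks come out at roughly 735, 705, 745, 1300 and 1150 MHz. The ratios show why the K10 is listed for signal, image and video workloads rather than double-precision simulation: its single-precision peak is about 24 times its double-precision peak, compared with about 3 times on the GK110 cards and 2 times on the Fermi cards.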
Notes
- [1] http://corp.cnews.ru/news/line/2016-06-20_nvidia_tesla_p100_uskoryaet_prilozheniya_glubokogo the Nvidia Tesla P100