Nvidia Triton Inference Server

Product
Base system (platform): Artificial intelligence (AI)
Developers: Nvidia
Last Release Date: November 2021
Industries: Electrical engineering and microelectronics

The Nvidia Triton Inference Server (formerly TensorRT Inference Server) is open-source software for deploying deep learning models in production. Triton lets teams deploy trained AI models from local storage (TensorFlow, PyTorch, TensorRT Plan, Caffe, MXNet, or custom formats), Google Cloud Platform, or AWS S3 on any GPU- or CPU-based infrastructure. The server runs multiple models concurrently on a single GPU to increase utilization, and integrates with Kubernetes for orchestration, metrics, and automatic scaling.
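Deployed models are served over Triton's standard HTTP/REST and gRPC endpoints (ports 8000 and 8001 by default). Below is a minimal client sketch using the tritonclient Python package; the model name "simple_model" and its tensor names, shape, and data type are placeholders that must match whatever the deployed model actually declares.

    import numpy as np
    import tritonclient.http as httpclient

    # Connect to a Triton server assumed to be listening on the default HTTP port.
    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Prepare an input tensor; name, shape, and dtype are placeholders that
    # must match the deployed model's configuration.
    data = np.random.rand(1, 3, 224, 224).astype(np.float32)
    infer_input = httpclient.InferInput("INPUT__0", list(data.shape), "FP32")
    infer_input.set_data_from_numpy(data)

    # Request a named output and run inference.
    response = client.infer(
        model_name="simple_model",
        inputs=[infer_input],
        outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],
    )
    print(response.as_numpy("OUTPUT__0"))

The gRPC client in tritonclient.grpc follows the same call structure.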

2021: Multi-GPU, Multi-Node Support

At its GTC conference in November 2021, Nvidia introduced an update to the Triton Inference Server. It now supports multiple GPUs and multiple nodes, making it possible to distribute inference workloads for large language models (LLMs) across many GPUs and nodes in real time. Such models require more memory than is available on a single GPU, or even on a large server with multiple GPUs, and their inference must run quickly.
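Within a single node, Triton controls which GPUs host a model through the instance_group setting in the model's config.pbtxt file. The sketch below illustrates only that general placement mechanism; the model name, platform, and counts are placeholders, and the multi-node distribution of a single large model announced at GTC is a separate capability layered on top of this.

    name: "my_model"            # placeholder model name
    platform: "tensorrt_plan"
    max_batch_size: 8
    instance_group [
      {
        count: 1                # one copy of the model on GPU 0
        kind: KIND_GPU
        gpus: [ 0 ]
      },
      {
        count: 2                # two copies on each of GPUs 1 and 2
        kind: KIND_GPU
        gpus: [ 1, 2 ]
      }
    ]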

Megatron 530B was also introduced: a customizable large language model that can be trained for new subject areas and new languages. With the Triton Inference Server, Megatron 530B can run on two Nvidia DGX systems, cutting processing time from a minute on a CPU server to half a second. This makes it possible to deploy LLMs for real-time applications.
