Project

Softline helped launch the Russian-Tatar neural machine translator of the Institute of Applied Semiotics of the Tatarstan Academy of Sciences

Customers: Institute of Applied Semiotics of the Tatarstan Academy of Sciences

Kazan; Science and education

Contractors: Softline
Product: Nvidia DGX Supercomputers

Project date: 2019/11 - 2020/03

2020: Launch of the Russian-Tatar neural machine translator

On March 25, 2020, Softline reported that it had helped launch the Russian-Tatar neural machine translator of the Institute of Applied Semiotics of the Tatarstan Academy of Sciences.

According to the company, the Institute of Applied Semiotics of the Tatarstan Academy of Sciences studies a wide range of questions related to the use of artificial intelligence technology. To promote, preserve and develop the Tatar language and culture, the organization develops and releases a broad range of software products, including a Tatar speech synthesizer, applications for mobile devices, an electronic corpus of the Tatar language, a socio-political thesaurus and an electronic atlas of folk dialects. Larger-scale projects and tasks, such as machine translation and speech synthesis and analysis systems based on artificial neural networks, required substantial computing resources that could process large volumes of data efficiently and deliver results quickly.

The NVIDIA DGX-1 supercomputer for artificial intelligence, proposed by Softline, proved to be the best solution to this problem. This hardware and software system can considerably shorten the timelines of artificial intelligence projects. Because it ships with NVIDIA's ready-to-use software stack for deep learning, the customer can start working with deep learning algorithms within a single day, without spending time on integrating and configuring the necessary infrastructure.

Using the NVIDIA DGX-1 system, scientists of the Institute of Applied Semiotics of the Tatarstan Academy of Sciences, with the assistance of machine learning specialists from Innopolis University and in partnership with JSC SMP-Neftegaz, developed and launched the public service translate.tatar, intended for machine translation from Russian into Tatar and vice versa. The approach is based on an encoder-decoder neural network architecture with attention. The system is under continuous development. To improve it, models based on the Transformer architecture were recently built, and algorithms for incorporating language models into the neural network were applied. For the first time for the Russian-Tatar pair, experiments were conducted on using parallel data from other languages for knowledge transfer (transfer learning).

The main training data was a parallel corpus of 983,319 Russian-Tatar sentence pairs created at the institute, comprising news texts, literature, and translations of laws and regulations.

"The portal can translate texts in the Russian-Tatar and Tatar-Russian directions, read the translation results aloud in both languages, and let users rate the quality of the translation. In addition, the service has a bilingual interface, thanks to which its circle of users keeps growing," said Rinat Gilmullin, associate director of the Institute of Applied Semiotics of the Tatarstan Academy of Sciences.

"Having received the initial request from the Institute of Applied Semiotics of the Tatarstan Academy of Sciences, we first had to define the range of tasks that could be solved with the equipment and software offered by Softline. Once we found out that the scientists would be working on neural networks for machine translation, we estimated the scale of the work and suggested that the institute's management consider NVIDIA DGX-1, a hardware and software system for high-performance computing and accelerated neural network training. We brought in NVIDIA specialists, who interviewed the customer's representatives and confirmed that the tasks facing the institute could be solved successfully with DGX-1. When selecting the equipment, the scientists took into account the possibility of expanding the system's computing power in the future. If necessary, the institute can scale the system's computing power horizontally by assembling a cluster of DGX-1 modules connected by an InfiniBand interconnect. The choice of solution was also influenced by NVIDIA's repository of optimized software, NVIDIA GPU Cloud: a large library of frameworks and ready-made neural network models optimized for GPUs and delivered as Docker containers," said Egor Dyomin, hardware solutions sales manager at Softline.