
Sber: A tool for checking spelling in texts using AI

Product
Base system (platform): Artificial intelligence (AI)
Developers: SaluteDevices (formerly SberDevices)
Release date: 2023/10/06
Technology: Office applications

2023: Presentation of the text verification and proofreading service

Businesses now have access to a Sberbank service for checking and correcting texts using artificial intelligence. SberDevices announced this on October 6, 2023.

The AI service is a spell-checking tool for Russian-language texts, built on a generative neural network model. Businesses can use it to correct text of any length and format: in copywriting and editing, in the creation of marketing and advertising materials, and in media newsrooms. The service was developed by SberDevices and is available to registered users in the AI Services catalog on the ML Space platform.

"AI-based models offer more and more possibilities for text editing. With the presented solution, you can process any text by rewriting it without errors, and use the generative capabilities of the models to correct spelling in texts from various domains. The tool can become an AI assistant in various information projects and will help eliminate spelling errors quickly and efficiently, saving time and resources," said Denis Filippov, Vice President, Digital Surfaces, Salute, Sberbank.

The development team set out to study and solve proofreading problems using generative models. The result is a methodology for generative spelling correction for the Russian language that achieves state-of-the-art (SOTA) quality[1] on the spelling correction task[2]. The work produced several releases: the open-source SAGE library[3] (MIT license), a family of pretrained generative models ([4],[5],[6],[7]) for spelling correction in Russian and English, and a hub of annotated data[8] for correcting spelling in texts from different domains.

As of October 2023, the presented tool outperforms both open solutions for the Russian language and competitors' proprietary models in terms of quality. The significant gain in metrics over other solutions is a consequence of the developed methodology. Two error augmentation methods were proposed to reproduce natural human typos and spelling errors in texts. Using these methods, a corpus of texts with errors (about 7 GB) was created, on which the generative models M2M100 and FredT5-large were trained. In the second stage, the models were fine-tuned on a combination of collected parallel datasets for spelling correction. The best configuration of the resulting solution is offered as an AI service on the ML Space platform.
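The idea of error augmentation can be illustrated with a minimal sketch. The article does not describe the two proposed methods in detail, so the code below is not the SAGE implementation: it is a simple heuristic that injects random character-level edits into clean text to produce (noisy, clean) training pairs. The names `NEIGHBOURS`, `corrupt_word`, `augment`, and `make_parallel_pair`, as well as the `error_rate` parameter, are hypothetical and chosen for illustration only.

```python
import random

# Hypothetical keyboard-neighbour map (a tiny subset of a QWERTY layout),
# used only to illustrate "fat finger" substitutions.
NEIGHBOURS = {
    "a": "qsz", "e": "wrd", "i": "uok", "o": "ipl",
    "s": "adw", "t": "ryg", "n": "bm",
}


def corrupt_word(word, rng):
    """Apply one random character-level edit to a word."""
    if len(word) < 2:
        return word  # too short to corrupt safely
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "delete", "duplicate", "substitute"])
    if op == "swap":
        # Transpose two adjacent characters.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "duplicate":
        return word[:i] + word[i] + word[i:]
    # Substitute with a neighbouring key, if the character is in the map.
    candidates = NEIGHBOURS.get(word[i].lower())
    if not candidates:
        return word
    return word[:i] + rng.choice(candidates) + word[i + 1:]


def augment(sentence, error_rate=0.3, seed=None):
    """Corrupt each whitespace-separated word with probability error_rate."""
    rng = random.Random(seed)
    words = sentence.split()
    noisy = [corrupt_word(w, rng) if rng.random() < error_rate else w
             for w in words]
    return " ".join(noisy)


def make_parallel_pair(clean, **kwargs):
    """Return a (noisy source, clean target) pair for seq2seq training."""
    return augment(clean, **kwargs), clean
```

Calling `make_parallel_pair("the service corrects spelling", error_rate=0.5, seed=1)` yields one training pair; repeating this over a large clean corpus is one way to build a synthetic parallel corpus of the kind the article mentions. A realistic pipeline would rather match the error distribution observed in human-written text than sample edits uniformly.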