HSE: Technology for identifying texts generated by any type of AI

Product

Developers:	Higher School of Economics (HSE)
Date of the premiere of the system:	2024/07/22
Technology:	Speech technology

2024: Development of technology to identify texts generated by any type of AI

Scientists at the Higher School of Economics are developing an application that determines whether a text was written by a person or generated by artificial intelligence. The approach behind the application is universal and makes it possible to "catch" a variety of bots built on different architectures. HSE announced this on July 22, 2024. In the near future, the prototype is to be tested on a wide range of texts. The platform is expected to become available to users in 2025.

With the development of artificial intelligence technologies, the volume of AI-generated text is growing like an avalanche. At the same time, texts generated by bots are already difficult to distinguish from those written by people.

As of July 2024, existing approaches to identifying bot-generated texts are often tied to a few specific bot architectures, which significantly narrows their range of application and leaves them vulnerable to future generations of bots. The goal of the HSE project is to create an effective system for detecting texts written by a wide class of bots in various languages.

"Our development is different from that of competitors. The vast majority of similar projects are devoted to identifying specific architectures of generative language models. This leads to the inevitable obsolescence of such developments as text generation tools develop and/or new types of bots emerge, and it also forces potential consumers to use several models in practice, each responsible for detecting bots with a different architecture. We "catch" all bots, and not just the one or several at our disposal," said Vasily Gromov, project manager and professor at the Department of Data Analysis and Artificial Intelligence at the HSE Faculty of Computer Science.

The system draws on several different areas of mathematics: the theory of chaotic dynamical systems, topological data analysis, dimension theory, clustering theory (crisp and fuzzy), neural networks, and others. This makes the system robust: a bot can "fake" one or even several characteristics, but it is extremely difficult to "fake" them all.
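The robustness argument can be illustrated with a toy sketch. It is not the HSE system: the feature names, the per-feature "bot-likeness" scores in the range 0 to 1, the averaging rule, and the 0.5 threshold are all assumptions made up for illustration. It only shows why a verdict based on several independent characteristics is hard to flip by faking one of them.

```python
from statistics import mean

# Hypothetical feature families, loosely named after the areas listed above.
# In a real system each would be computed from the text; here they are given.
FEATURES = ("chaos", "topology", "dimension", "clustering", "neural")


def is_bot(scores: dict[str, float], threshold: float = 0.5) -> bool:
    """Flag a text as bot-generated when the mean bot-likeness exceeds threshold.

    `scores` maps every name in FEATURES to a value in [0, 1], where values
    near 1 mean "looks bot-generated" according to that characteristic.
    """
    missing = [name for name in FEATURES if name not in scores]
    if missing:
        raise ValueError(f"missing feature scores: {missing}")
    return mean(scores[name] for name in FEATURES) > threshold


# A bot text that every characteristic recognises as bot-like.
bot_scores = {name: 0.9 for name in FEATURES}

# The same bot after "faking" a single characteristic to look human.
faked_one = {**bot_scores, "chaos": 0.1}

# Only faking every characteristic at once changes the verdict.
faked_all = {name: 0.1 for name in FEATURES}

print(is_bot(bot_scores), is_bot(faked_one), is_bot(faked_all))
```

Averaging is only one possible way to combine the scores; a vote count or a trained classifier over the same features would make the same point.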

"We conducted large-scale computational experiments using various methods of data analysis and artificial intelligence, which made it possible to determine the sets of characteristics best suited to distinguishing between the spaces and trajectories of bots and people, and to develop a software prototype," said Vasily Gromov, project manager and professor at the Department of Data Analysis and Artificial Intelligence at the HSE Faculty of Computer Science.

In the near future, the prototype is to be tested on a wide range of texts, from literary works generated by bots to final competition papers by HSE students. The platform is planned to become available to a wide range of users in 2025. At first, it will be able to "catch" bots in Russian and English, but the scientists are already working to increase the number of languages the system can handle. These are primarily the languages of the BRICS countries and the languages of the peoples of Russia.

Source: https://tadviser.com/index.php/Product:HSE:_Technology_for_identifying_texts_generated_by_any_type_of_AI
