RSS
Логотип
Баннер в шапке 1
Баннер в шапке 2

AI Talent Hub: LLAMATOR Framework for testing chatbots for vulnerabilities

Product
Developers: ITMO (Scientific and Educational Corporation), Napoleon IT (Napoleon Aichi)
Date of the premiere of the system: 2025/02/04
Technology: TMS - Test Management System,  Speech technologies

The main articles are:

2025: Introducing a framework for testing chatbots for vulnerabilities

Students ITMO have developed a framework for testing chat boats for vulnerabilities with an accuracy of 89%.

The LLAMATOR framework developed within the AI Security Lab, warns against the issuance of conflicting information, sensitive data and invalid content by systems based on large language models, in particular commercial chatbots. This allows you to minimize legal and reputational risks of companies that use chatbots to interact with customers and employees. Napoleon IT (Napoleon Aichi) announced this on February 4, 2025.

Unlike other solutions in the market, LLAMATOR does not just test the system for resistance to external attacks with single requests, and leads with it full-fledged automated dialogues, selecting and improving an attacking strategy based on system responses. The framework can test models in both English and Russian and supports a wide range of tests - from simple prompt injections prior to deep hallucination and incorrect generation testing.

{{quote 'One of the main difficulties was to create a model capable of realistically mimic human interaction. We spent a lot of experiments with the selection of the attacking model and its systemic industrial training - to us We managed to achieve 89% accuracy in identifying vulnerabilities using the approach LLM-as-a-Judge, - said "'Timur Nizamov, one of the developers of the solution LLAMATOR.}}

The framework comes with a freemium model: its source code is open, but the command developers can conduct a comprehensive security audit of chat bots and AI systems at the request of the company. LLAMATOR integrates with a variety of platforms, including Web, REST API, Telegram, WhatsApp and others.

It is planned to scale the testing solution in the near future multi-agent AI systems and systematic commercial operation. As early as February 2025 negotiations are underway on technological and methodological cooperation with potential customers and large AI vendors.