
Application for controlling music and changing songs with voice and gestures

Product
Base system (platform): Artificial intelligence (AI)
Industries: Entertainment, leisure, sports

Main article: Artificial Intelligence and Music Creation

2023: Application Presentation

Skoltech graduate student Ilya Borovik and a co-author from Germany presented an application that lets users adapt musical works to their preferences with voice, facial expressions or gestures, for example by asking it to play a composition more slowly or even to turn it into a lullaby.

"The demo of the system consists of an artificial intelligence model trained on a small, publicly available corpus of 1,067 performances of 236 piano works. The model takes notes as input and learns to play them by predicting performance characteristics: the local tempo and the position, duration and volume of each note. The output is a performance of the work. Our goal was to make this model controllable, so we connected it to an application that lets the user communicate with it," said Ilya Borovik.
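To make the four predicted characteristics concrete, here is a minimal Python sketch of how per-note predictions could be turned into a timed performance. It is not the researchers' model: the class names, the field names and the `render` function are all hypothetical, and the predictions are simply supplied by hand rather than produced by a neural network.

```python
from dataclasses import dataclass


@dataclass
class ScoreNote:
    pitch: int             # MIDI pitch number
    onset_beats: float     # notated position, in beats
    duration_beats: float  # notated length, in beats


@dataclass
class NoteExpression:
    # Hypothetical per-note predictions, mirroring the four
    # characteristics named in the quote above.
    tempo_bpm: float       # local tempo at this note
    shift_sec: float       # deviation of the onset from the tempo grid
    duration_ratio: float  # performed length / notated length
    velocity: int          # loudness as MIDI velocity, 1..127


@dataclass
class PerformedNote:
    pitch: int
    onset_sec: float
    duration_sec: float
    velocity: int


def render(score, expression):
    """Combine notated notes (sorted by onset) with per-note predictions."""
    performed = []
    clock_sec = 0.0
    prev_onset_beats = 0.0
    prev_sec_per_beat = None
    for note, expr in zip(score, expression):
        sec_per_beat = 60.0 / expr.tempo_bpm
        if prev_sec_per_beat is None:
            prev_sec_per_beat = sec_per_beat
        # The gap since the previous onset is played at the previous local tempo.
        clock_sec += (note.onset_beats - prev_onset_beats) * prev_sec_per_beat
        performed.append(PerformedNote(
            pitch=note.pitch,
            onset_sec=max(0.0, clock_sec + expr.shift_sec),
            duration_sec=note.duration_beats * sec_per_beat * expr.duration_ratio,
            velocity=max(1, min(127, expr.velocity)),
        ))
        prev_onset_beats = note.onset_beats
        prev_sec_per_beat = sec_per_beat
    return performed


score = [ScoreNote(60, 0.0, 1.0), ScoreNote(62, 1.0, 1.0)]
expression = [
    NoteExpression(tempo_bpm=120.0, shift_sec=0.0, duration_ratio=1.0, velocity=64),
    NoteExpression(tempo_bpm=60.0, shift_sec=0.05, duration_ratio=0.5, velocity=40),
]
performance = render(score, expression)
```

In this toy example the second note is played slower, slightly late, shorter and softer than written, which is the kind of variation a request such as "make it a lullaby" would be expected to produce.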

When the user launches the application on a smartphone and grants access to the camera and microphone, a randomly generated performance of a work from the database begins to play. To influence the performance, the user presses a button and records video or audio. Voice commands or facial expressions can be used to ask the model to play the music differently, for example to play Chopin's mazurkas as lullabies.

System Operation Diagram
"To control the model, we use the performance directions that are already written in the scores. Scores carry markings that tell the performer how to play a given passage: faster, slower, louder, quieter and so on. We take all the available data and use it to convert the user's voice instructions into these markings," Ilya continued.
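The idea of converting a spoken request into score markings can be illustrated with a deliberately simple Python sketch. A plain keyword lookup stands in here for whatever learned mapping the researchers actually use, which the article does not describe. The `MARKINGS` table and the `instruction_to_markings` function are hypothetical, and the entry for "lullaby" in particular is an assumption made for illustration.

```python
import re

# Hypothetical vocabulary: everyday words mapped to standard score markings.
MARKINGS = {
    "faster": ["accelerando"],
    "slower": ["ritardando"],
    "louder": ["crescendo"],
    "quieter": ["diminuendo"],
    "lullaby": ["lento", "piano"],  # assumption: slow and soft
}


def instruction_to_markings(text):
    """Return the score markings implied by a transcribed voice command."""
    words = re.findall(r"[a-z]+", text.lower())
    markings = []
    for word in words:
        for marking in MARKINGS.get(word, []):
            if marking not in markings:  # keep order, drop duplicates
                markings.append(marking)
    return markings


requested = instruction_to_markings("Play it slower and quieter, please")
```

The resulting markings could then be passed to the model in the same form as the directions it saw in the training scores.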

Markings in an excerpt from the score of Beethoven's Sonata No. 17. Blue indicates tempo directions, red and orange indicate dynamics (volume), and green indicates note accents.

The researchers continue to develop the project. They plan to make communication between the user and the model fully interactive, so that the desired result can be reached in just a few iterations. The application interface will also be improved and the database of musical works expanded. It currently contains classical works that are part of the world's cultural heritage. At the next stage, the researchers plan to add orchestral music.