Kolmogorov-Arnold Networks (KAN)

Product
Developers: California Institute of Technology (Caltech), Massachusetts Institute of Technology (MIT)
Release date: May 2024
Industry: Information Technology

2024: Creation of the neural network

At the end of April 2024, American researchers from several scientific organizations announced the development of a fundamentally new neural network architecture, Kolmogorov-Arnold Networks (KAN). The architecture builds on the mathematical work of Soviet academicians Andrei Kolmogorov and Vladimir Arnold.
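
For context, the Kolmogorov-Arnold representation theorem underlying the architecture states that any continuous multivariate function on a bounded domain can be written as a finite composition of continuous one-dimensional functions and addition:

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

where all \Phi_q and \phi_{q,p} are continuous univariate functions; KAN generalizes this two-layer construction to networks of arbitrary depth and width.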

Traditionally, deep learning systems, including computer vision models and large language models (LLMs), are built on the multilayer perceptron (MLP): an architecture of interconnected neurons that serve as the network's basic computational units.
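
For concreteness, here is a minimal PyTorch sketch of that structure (an illustrative toy with arbitrarily chosen dimensions, not code from the original work):

```python
import torch.nn as nn

# Classic MLP: learnable linear weight matrices on the edges,
# and a fixed, non-trainable activation (here ReLU) at each node.
mlp = nn.Sequential(
    nn.Linear(2, 5),   # 2 inputs -> 5 hidden neurons
    nn.ReLU(),         # fixed activation on the nodes
    nn.Linear(5, 1),   # 5 hidden neurons -> 1 output
)
```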

Reportedly, scientists from the United States have proposed a more effective solution. The work involved specialists from the Massachusetts Institute of Technology (MIT), the California Institute of Technology (Caltech), Northeastern University, and the NSF Institute for Artificial Intelligence and Fundamental Interactions (IAIFI). While MLPs have fixed activation functions on the nodes ("neurons"), KANs use trainable activation functions on the edges ("weights"). KANs contain no linear weights at all: each weight parameter is replaced by a learnable one-dimensional function parameterized as a spline.
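
A toy sketch of such a layer is shown below. This is an illustration under assumptions rather than the authors' implementation: the paper parameterizes each edge function as a B-spline, while this version uses a simpler Gaussian radial basis expansion to keep the example short.

```python
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    """One KAN-style layer: every edge (input i -> output j) carries its
    own learnable univariate function, here a weighted sum of Gaussian
    radial basis functions (a stand-in for the paper's B-splines)."""

    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        # Fixed basis centers spread evenly over the expected input range.
        self.register_buffer("centers", torch.linspace(*grid_range, num_basis))
        self.width = (grid_range[1] - grid_range[0]) / (num_basis - 1)
        # Learnable coefficients, one vector per edge: (out_dim, in_dim, num_basis).
        self.coef = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_basis))

    def forward(self, x):
        # x: (batch, in_dim) -> basis responses: (batch, in_dim, num_basis)
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # out[b, j] = sum over i, k of coef[j, i, k] * phi[b, i, k]:
        # evaluates every edge function and sums over the incoming edges.
        return torch.einsum("bik,jik->bj", phi, self.coef)

# A two-layer KAN for a toy 2-input regression task; it is trained with
# ordinary gradient descent, exactly as an MLP would be.
kan = nn.Sequential(KANLayer(2, 5), KANLayer(5, 1))
```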

The KAN architecture is claimed to surpass MLP in both accuracy and interpretability. In theory, KANs also have faster neural scaling laws than MLPs, meaning that their error falls more quickly as the number of parameters grows. Overall, KAN is a promising alternative to MLP, opening up new opportunities to further improve deep learning models. At the same time, the new architecture has certain drawbacks, in particular slower training. In other words, for tasks where training speed is the priority, MLPs remain the more practical option.[1]

Notes