The name of the base system (platform): | Artificial intelligence (AI) |
Developers: | Nvidia |
Last Release Date: | 2021/11/24 |
Main article: Neural networks
2021: GauGAN2 creates plausible photos of non-existent landscapes
On November 24, 2021, it became known that Nvidia had introduced GauGAN2, an artificial-intelligence-based system and the successor to the original GauGAN model, which creates plausible photographs of non-existent landscapes. Combining techniques such as segmentation mapping, inpainting, and text-to-image generation, GauGAN2 can produce realistic images from text descriptions and hand-drawn sketches.
"Compared to other current models, especially for converting text into an image or map segments into an image, the underlying GauGAN2 neural network produces more diverse and high-quality images. Instead of drawing each element of an imaginary image, users can simply enter a short phrase and generate its key features and plot like a snowy mountain range. This starting blank can then be completed by making a particular mountain higher and adding trees in the background or clouds in the sky, "- reported Nvidia team member Isha Salian. |
GauGAN2 is an improved version of the GauGAN system, created in 2019 and trained on more than a million open images from the Flickr platform. Like GauGAN, GauGAN2 understands the relationships between objects such as snow, trees, water, flowers, bushes, hills and mountains, for example that the type of precipitation varies with the season.
Both GauGAN and GauGAN2 are generative adversarial networks (GANs), consisting of a generator and a discriminator. The generator takes samples (images with accompanying text) and learns which data (words) correspond to which other data (elements of the landscape). It is trained by trying to deceive the discriminator, which evaluates whether the generated outputs look real. Although the generator's outputs are initially of poor quality, they improve in response to the discriminator's feedback.
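For readers unfamiliar with this setup, the sketch below shows the generator/discriminator interplay in a minimal, generic GAN written in PyTorch. It is not Nvidia's GauGAN2 architecture; the layer sizes, optimizer settings, and training loop are illustrative assumptions only.

    # Minimal, generic GAN sketch (PyTorch); NOT GauGAN2's actual architecture.
    import torch
    import torch.nn as nn

    latent_dim, image_dim = 64, 28 * 28

    # Generator: maps a random latent vector to a flattened "image".
    generator = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, image_dim), nn.Tanh(),
    )

    # Discriminator: scores how "real" a flattened image looks (raw logit).
    discriminator = nn.Sequential(
        nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1),
    )

    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
    loss_fn = nn.BCEWithLogitsLoss()

    def train_step(real_images: torch.Tensor) -> None:
        batch = real_images.size(0)
        real_labels = torch.ones(batch, 1)
        fake_labels = torch.zeros(batch, 1)

        # 1) Train the discriminator to separate real images from generated ones.
        fake_images = generator(torch.randn(batch, latent_dim)).detach()
        d_loss = (loss_fn(discriminator(real_images), real_labels)
                  + loss_fn(discriminator(fake_images), fake_labels))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # 2) Train the generator to "deceive" the discriminator:
        #    its fakes should be scored as real.
        g_loss = loss_fn(discriminator(generator(torch.randn(batch, latent_dim))), real_labels)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()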
Unlike GauGAN, GauGAN2 is trained on 10 million images and can translate text descriptions into landscape images. If you enter text such as "sunset at a beach," the network generates the corresponding image; if you expand the phrase to "sunset at a rocky beach," or replace "sunset" with "noon" or "rainy day," the image changes accordingly.
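As a rough illustration of this prompt-refinement workflow, the hypothetical sketch below shows how successive text edits might be sent to a text-to-image model. The TextToImageModel class and its generate method are invented names for illustration only; Nvidia has not published such an interface for GauGAN2.

    # Hypothetical sketch of iterative prompt refinement; not a GauGAN2 API.
    from PIL import Image

    class TextToImageModel:
        def generate(self, prompt: str) -> Image.Image:
            # Placeholder: a real model would encode the prompt and synthesize pixels.
            return Image.new("RGB", (512, 512))

    model = TextToImageModel()

    # Start from a short phrase, then refine it step by step, as described above.
    base = model.generate("sunset at a beach")
    rocky = model.generate("sunset at a rocky beach")
    rainy = model.generate("rainy day at a rocky beach")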
With GauGAN2, users can generate segmentation maps, high-level sketches that show where objects should be located in the image. This sketch can then be turned into a picture by roughing in regions with labels such as "sky," "tree," "rock" and "river," or by drawing them manually with the brush tool.
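To make the idea of a segmentation map concrete, the sketch below builds one as a 2-D array of integer class labels using NumPy. The label values and the layout are assumptions for illustration, not GauGAN2's actual label scheme.

    # Illustrative segmentation map: each pixel holds an integer class label.
    import numpy as np

    SKY, TREE, ROCK, RIVER = 0, 1, 2, 3

    height, width = 256, 256
    seg_map = np.full((height, width), SKY, dtype=np.uint8)

    seg_map[120:, :] = TREE          # lower half: trees
    seg_map[180:, 60:200] = ROCK     # a rocky outcrop in the foreground
    seg_map[200:, 100:150] = RIVER   # a river cutting through the rocks

    # A tool like GauGAN2 would take such a map (plus an optional text prompt)
    # and render a photorealistic landscape that respects the labeled regions.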
According to Nvidia, the initial version of GauGAN is already used to create concept art for films and video games. As with GauGAN, the company plans to publish the GauGAN2 code on GitHub, along with an interactive demo on Playground, Nvidia's web hub for artificial intelligence and deep learning research.[1]