Latest Computer Vision Research From Cornell and Adobe Proposes An Artificial Intelligence (AI) Method To Transfer The Artistic Features Of An Arbitrary Style Image To A 3D Scene

Art is a fascinating but extremely complex discipline. Indeed, creating artistic images is not only time-consuming but also requires significant expertise. If this problem applies to 2D art, imagine extending it to dimensions beyond the image plane, such as time (in animated content) or 3D space (with sculptures or virtual environments). This introduces new limitations and challenges, which this paper addresses.

Previous work on 2D stylization processes video content frame by frame. The individual frames achieve high-quality stylization, but the generated videos often show flickering artifacts because the produced frames lack temporal coherence. In addition, these methods do not explore the 3D environment, which would increase the complexity of the task. Other works focusing on 3D stylization suffer from geometrically imprecise reconstructions of point clouds or triangular meshes and from a lack of style detail: because the style is applied through a linear transformation, the produced mesh ends up with geometric properties that differ from those of the initial mesh.


The proposed method, called Artistic Radiance Fields (ARF), can transfer the artistic properties of a single 2D image to a real-world 3D scene, leading to novel artistic visualizations that are faithful to the input style image (Fig. 1).


For this purpose, the researchers transform a photo-realistic radiance field, reconstructed from multiple images of a real-world scene, into a new stylized radiance field that supports high-quality stylized renderings from novel viewpoints. The results are shown in Fig. 1.

For example, given as input a set of real images of an excavator and an image of Van Gogh's famous "Starry Night" painting as the "style" to be applied to it, the result is a colorful excavator with a smooth texture similar to the painting.

The ARF pipeline is depicted in the figure below (Fig. 2).


The key point of this architecture is the combination of the proposed Nearest Neighbor Feature Matching (NNFM) loss and a color transfer step.

The NNFM loss compares the feature maps of the rendered and style images, extracted using the well-known VGG-16 convolutional neural network (CNN). These features can then guide the transfer of complex high-frequency visual details consistently across multiple viewpoints.
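As a rough illustration, the NNFM idea can be sketched in a few lines of NumPy: for every feature vector of the rendered view, find the closest style feature under cosine distance, then average those minimal distances. This is a minimal sketch, not the paper's implementation; the function name `nnfm_loss` and the flattened `(N, D)` feature layout are assumptions made for illustration.

```python
import numpy as np

def nnfm_loss(render_feats, style_feats):
    """Nearest-neighbor feature matching loss (illustrative sketch).

    render_feats: (N, D) array of VGG features from the rendered view.
    style_feats:  (M, D) array of VGG features from the style image.
    """
    # Normalize rows so dot products equal cosine similarity.
    r = render_feats / (np.linalg.norm(render_feats, axis=1, keepdims=True) + 1e-8)
    s = style_feats / (np.linalg.norm(style_feats, axis=1, keepdims=True) + 1e-8)
    cos_dist = 1.0 - r @ s.T            # (N, M) cosine distances
    # For each rendered feature, keep only its nearest style feature.
    return cos_dist.min(axis=1).mean()
```

Minimizing this loss pulls each local patch of the rendering toward whichever style patch it already resembles most, which is what allows high-frequency brushstroke-like detail to transfer.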


Color transfer is instead a technique used to avoid noticeable color mismatches between the synthesized views and the style image. It applies a linear transformation to the pixels of the input images so that their mean and covariance match those of the pixels in the style image.
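Such a mean-and-covariance match can be realized as a whitening/recoloring transform: subtract the content mean, multiply by the style covariance square root times the inverse content covariance square root, and add the style mean. Below is a minimal NumPy sketch under that assumption; `match_color` and the small `eps` regularizer are illustrative choices, not the authors' code.

```python
import numpy as np

def match_color(content, style, eps=1e-5):
    """Linearly map content pixels to the style image's color statistics.

    content, style: (N, 3) and (M, 3) arrays of RGB pixels.
    """
    mu_c, mu_s = content.mean(0), style.mean(0)
    cov_c = np.cov(content, rowvar=False) + eps * np.eye(3)
    cov_s = np.cov(style, rowvar=False) + eps * np.eye(3)

    def sqrt_psd(m):
        # Symmetric PSD matrix square root via eigendecomposition.
        vals, vecs = np.linalg.eigh(m)
        return vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

    a = sqrt_psd(cov_s) @ np.linalg.inv(sqrt_psd(cov_c))
    return (content - mu_c) @ a.T + mu_s
```

After the transform, the pixel cloud has (approximately) the style image's mean and covariance, so later stylization no longer fights a global color mismatch.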

In addition, the architecture uses a deferred back-propagation method that allows losses to be computed on full-resolution images with a reduced GPU memory load. The first step renders the image at full resolution and computes the loss and its gradient with respect to the pixel colors, which produces a cached gradient image. These cached gradients are then back-propagated patch-wise, accumulating the parameter gradients.
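The equivalence behind this two-pass scheme can be demonstrated with a toy differentiable "renderer" (a single matrix multiply standing in for volume rendering, an assumption made purely for illustration): caching the per-pixel loss gradient and then accumulating it patch by patch yields the same parameter gradient as back-propagating through the full image at once. The function names below are hypothetical.

```python
import numpy as np

def full_gradient(w, params, target):
    """Direct gradient of 0.5 * ||render - target||^2 w.r.t. params."""
    pixels = w @ params                 # toy "renderer": one matmul
    d_pixels = pixels - target          # dL/d(pixel)
    return w.T @ d_pixels               # full-image chain rule in one shot

def deferred_gradient(w, params, target, patch=16):
    """Two-pass sketch of deferred back-propagation."""
    pixels = w @ params                 # pass 1: full render, no autograd
    cached = pixels - target            # cached per-pixel gradient image
    grad = np.zeros_like(params)
    for i in range(0, len(cached), patch):
        # pass 2: re-render patch-wise, back-propagate cached gradients.
        grad += w[i:i + patch].T @ cached[i:i + patch]
    return grad
```

Only one patch's intermediate activations need to live in memory at a time during pass 2, which is where the GPU memory savings come from.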

The ARF approach presented in this article brings several advantages. First, it leads to striking stylized images with almost no artifacts. Second, the stylized images can be rendered from novel views with only a few input images, enabling artistic 3D reconstructions. Finally, the deferred back-propagation method significantly reduces the architecture's GPU memory footprint.

This Article is written as a research summary article by Marktechpost Staff based on the research paper 'ARF: Artistic Radiance Fields'. All Credit For This Research Goes To Researchers on This Project. Check out the paper, github link and project.


Daniele Lorenzi received his M.Sc. in ICT for Internet and Multimedia Engineering in 2021 from the University of Padua, Italy. He is a Ph.D. candidate at the Institute of Information Technology (ITEC) at the Alpen-Adria-Universität (AAU) Klagenfurt. He currently works at the Christian Doppler Laboratory ATHENA and his research interests include adaptive video streaming, immersive media, machine learning and QoS/QoE evaluation.

