State of AI Report 2024 (5): Multimodal

Hollywood-level productions begin to use generative AI for visual effects

Visual effects work is costly and labor-intensive, so Hollywood producers have been gradually experimenting with generative AI, even though this has sparked significant opposition from artists and animators. For now, much of this work is still being done quietly behind the scenes, but attentive viewers have already spotted background flaws in HBO and Netflix productions that were caused by generative AI. These flaws trace back to a long-standing weakness of generative models: they struggle to represent physical and geometric properties consistently and accurately.

Yesterday, Paula sent me a video about how Hollywood is gradually embracing AI technology. Just last month, James Cameron, director of "Avatar," officially joined Stability AI's board of directors.

Generative AI for visual effects moves toward professionalization

In a groundbreaking collaboration, Runway partnered with Lionsgate, the film and television studio known for the "John Wick," "Twilight," and "The Hunger Games" series. Runway will train a new generative model on Lionsgate's catalog of 20,000 titles, while Lionsgate stated it will use Runway's model to support "capital-efficient content creation opportunities." The financial details of the deal have not been disclosed, but Lionsgate has confirmed that it will initially use the model for storyboarding and eventually expand its use to visual effects production.

Speaking of Runway, a few days ago I watched a video showcasing Runway used together with Apple Vision Pro, and it was very impressive.

Major labs begin to splinter, with strong challengers emerging

Driven by scientific disagreements, commercial pressures, personality conflicts, and the ready availability of capital, a growing number of small research teams are splitting off from large labs, a sign that the ecosystem is deepening.

Sakana AI in Japan is one such emerging force. It was co-founded by David Ha and Llion Jones, one of the co-authors of "Attention Is All You Need" and the last of them to leave Google. They recently completed a $30 million seed round. Sakana AI focuses on an evolutionary approach to "model merging": it combines existing models and then selects the most promising results as "parents" for the next generation of models.
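To make the "parents for the next generation" idea concrete, here is a toy sketch of an evolutionary merge loop. It is only an illustration under heavy assumptions and is not Sakana AI's actual method: each "model" is just a short list of numbers, merging is plain linear interpolation, and the `merge`, `fitness`, and `evolve` names, as well as the target-based score, are made up for this example.

```python
import random


def merge(a, b, alpha):
    """Linearly interpolate two weight vectors (a stand-in for real model merging)."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]


def fitness(weights, target):
    """Toy score: negative squared distance to a made-up target vector."""
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def evolve(population, target, generations=20, n_parents=2, n_children=6, seed=0):
    """Repeatedly keep the best models as parents and merge them into children."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(generations):
        # Select the highest-scoring models as parents (n_parents must be >= 2).
        ranked = sorted(population, key=lambda w: fitness(w, target), reverse=True)
        parents = ranked[:n_parents]
        # Each child is a random-ratio merge of two distinct parents.
        children = [
            merge(*rng.sample(parents, 2), rng.random())
            for _ in range(n_children)
        ]
        # Parents survive alongside their children, so the best score never drops.
        population = parents + children
    return max(population, key=lambda w: fitness(w, target))


if __name__ == "__main__":
    start = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    best = evolve(start, target=[0.5, 0.5])
    print(best, fitness(best, [0.5, 0.5]))
```

Because the parents are carried over into each new generation, the best score can only stay the same or improve. A real system would merge actual network weights or layers and score candidates on benchmark tasks rather than on a distance to a known target.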

Sakana AI is likely to become the fastest-growing unicorn among Japanese companies. Llion Jones, a co-author of the Transformer paper, reportedly loves Japan and insists on living and working there.

H Company, headquartered in Paris, was founded by several experienced DeepMind researchers and recently completed a $220 million funding round. It focuses on developing action models for robotic process automation (RPA). Separately, following the internal disputes on OpenAI's board, co-founder Ilya Sutskever left to establish Safe Superintelligence Inc., a new lab dedicated to building safe superintelligence without short-term commercial goals or pressures.

Recently, some of the earliest Stable Diffusion developers founded Black Forest Labs, which specializes in image and video generation. They have released their first open-source image model series, FLUX.1, whose quality is quickly starting to rival Midjourney's. (I introduced Flux in an earlier post, and it really is impressive!)

However, excellent engineers do not necessarily make outstanding entrepreneurs. Some teams that left large labs achieved early success, while others did not. Safe Sign Technologies, founded by a former lawyer and a former DeepMind researcher, was successfully acquired without the founders ever diluting their shares to outside investors. H Company, on the other hand, despite securing over $200 million in funding, failed to keep its founding team together: three of its co-founders left before the company had launched a product.