Homegrown AI tools play key role in preserving cultural heritage

XINHUA | Updated: 2024-11-28 07:52

The release of a cloud computing infrastructure for AI technology in Jiaxing, Zhejiang province, on Nov 19. [Photo/Xinhua]

China's generative AI tools are carving out a unique niche, offering a blend of entertainment and practical benefits while also playing a key role in preserving cultural heritage.

Among them is Vidu-1.5, an image-to-video tool launched last week by Beijing-based AI startup ShengShu Technology and billed as a multimodal model that supports multi-entity consistency.

In practice, this means AI can generate a video from as few as three input images. For example, in a video shared by the company, the inputs — a man, a futuristic mecha suit and a bustling nighttime cityscape — are seamlessly blended into a cohesive montage, all within just 30 seconds.

Understanding and controlling multiple entities — such as the person, attire and environment — are the biggest challenges in AI-generated video technology.

Ever since OpenAI, the maker of ChatGPT, introduced its pioneering Sora, multiple Chinese tech firms have swiftly stepped up to the plate, rolling out products that boast unique characteristics. ShengShu Technology's Vidu is one popular example.

"Look how consistent the suit is," Stefano Rivera, an AI product aficionado tweeted with admiration, calling himself a "superfan" of Vidu "from day 1".

This AI-generated content tool has already ignited a surge of creative enthusiasm among global individual creators, leading to playful and imaginative clips like Leonardo DiCaprio showcasing haute couture on the runway, Elon Musk cruising on an electric scooter in a flamboyant Chinese jacket and a series of Japanese anime scenes.

Vidu's greatest breakthrough is establishing logical relationships among multiple user-specified objects within a scene, says Tang Jiayu, the CEO of ShengShu Technology, in a written response to Xinhua.

With previous text-to-video tools, generating scenes like "a boy holding the cake in a crystal setting" would yield different images of the boy, the cake and the crystal setting each time, much like opening a blind box. Now, with multi-subject consistency, the identity of the boy, cake and crystal can be preserved throughout the video, maintaining continuity, says Tang.

Chinese entrepreneurs like Tang, along with global investors with substantial capital, are rapidly pouring into the AIGC sector, expanding their market footprint in China.
