Friday, May 24, 2024

China unveils Sora challenger able to produce videos from text

China has come up with its own text-to-video artificial-intelligence (AI) tool similar to OpenAI’s Sora, although the new model can only produce videos no longer than 16 seconds, compared with the US service’s 60 seconds.

Vidu, the country’s best hope so far of catching up with Sora, was launched over the weekend by start-up Shengshu Technology in a joint effort with the prestigious Beijing-based Tsinghua University.

The model is able to produce videos with 1080p resolution based on simple text prompts, the company said.

“Vidu is the latest achievement of self-reliant innovation, with breakthroughs in many areas,” said Zhu Jun, chief scientist at Shengshu and deputy dean at Tsinghua’s Institute for AI, as he announced the model at the Zhongguancun Forum held in the Chinese capital, according to a report by Beijing News.

A screenshot of a demo video released by Shengshu.

Vidu is “imaginative”, “can simulate the physical world” and “produce 16-second videos with consistent characters, scenes and timeline”, Zhu said, adding that the model is also able to comprehend “Chinese elements”.

During the model’s unveiling, Shengshu released several demo clips, including one featuring a panda playing the guitar while sitting on grass and another of a puppy swimming in a pool, both showing vivid details.

Vidu’s debut has raised hopes in the country, which is racing to catch up with leading global generative AI players, such as Microsoft-backed OpenAI.

Unlike OpenAI’s ChatGPT, which has inspired a raft of China-based competitors after launching in November 2022, previews of Sora videos released in February have not drawn a similar level of enthusiasm from Chinese Big Tech firms or start-ups.

Industry experts said one of the factors hindering Chinese firms’ progress is the lack of sufficient computing power.

For Sora to produce a one-minute clip, it needs eight Nvidia A100 graphics processing units (GPUs) to run for more than three hours, according to Li Yangwei, a Beijing-based technical consultant working in the intelligent computing sector.

“Sora requires a lot of computing power for inferencing,” he said.

The US has been tightening export restrictions on advanced chips produced by the likes of Nvidia, including its A100 and H100 GPUs, which have become highly sought-after components for training AI systems, but are banned from being shipped to China.

Beijing-based Shengshu was founded in March 2023, with a core team made up mostly of members from Tsinghua’s Institute for AI, as well as other members from Alibaba Group Holding, Tencent Holdings and ByteDance. Alibaba, owner of the South China Morning Post, is also working on its own video models.

Shengshu raised hundreds of millions of yuan last month from investors including Qiming Ventures, Zhipu AI and Baidu Ventures, according to start-up database provider ITjuzi.
