Articles by FavTutor

Vidu: The Chinese competitor to OpenAI’s text-to-video model Sora

by Dhruv Kudalkar
April 28, 2024

Chinese AI company Shengshu Technology and Tsinghua University recently unveiled Vidu, a text-to-video model capable of generating high-definition 16-second clips at 1080p resolution in a single click. In the announcement made at the 2024 Zhongguancun Forum in Beijing, they claimed that Vidu is a strong competitor to OpenAI’s Sora.

Highlights:

  • Vidu is a Chinese text-to-video model that can generate 16-second 1080p clips in a single click.
  • Vidu is positioned as a strong competitor to OpenAI’s groundbreaking text-to-video model Sora.
  • It showcases complex physics, realistic visuals, and cultural adaptability, but still trails OpenAI’s Sora in overall fidelity.

What is Vidu?

Vidu, a text-to-video AI model developed by Shengshu Technology and Tsinghua University in China, is capable of generating 16-second video clips at 1080p resolution. It is based on a self-developed Universal Vision Transformer (U-ViT) architecture, which the company claims allows it to simulate the real physical world with multi-camera view generation and complex scenes adhering to real-world physics, such as realistic lighting, shadows, and detailed facial expressions.

🚨 China just released SORA’s rival “Vidu”

This is China's first long duration, high consistency, and high dynamics video model

It can create videos upto 16s with 1080P in single click.

It excels at simulating the real physical world and also showcases a vivid imagination,… pic.twitter.com/6ThjAxrQs2

— Sambhav Gupta (@sambhavgupta6) April 27, 2024

Here is a quote from the Vidu official press release:

“Since the release of Sora, the battle for “domestic Sora” has begun. But when the industry focuses on the “long” feature, they all ignore that behind Sora is actually the improvement of comprehensive effects, such as consistency, realism, aesthetics, etc. in long time series. From the perspective of comprehensive effects, “Vidu” is the first and only video model to fully benchmark against Sora at the effect level, not only domestically, but also globally. It is also the first video model to achieve a breakthrough after Sora.”

The emergence of Vidu signals China’s ambition to catch up with, and potentially surpass, leading US companies like OpenAI in generative AI. Achieving this will require a significant increase in performance, though Vidu’s rapid progress suggests the gap could narrow. Interestingly, the U-ViT architecture underpinning Vidu was first proposed by the Shengshu Technology research team in September 2022, predating the diffusion transformer (DiT) architecture behind Sora.

Zhu Jun, vice dean of the Institute for Artificial Intelligence at Tsinghua University and chief scientist of ShengShu-AI, said the following about Sora at the forum:

“After the release of Sora, we found that it closely aligned with our technical roadmap, which further motivated us to advance our research with determination.”

Features of Vidu

During a recent live demonstration, Vidu showcased its ability to simulate the real physical world, generating intricately detailed scenes that adhere to real-world physics, with accurate light and shadow effects and delicate facial expressions. Vidu also goes beyond visual realism: it can generate complex dynamic shots rather than only fixed ones.

5. pic.twitter.com/2DJpZ2y9Ox

— Angry Tom (@AngryTomtweets) April 27, 2024

Moreover, as a homegrown Chinese model, Vidu boasts a deep understanding of Chinese cultural elements. This enables it to generate images of unique characters such as pandas, loongs, and dragons – a testament to the model’s cultural sensitivity and adaptability.

6. pic.twitter.com/sLcm0vza3u

— Angry Tom (@AngryTomtweets) April 27, 2024

Here are some more examples:

3. pic.twitter.com/eJivHZOY4u

— Angry Tom (@AngryTomtweets) April 27, 2024

8. pic.twitter.com/6XaKsPOheB

— Angry Tom (@AngryTomtweets) April 27, 2024

Comparison with OpenAI’s Sora

While Vidu undoubtedly represents a remarkable achievement and serves as a testament to China’s rapid strides in the field of AI research, it is important to acknowledge that it currently falls short of the industry-leading capabilities of OpenAI’s Sora model. Sora, a pioneering text-to-video model capable of generating continuous videos of up to one minute in length, sets the benchmark for visual fidelity and realism that Vidu has yet to surpass.

However, it is the temporal consistency achieved by Vidu that truly sets it apart, and this technology holds considerable potential for refinement as research and development continue. The developers at Shengshu Technology are confident in their creation, boasting of Vidu’s “exceptional consistency” within generated scenes, where individual frames build logically upon one another.

One plausible explanation for the current disparity between Vidu and Sora’s capabilities could be the relatively limited access to cutting-edge GPU resources in China compared to the resources available to a technological behemoth like OpenAI. Nevertheless, the emergence of Vidu serves as a resounding declaration of China’s unwavering ambition to not only catch up with but potentially surpass leading US companies in the intensely competitive race for dominance in the field of generative AI models.

While Vidu may currently lag behind Sora in terms of overall visual fidelity, its potential for growth and refinement is undeniable. As China continues to invest in cutting-edge AI research and development, further advancements in Vidu’s capabilities are inevitable, setting the stage for a future where the line between reality and artificial creation becomes increasingly blurred.

There are also doubts regarding Vidu’s claimed ability to generate video clips of up to 16 seconds in length. While the developers at Shengshu Technology assert that Vidu can produce 1080p video clips spanning 16 seconds, the demonstrations and samples released thus far have only showcased clips ranging from 3 to 5 seconds in duration.

Let's be honest. Vidu isn't that impressive.

Supposedly, text-to-video that can generate up to 16 seconds at 1080p.

Clips in this demo are barely 3 seconds. 🤷‍♂️ pic.twitter.com/TKnjAJor63

— Min Choi (@minchoi) April 27, 2024

How to Access Vidu?

Vidu is not yet directly available to the public. However, users can fill out a form to apply for access to the text-to-video model.

Here is how you can apply for access:

  • Click on the following link: https://www.shengshu-ai.com/home
  • If you do not understand Chinese, you can use Google Translate to translate the page into your preferred language.
  • Scroll to the video generation section.
  • Click on the ‘Apply for Use’ button. You will be directed to a form. Fill it out and submit it to apply for access.

Conclusion

Vidu, a Chinese text-to-video AI model, showcases impressive advances in generative AI technology and positions itself as a competitor to OpenAI’s Sora. While Vidu demonstrates strengths in realism and cultural adaptability, it still needs to improve its fidelity to challenge industry-leading models like Sora.

Dhruv Kudalkar

Hello, I'm Dhruv Kudalkar, a final year undergraduate student pursuing a degree in Information Technology. My research interests revolve around Generative AI and Natural Language Processing (NLP). I constantly explore new technologies and strive to stay up-to-date in these fields, driven by a passion for innovation and a desire to contribute to the ever-evolving landscape of intelligent systems.
