ChinAfrica

The Sight of Sora

A technological leap that may chart the future landscape of AI

By Zhao Wei | VOL. 16 April 2024 · 2024-04-15

Screenshot of a Sora-generated video showing a herd of majestic woolly mammoths treading through a snowy meadow (SCREENSHOT)

“I ... have so many questions,” Marques Brownlee, a tech-focused American YouTuber with a following of 6 million and counting, wrote on the social media platform X on 16 February. The candid response was directed at Sam Altman, CEO of OpenAI, a US-based artificial intelligence (AI) research organisation founded in 2015, who earlier that day had unveiled Sora, his company’s most recent AI model and a leap in video-generation technology.

Sora represents a monumental stride in AI: it can create videos of up to 60 seconds from simple text prompts.

This innovation echoes the transformative impact of ChatGPT, introduced by OpenAI just over a year earlier, which redefined the realms of writing, coding and text-to-image content creation.

Could Sora usher in a new era of digital storytelling and content creation, reshaping how people perceive and interact with AI-generated media — or with the world at large, even?

Explosive progress

According to OpenAI, Sora is powered by cutting-edge diffusion probabilistic models, a technology that enables the powerful tool to not only generate multiple shots within a single video, but also interpret prompt words with a nuanced understanding of language, ensuring consistency in character and visual style.

This was demonstrated in a compelling 60-second showcase of a stylish woman walking down a neon-lit Tokyo street. Video professionals noted the seamless transition from a wide shot to a close-up at the 37-second mark, underscoring Sora’s sophisticated editing capabilities.

This advancement soon sparked a flurry of reactions among video professionals online, with many expressing concerns about the potential obsolescence of their roles and proclaiming a dramatic shift in, if not an outright end to, traditional video production as we know it.

However, OpenAI’s technical report revealed a vision for Sora that extends far beyond a simple video creation tool.

Imagined as a “world simulator,” Sora is designed to facilitate content creation in a variety of native aspect ratios suitable for different devices, with advanced features such as 3D consistency, long-range coherence and object permanence.

As per the company’s website, “Our results suggest that scaling video generation models is a promising path toward building general purpose simulators of the physical world.”

Technically, the key difference between text and video generation is that the former requires understanding human logic, while the latter requires understanding the nuances of the physical world. The integration of Sora with advanced AI text models, such as large language models, could mark the advent of a universal simulator.

The prospect of such a system autonomously learning to navigate complex urban traffic by simulating a variety of driving scenarios is not just plausible; it is expected to happen in the foreseeable future.

Looking ahead, the potential integration of AI systems like ChatGPT and Sora with additional sensory modalities, including taste and touch, raises profound questions about the extent to which they could replicate the full spectrum of human experiences.

As the boundaries between simulation and reality become increasingly blurred, some entrenched beliefs about the nature of existence are being challenged.

This shift is prompting people to rethink their relationship with technology, especially as AI starts to mirror the intricacies of human life. This is also why, in the wake of Sora’s emergence, some people have expressed a fear of AI technology. It is not the technology itself that they fear, but the uncertain impact of technology on humanity’s future.

In other words, what people fear is the “unknown” that Sora brings. While the AI model’s immediate impact on the video and film industries is obvious, the long-term consequences - potentially vast and wide-ranging - remain largely hidden for now.

Researchers work on an AI-powered robot in Beijing, capital of China, on 31 January (XINHUA)

Pandora’s box or industrial revolution?

On 14 December 2023, the China Centre for Information Industry Development under the Ministry of Industry and Information Technology unveiled a report on the evolution of generative AI within China’s economic landscape. Highlighting the swift integration of this transformative technology across key sectors - manufacturing, retail, telecommunications and healthcare - the report showed an impressive adoption rate of 15 percent among Chinese enterprises in 2023, contributing to a burgeoning market valued at approximately 14.4 trillion yuan ($2 trillion).

The report’s forecast for the future of generative AI was optimistic, predicting that this technology could contribute nearly 90 trillion yuan ($12.52 trillion) more to the world economy by 2035, with China’s contribution expected to exceed 30 trillion yuan ($4.17 trillion), which the report put at 40 percent of this growth.

In an interview with China Central Television, Li Xiaodong, Vice President of the Internet Society of China and founder of the Fuxi Institution, a nonprofit research organisation focused on Internet innovation and development, noted how the widespread application of AI in a host of fields, from tech innovation to cultural creation and industrial manufacturing, is fuelled by increased computing power and reduced costs, bringing AI ever closer to the mainstream.

“AI will soon become a non-topic, since it is seamlessly woven into the fabric of our daily lives,” Li said.

In the short term, AI-generated content (AIGC) is poised to revolutionise content production by significantly lowering costs - a change reminiscent of historic milestones like papermaking and printing, which broadened access to knowledge.

The trajectory of AIGC, while unpredictable, could mirror past technological leaps that reshaped societal norms, such as the advent of camera-equipped mobile phones and smartphones, which led to the explosion of social media platforms like TikTok.

But the most disturbing potential feature of the AI revolution is that its benefits are unlikely to be shared equitably.

In an effort to regulate the burgeoning field of generative AI, China has introduced several regulatory frameworks, including the Regulations on the Administration of Deep Synthesis of Internet Information Services in January 2023 and the Interim Measures for the Management of Generative Artificial Intelligence Services in August that same year.

On the global stage, China supports the leadership of the United Nations and advocates for an AI governance model that respects the diverse policies and practices of nations around the world.
