OpenAI, the organization behind some of the strongest AI tools, such as ChatGPT and DALL-E 3, is releasing its first video generator, called Sora. I’m not exaggerating when I tell you that my jaw dropped when I watched the first video created by Sora.
What is Sora AI?
Sora is an AI model that generates videos from text prompts. It can produce up to one minute of high-quality video.
Sora is a diffusion model, an AI technique with a distinctive approach to learning. Diffusion models start with clean data, such as videos or images, and gradually add noise until the original data is obscured.
Their power lies in reversing this process: they learn to remove the noise step by step to restore the original data. This is the basis for an AI system that can produce realistic results.
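The add-noise-then-remove-it idea can be sketched in a few lines of Python. This is only a toy illustration of a standard diffusion formulation, not Sora’s actual code: the linear noise schedule, the function names, and the four-value "image" are all assumptions made for the example, and the step where a trained network would predict the noise is replaced by simply handing back the true noise.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(clean, alpha_bar, rng):
    """Forward process: blend the clean data with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noisy = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
             for x, n in zip(clean, noise)]
    return noisy, noise

def remove_noise(noisy, predicted_noise, alpha_bar):
    """Reverse direction: estimate the clean data from a noise prediction."""
    return [(y - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for y, n in zip(noisy, predicted_noise)]

rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, 0.3]            # stand-in for one tiny "image"
alpha_bars = alpha_bar_schedule(1000)    # retention shrinks toward 0 over time

# After the last step almost none of the original signal is left.
noisy, true_noise = add_noise(pixels, alpha_bars[-1], rng)

# A trained network would predict the noise; here we cheat and reuse the true noise.
restored = remove_noise(noisy, true_noise, alpha_bars[-1])
```

With a perfect noise prediction, `restored` matches `pixels` up to floating-point error. In a real diffusion model the prediction is imperfect, which is why the noise is removed gradually over many steps rather than in one jump.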
To guide Sora, OpenAI uses GPT (the technology behind ChatGPT) to turn short text prompts into longer, detailed descriptions designed specifically for video generation. So even your simplest ideas are translated into precise, visually stunning results.
Here are a few examples
Here are some prompts as well as sample videos demonstrating the extraordinary abilities of Sora.
Prompt: A movie trailer featuring the adventures of a 30-year-old spaceman wearing a red wool-knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.
Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from its tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene. The dirt road curves gently into the distance, with no other cars or vehicles in sight. The trees on either side of the road are redwoods, with patches of greenery scattered throughout. The car is seen from the rear following the curve with ease, making it seem as if it is on a rugged drive through the rugged terrain. The dirt road itself is surrounded by steep hills and mountains, with a clear blue sky above with wispy clouds.
Prompt: An extreme close-up of a gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt, he wears a brown beret and glasses and has a very professorial appearance, and at the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film.
These examples are already miles better than what the competitors are capable of.
Keep in mind that these aren’t cherry-picked. OpenAI’s CEO, Sam Altman, is actively taking prompt requests on X and sharing the results.
Sora AI can animate DALL-E images
In addition to creating videos from text, Sora can also create videos using an image as the input.
Prompt: A Shiba Inu dog wearing a beret and a black turtleneck.
With this ability, Sora could plausibly be integrated into ChatGPT in the near future.
Sora AI can generate images
I was surprised to see how few people are discussing this feature: Sora can also generate images.
It does this by arranging patches of Gaussian noise in a spatial grid with a temporal extent of one frame. The model can produce images of different sizes, up to 2048 x 2048 resolution.
Here is an example:
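To make the "grid of noise patches" idea concrete, here is a minimal sketch in Python. The patch size of 16, the three color channels, and the function name are hypothetical choices for illustration only; the actual patch size and internal representation used by Sora are not stated here. The one detail taken from the text is that an image is treated as a video whose temporal extent is a single frame.

```python
import random

def noise_patch_grid(height, width, patch_size=16, frames=1, channels=3, seed=0):
    """Build a (frames, rows, cols) grid of Gaussian-noise patches.

    patch_size and channels are illustrative assumptions, not Sora's values.
    frames=1 corresponds to generating a still image.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rng = random.Random(seed)
    rows, cols = height // patch_size, width // patch_size
    values_per_patch = patch_size * patch_size * channels
    # Each patch is a flat list of independent standard-normal samples.
    return [[[[rng.gauss(0.0, 1.0) for _ in range(values_per_patch)]
              for _ in range(cols)]
             for _ in range(rows)]
            for _ in range(frames)]

# A small 64 x 96 "image": one frame, a 4 x 6 grid of patches.
grid = noise_patch_grid(height=64, width=96)
shape = (len(grid), len(grid[0]), len(grid[0][0]), len(grid[0][0][0]))
print(shape)
```

Under these assumed settings, a 2048 x 2048 image would correspond to a single frame with a 128 x 128 grid of patches. A diffusion model would then denoise this grid into the final picture, the same way it denoises a multi-frame grid into a video.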
Prompt: A snowy mountain village with cozy cabins and a northern lights display, high detail and photo realistic dslr, 50mm f/1.2
The example image looks even better than what DALL-E 3 can produce.
More Sora capabilities
When trained at a large scale, video models show a number of interesting emergent capabilities, such as these:
- 3D consistency: Sora can generate videos that feature dynamic camera movement.
- Long-range coherence and object permanence: Sora can generate multiple shots of the same character in one sample and maintain its appearance for the duration of the video.
- Interacting with the world: Sora can sometimes simulate actions that affect the state of the world in simple ways.
- Simulating digital environments: Sora can also simulate artificial processes. One example is video games.
Another exciting thing you can do with Sora is create 3D models from its videos. User metamike demonstrated this with a Santorini video that was turned into a 3D-like scene using the Poly.cam tool.
Ethical considerations and limits
Despite its incredible capabilities, Sora struggles to accurately simulate complex physical systems and to understand intricate cause-and-effect scenarios.
For instance, in the video below, the AI generates physically implausible motion.
Prompt: Step-printing scene of a person running, cinematic film shot in 35mm.
This is really bizarre.
Like the majority of AI models, Sora reflects the biases and limitations of its vast, human-generated training dataset.
Oh, and speaking of training the model, the current debate in the AI industry is whether AI companies should be required to credit and pay the people whose work is used in training.
Technology is evolving at a rapid pace and regulations are lagging behind.
Who’s at risk?
If anyone should be most scared of AI, it’s film studio executives and shareholders. If anyone with internet access can make and share a complete movie simply by typing a prompt into an AI tool, the gatekeepers of the TV and film industry are bound to be pushed toward obsolescence.
They’re currently trying to use AI to replace human creativity, but it could backfire on them. As the saying goes, they sow the wind and reap the whirlwind.
Should you be worried too?
“Smart people who aren’t afraid of change and seize opportunities won’t ever be replaced.”
Final Thoughts
It’s been a crazy week for the AI world, with the announcements of Google’s Gemini 1.5 and OpenAI’s Sora.
It was only a year ago that the Will Smith spaghetti video went viral, and now we’re getting close-to-realistic videos.
If development keeps up at this speed, we could soon have access to realistic video simulators that are limited only by our imagination. These applications could be revolutionary and disruptive across a variety of industries, such as gaming, film, content production, and much more.