Summary
After months of teasing, OpenAI’s Sora video generation technology is available to the public. I spent some time playing with this much-anticipated tech, and honestly came away a little underwhelmed.
ChatGPT Plus Subscribers Get a 5-Second Taste Test of Sora
All you have to do is put your prompt into the text box, and a few seconds later you have a video clip, pretty much how Midjourney or other AI image generators work from the user’s perspective.
Even Short Clips Are Very Hit-and-Miss
One of the main reasons that the “full” Sora experience is limited to 20 seconds is that there are still significant issues with this technology when it comes to coherence. The longer the video goes on, the more mistakes it makes and the more weird tangents it takes.
That issue aside, it had a hard time visualizing what I put in my prompts. For example, I asked it for a clip of a starship going into warp, which is a pretty common sci-fi trope.
Well, it’s sort of what I had in mind, but I wouldn’t put that in my half-baked YouTube talking head video.
At other times, it’s pretty spot-on, such as when I asked for a spinning chrome HTG logo.
The last bit of trouble Sora currently has is with any sort of physics. I’ve seen plenty of videos featuring animals that just don’t move in a believable way, and when I asked for something simple (a ball bearing running on a rail), it gave me this strange video.
Even when videos are visually perfect, it’s usually the motion that gives it away as an AI-generated clip.
Sora Feels Much Less Mature Than Image Generation
I don’t want to create the impression that Sora isn’t impressive. It’s a major achievement, but actually using it feels like the early days of image generation. This wouldn’t be so apparent if not for Google’s precisely timed announcement of Veo 2.
The videos from that system look so much better than Sora’s, particularly when it comes to the physics of moving objects looking correct.
Just check out this official compilation from Google.
While one might argue these are cherry-picked, a few YouTubers have had access to Veo 2, and the opinion seems to be that Veo 2 comes out on top by quite a margin.
For Now, It’s Just a Fun Toy
Getting to play around with Sora for a bit thanks to a subscription I already have was fun, but I certainly wouldn’t want to pay the $200 a month fee for this product in its current state. You’d be much better off simply subscribing to a stock video service.
Looking at what Google’s cooked up, and considering that there are other competitors in this space like HeyGen and Runway ML, I expect updates and improvements to be fast and frequent, if for no other reason than OpenAI being relentless in its improvement of ChatGPT.
I still see a medium-term future where AI video generation will be capable of so much more, allowing longer-form content to be generated with precise prompt adherence and elements within a scene to be edited. However, that day is still likely a few years away, and for now it’s an interesting if impractical curiosity.