Update on Sora, Luma and Runway Gen-2: AI text-and-image machines.
Recently I posted this:
From Richard: We knew it was coming, but it's still hard to believe
In that blog, Richard introduced me to SORA, which I found astonishing. Some of you may have been as astonished as I was.
Since then, Richard has sent me a new lead on "LUMA DREAM MACHINE", which in some respects offers more than SORA did.
Here's a link to their main page which has been going viral:
https://lumalabs.ai/dream-machine
Well, I've been playing with Luma for a few days now, and to-ing and fro-ing with Richard and Alix... some of the results took my breath away, some were totally disappointing.
Let's start with an early test piece... drawn from one of my interests, Cave Art.
An elderly neolithic man and two younger companions enter a dark cave in fur skin costumes and with ropes carried over their shoulders. The older man is carrying a lit torch formed of bound sticks.
Here's what Luma created from that prompt:
Then I took a single frame from that video and added it into a second prompt, saying:
add another young man behind left shoulder of older man all men carrying ropes.
Well, what came back was a great surprise: Luma added quite a few young men, not just "another" young man. And no ropes!
Later I made some text modifications to improve the prompts and see what results I might get back.
To cut a long story short, I received many videos with stupid errors that were total misreads of what I was asking Luma for. No matter how precisely I framed the prompts, I got back a lot of rubbish, but it was interesting rubbish, and it made me imagine more than I had started off with.
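(For the technically minded among you: all of this was done through Luma's web page, but the loop I fell into (write a prompt, attach a frame, wait, groan, rewrite) is simple enough to sketch in code. The sketch below is purely hypothetical: Luma offered no public API when I was testing, and the endpoint, field names and key in it are my own inventions for illustration only.)

```python
import requests  # real HTTP library; everything else here is hypothetical

# A sketch of the prompt-iteration loop described above. Luma had no public
# API at the time of writing, so the endpoint, credential and field names
# below are invented placeholders, not Luma's real interface.
API_URL = "https://example.com/v1/video-generations"  # placeholder endpoint
API_KEY = "YOUR_KEY_HERE"                             # placeholder credential

def generate_clip(prompt: str, start_frame: str | None = None) -> bytes:
    """Submit a text prompt, optionally conditioned on a still frame,
    and return the finished video's bytes."""
    files = {"start_frame": open(start_frame, "rb")} if start_frame else None
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        data={"prompt": prompt},
        files=files,  # attaching a frame makes this image-to-video
    )
    resp.raise_for_status()
    return resp.content

# The two attempts described above: pure text first, then a still frame
# from the first result plus a follow-up instruction.
clip1 = generate_clip(
    "An elderly neolithic man and two younger companions enter a dark cave "
    "in fur skin costumes and with ropes carried over their shoulders."
)
open("cave_v1.mp4", "wb").write(clip1)

clip2 = generate_clip(
    "add another young man behind left shoulder of older man "
    "all men carrying ropes.",
    start_frame="cave_v1_frame.png",  # a still exported from the first clip
)
open("cave_v2.mp4", "wb").write(clip2)
```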
Then I edited a short video from all those tests and here it is:
Well, it was a lot of fun to go through all that; it has given me plenty to think about for the future, and together with Alix and Richard I'm contemplating new work using this AI imaging for movies. But although Luma is astonishing, the engine simply cannot read precise text directions... it morphs when you don't want it to, and when you do want it to, it doesn't morph!
I wanted the dancing Shaman at the end of that clip to be a morphed character, combining a Whirling Dervish and a Salome-type dance, but what Luma gave me was this silly fellow doing a soft shoe shuffle.
A basic two-foot routine!
Here's another sample of Luma's "issues"...
I wanted to use an image of a stately ancestral home with "women coming and going" as per Eliot's wonderful Prufrock poem:
"in the room the women come and go"
so from the net I selected this stately interior:
At first I asked for just one woman, and I got that. Then I asked Luma for a second woman, walking from those chairs in mid-shot towards the camera, central in frame. Instead, I received a panning shot to an entirely different room, outside the original shot I had offered!
You see, it just goes crazy from the 5-second mark!
No matter how hard I tried, I simply could not get anything close to what I was asking for, and most frustrating of all, I was not asking Luma to pan outside the static frame of that lovely image I gave it to start with.
Richard had also mentioned Runway Gen-2 and Gen-3 Alpha.
Gen-3 Alpha is not here yet, but I've now tried Gen-2.
I had similar issues with my Gen-2 tests, but at least I could give them feedback, which I could not do with LUMA. Gen-2 offered me a 1-5 star rating, and when I gave them 1 star, they asked why.
I ticked, "Didn't give me what I had asked for in the prompt".
Waiting to hear back from them!
But it sure helps to be able to communicate with the provider and let them know what's going wrong.
I wish they could teach these AI bots to read!
Now, for an entirely different view of all this: my friend Dainis sent me a lovely work, which he must have picked up from the net. Maybe you have all seen it, but if you have not, here it is.
I love the possibilities which these AI engines offer us.
I've been working on some material from footage I inherited of a Melbourne couple who took a world trip in 1950/51. The man recorded the trip he and his wife had embarked upon using an old-fashioned Bolex-type 16mm camera, on the earlier Kodachrome stock, not the improved Kodachrome II which came out a few years later, when I was about 18.
That footage is now more than 70 years old and seriously degraded... the film has become warped and shrunken, and has discoloured towards very pale colours, nothing like the intensity of the two Kodachrome stocks.
I offer you a very small sample of this 1951 MOOMBA parade footage. First, the original footage as copied for me by Callum at Memorylab:
Second: my restoration of that clip:
It has taken me many years to get that restoration working for me, and I simply could not have done it without stunning AI tools like DaVinci and Topaz.
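(For anyone curious about what that restoration involves under the hood: my own passes were done inside DaVinci Resolve and Topaz Video AI, which are point-and-click tools, so there is no script of mine to show. As a rough stand-in, here is a sketch of one comparable cleanup pass using the free ffmpeg tool instead. The filter values and filenames are placeholder guesses for illustration, not the settings I actually used, and ffmpeg's classical filters are far cruder than Topaz's machine-learning models.)

```python
import subprocess

# A crude stand-in for a single restoration pass on the faded Kodachrome
# scan. My real pipeline used DaVinci Resolve and Topaz Video AI; this
# sketch runs ffmpeg's standard filters with placeholder settings instead.
#   hqdn3d  : spatial/temporal denoise (grain, speckle)
#   eq      : push saturation and contrast back up on the faded colours
#   unsharp : mild sharpening to recover edges softened by the denoise
filters = "hqdn3d=4:3:6:4,eq=saturation=1.4:contrast=1.1,unsharp=5:5:0.8"

subprocess.run(
    [
        "ffmpeg",
        "-i", "moomba_1951_scan.mov",  # placeholder input filename
        "-vf", filters,
        "-c:v", "libx264", "-crf", "18",
        "moomba_1951_cleaned.mp4",     # placeholder output filename
    ],
    check=True,
)
```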
Many thanks to Richard Leigh, Alix Jackson, Dainis, Fred Harden and all who have assisted me in these recent times.
Hoping you will send me plenty of commentary.
pt
ATTACHMENTS:
Facial distortions I mentioned in my response to Richard's commentary:
It's all such a fascinating new technology to discover. The gallery short is terrific, amazing to see all the artworks come to life.
Your restoration comparison of the Moomba clip is also fantastic!! Looks beautiful, and such an excellent use of the tech, to breathe such fresh life into what sounds like very degraded source material.
I think the possibilities for creatives are only really limited by our creativity, the render speed of the AI, and its ability to understand simple directions!!
It'll only get better as it realises that when we're saying "come from the left", we mean our left, not stage left 😁
I did a little test, animating a still of myself from 20 years ago; I was going for an android dreaming of electric sheep. https://drive.google.com/file/d/13pKSjmBUebx1ztLdfatX95DgwLoKVyrQ/view?usp=drivesdk
Using it in a variety of ways, blending with editing, Photoshop, VFX, SFX, great acting, nice prompts, we will see brand new ways of telling stories, I believe 🙂
It's a delight to have you on board, Alix, and we will do great new work. These AI bots have just got to learn to read!
The more I see you do, Peter, the more I see just how much work it takes to make it work for you.
It's actually quite comforting knowing that it's not easy. It's enough hassle for people wanting it to do 'serious work' for them, that they'll surely give up and go, "meh, we'll ask (=pay) for someone to do it for me who knows what they're doing". That's the cynical, job-worrying part of me!
The more creative part of me thinks, well, it is really what they call it - a 'dream machine' - full of wild chance and creative oddities, not unlike a dream. Some fresh ideas, some random associations, the spark of connections you might not have thought of, and all not much more controllable than an actual dream!
Fascinating, for sure.
How good can it get in terms of realism, and what even IS 'good' in this context?
What are the developers aiming for?
One thing you haven't tried yet, Peter, is the "in the style of..." prompt.
I'm not sure if Luma does it, but I remember Sora being sold on the idea that you can specify different looks, genres (including 'animation' rather than simply 'realism'). Could be worth a play.
And with all these new questions and limitations, I'm left feeling something like this:
YES, Pandora's box is open, but
NO, the world will not suddenly end because what's inside the box is quite a mess!
I can keep my job making videos... for now, at least.
And for that I am thankful. Grateful to you, too, for showing the tough road ahead for anyone expecting an easy ride on the text-to-video path.
Thanks Richard, your thoughts definitely summarise the issues and quandaries we all have to face here. There are issues of a socio-political nature and ethical problems, aside from the overarching question "what is real?" If the AI bot creates a new image which is apparently realistic but has glitches such as a missing foot or a hand with a thumb and seven fingers, which we might consider "accidental", it is nevertheless creating a new reality, the "reality" which we attach to that new image.
In the case of the people on the float from "Moomba 1951", which I have "upgraded" and "upscaled", sharpened and made richer in colour, there are "artefacts" which were not present in the original footage. The worst one for me occurs when the young woman addresses someone out of picture, and I think she says "Thank you", as if to a compliment from that person we can't see. But as she says this, her face distorts. Now this upsets me greatly, because everything else in the shot is so "real", and the facial distortion occurs at a most important moment in the shot. For the AI bot that moment is neither more nor less important than any other moment, but for me it is a crucial moment!
There are so many other issues I could talk about, but I don't want to prattle on here. My biggest concern is that these new AI imaging tools are being used to harm other people, many of them quite innocent and not even aware that their deep-faked images are being used on porn sites!
I'll leave it at that for now Richard... I'm hoping that we might hear from others on ways they are dealing with these conundrums!
pt
Some samples of her facial distortion, which occur within one second of the footage, are attached at the end of the original post.
pt
I'm posting this on behalf of our friend David King because he has been unable to post it on his own. Who knows why that might be? Possibly a confusion with someone else who goes by the name David King and has been blocked by Blogger? Anyhow, here is what our friend DJK tried to post:
The gallery shots posted by Dainis are frankly incredible, mind-bending and gob-smacking, and certainly make me, as an artist, interested in using AI.
However, I have refrained from indulging myself for one major reason - the massive negative impact AI technology is having on the environment.
I was first made aware of this issue by Terry Flaxton, a UK-based video artist who worked with the likes of Bill Viola and Nam June Paik and is still working today. Ever since he opened my eyes to the enormous damage AI technology is doing to our environment (thousands of times the damage being done by personal computing technology), the issue has refused to go away and made me feel deeply uncomfortable about the direction technology is taking us.
Earth.org has published articles about the type and amount of damage being done by AI on top of other technologies and it makes for very uncomfortable reading.
This is in spite of having recently had some video clips saved by Peter's use of AI in the form of Topaz (which is a very basic form of AI and probably not a major culprit regarding environmental damage), and having been deeply tempted to try out Gen-2 with a script for an animated film which never got made, due to my inability to find an animator interested in helping me make it. It's obvious that AI could do the job straight from the script. There might be a few glitches, such as hands not looking very realistic or facial distortions, but the idea would finally live on the screen for all to see. So yes, very tempting indeed.
But are we just going to ignore the massive damage being inflicted on the environment in pursuit of our artistic dreams and ambitions?
It doesn't seem like anyone is talking much about this issue so I felt it was important to bring it up.
A link to an earth.org article: "The Real Environmental Impact of AI" by Alokya Kanungo, Earth.Org.
Cheers
David.