It’s like Photoshop’s Warp tool, but far more powerful. You’re not just smushing pixels around, but using AI to re-generate the underlying object. You can even rotate images as if they were 3D.
No, it’s not over yet: the ability of AI tools to manipulate images continues to grow. The latest example is only a research paper for now, but a very impressive one, letting users simply drag elements of a picture to change their appearance.
This doesn’t sound too exciting on the face of it, but take a look at the examples below to get an idea of what this system can do.
Not only can you change the dimensions of a car or manipulate a smile into a frown with a simple click and drag, but you can rotate a picture’s subject as if it were a 3D model, changing the direction someone is facing, for example. One demo even shows the user adjusting the reflections on a lake and the height of a mountain range with a few clicks.
Here’s an overview of the tool working on various subjects:
Here’s a closer look at landscape manipulation:
And just for fun, messing about with lions:
These videos come from the research team’s homepage, though the site has been crashing under the traffic sent its way from Twitter (mainly by user @_akhaliq, who does a fantastic job highlighting interesting AI papers and is well worth a follow if that interests you). You can also read the research paper on arXiv.
As the team responsible note, what’s really interesting about this work is not necessarily the image manipulation per se, but the user interface. We’ve been able to use AI tools like GANs to generate realistic images for a while now, but most methods lack flexibility and precision. You can tell an AI image generator to “make a picture of a lion stalking through the savannah,” and you’ll get one, but it might not be in the exact pose you want or need.
This model, named DragGAN, offers a clear solution to this. The interface is exactly the same as traditional image-warping, but rather than simply smudging and mushing existing pixels, the model generates the subject anew. As the researchers write: “[O]ur approach can hallucinate occluded content, like the teeth inside a lion’s mouth, and can deform following the object’s rigidity, like the bending of a horse leg.”
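To make the difference between warping and regenerating a little more concrete, here is a deliberately tiny Python sketch. It is not DragGAN's code and makes no claim to match the paper's method: the `generate` function is a hypothetical stand-in for a GAN that just draws a soft blob, and the `drag` loop assumes the handle point moves in lockstep with the latent, which a real generator would not guarantee. The only point it illustrates is that the drag updates the generator's input and re-renders the image, rather than pushing existing pixels around.

```python
import numpy as np

def generate(latent, size=32):
    """Toy stand-in for a generator (assumption, not a real GAN):
    renders a soft blob whose centre is the 2D latent (x, y)."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = latent
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * 3.0 ** 2))

def drag(latent, handle, target, step=0.5, max_iters=200):
    """Nudge the latent in small steps until the handle point reaches
    the target, then render a fresh image from the final latent."""
    latent = np.asarray(latent, dtype=float)
    handle = np.asarray(handle, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_iters):
        offset = target - handle
        dist = np.linalg.norm(offset)
        if dist < 1e-6:
            break
        # Take at most `step` pixels per iteration, without overshooting.
        move = offset * min(1.0, step / dist)
        # Toy assumption: the handle moves exactly as the latent does.
        latent = latent + move
        handle = handle + move
    return latent, generate(latent)

# "Drag" the blob from (10, 10) to (20, 15); the image is re-rendered, not warped.
new_latent, new_image = drag(latent=(10, 10), handle=(10, 10), target=(20, 15))
```

In the real system the latent is a high-dimensional code and the relationship between it and any point in the picture has to be discovered by optimization and tracking, which is where the hard research work lies.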
Obviously this is just a demo for now, and it’s impossible to evaluate the tech completely. (How realistic are the end images, for example? It’s hard to say based on the low-res videos available.) But it’s another example of image manipulation becoming more accessible.