In short: Imagine being able to describe an image in words and have an AI turn that description into a photorealistic picture. That is the promise of the updated version of the program we first saw last year, and the results look impressive.
DALL-E 2 comes from OpenAI, the San Francisco research lab behind artificial intelligence models like GPT-2 and GPT-3, which can generate convincing text, and systems that have beaten top human players in games like Dota 2.
DALL-E 2, whose name combines the surrealist artist Salvador Dalí and the Pixar robot WALL-E, is the second iteration of the neural network we first saw last January, and it offers higher resolution and lower latency than the original. The images it generates are now 1024 x 1024 pixels, a significant step up from the original's 256 x 256.
Thanks to a revamped version of OpenAI’s CLIP image-recognition system, dubbed unCLIP, DALL-E 2 can turn arbitrary text into vivid images, even ones surreal enough to rival Dalí himself. Ask for a koala playing basketball or a monkey paying taxes, for example, and the AI produces eerily convincing renderings of those descriptions.
The new system switched to a process called diffusion, which starts from a pattern of random dots and gradually alters that pattern toward a finished image as it recognizes specific aspects of it.
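The denoising idea behind diffusion can be illustrated with a toy sketch. This is purely illustrative and not OpenAI's model: in a real diffusion model, a trained neural network predicts what noise to remove at each step, whereas here we cheat and step directly toward a known target array standing in for "the image".

```python
import numpy as np

# Toy sketch of the diffusion idea: start from pure random noise and
# repeatedly nudge the pattern toward a target image, the way a trained
# denoiser steers noise toward a plausible photo step by step.
rng = np.random.default_rng(0)
target = np.linspace(0.0, 1.0, 16).reshape(4, 4)  # stand-in for "the image"
x = rng.standard_normal((4, 4))                   # start: random dots

for step in range(50):
    # A real model would predict the noise to remove with a neural net;
    # here we simply move a fraction of the way toward the known target.
    x = x + 0.1 * (target - x)

error = float(np.abs(x - target).mean())
print(error)  # small residual: the noise has converged toward the image
```

After 50 steps the random pattern is nearly indistinguishable from the target, which is the intuition the article describes: the dots gradually "become" the image.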
DALL-E 2 can do more than create new pictures from text. It can also edit parts of existing images: highlight someone’s head, for example, and ask it to add a funny hat. It can even generate variations of the same image, each with different styles, content, or angles.
“This is another example of what I think will be the next trend in the computer interface: you say what you want, in natural language or with contextual cues, and the computer does it,” said Sam Altman, CEO of OpenAI. “We can imagine an ‘artificial intelligence office worker’ who takes in requests in natural language, just like a human does.”
These kinds of image-generating AI carry an inherent risk of misuse. OpenAI has put some safeguards in place: the system cannot generate faces based on a real person’s name, and inappropriate content cannot be uploaded or created, keeping things strictly family-friendly. Prohibited topics include hate, harassment, violence, self-harm, explicit or shocking imagery, illegal activities, deception such as fake news, political figures or situations, medical or disease-related images, and general spam.
Users must also disclose that the images were created by artificial intelligence, and each image carries a watermark marking it as such.
The Verge writes that researchers can sign up online to preview the system. It is not being released publicly, although OpenAI hopes to eventually make it available for use in third-party applications.