
OpenAI’s DALL·E 2 Is Surreal


About 15 months ago, OpenAI—famed for its eerily effective GPT-3 large language model—introduced an offshoot of that model: the cleverly named “DALL·E,” a 12-billion-parameter neural network that generates images from text prompts. Now, OpenAI has introduced a new version. DALL·E 2 promises higher resolution, better caption-matching, improved photorealism, and fewer harmful outputs.

First, the user provides a text description of the image they want: a combination of concepts, attributes, and styles. DALL·E 2’s website offers a simple illustration of the process: a user could, for instance, ask for an astronaut riding a horse as a pencil drawing; or, if they were feeling more adventurous, they could ask for a bowl of soup that is a portal to another dimension drawn on a cave wall.

DALL·E 2 outputs based on the prompts above. Images courtesy of OpenAI.

DALL·E 2, like its predecessor, was trained on a large dataset of captioned images, learning how textual descriptions relate to visual content. Using that understanding, DALL·E 2 generates an image that matches the provided caption as closely as it can—and as seen above, the results are breathtakingly accurate. DALL·E 2 can also edit existing images according to a caption (matching the shadows, lighting, and textures of the original) or create variations “inspired” by an original image.

Compared to DALL·E 1, DALL·E 2 offers 4× greater resolution. OpenAI also reports that the successor is 71.7% preferred for caption-matching and 88.8% preferred for photorealism.
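The “4×” refers to linear resolution, not pixel count. A quick sketch makes the distinction concrete; the output sizes used below (256×256 for DALL·E 1, 1024×1024 for DALL·E 2) come from OpenAI’s own announcements rather than this article:

```python
# Output side lengths per OpenAI's announcements (not stated in this article).
dalle1_side = 256    # DALL·E 1: 256x256 images
dalle2_side = 1024   # DALL·E 2: 1024x1024 images

linear_ratio = dalle2_side / dalle1_side              # 4x per side
pixel_ratio = (dalle2_side ** 2) / (dalle1_side ** 2) # 16x total pixels

print(f"{linear_ratio:.0f}x greater linear resolution")  # 4x
print(f"{pixel_ratio:.0f}x more pixels")                 # 16x
```

So a “4× greater resolution” claim translates to sixteen times as many pixels per image.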

However, as with any generative model—GPT-3 and DALL·E 2 included—there is an unfortunate reality: the model will sometimes produce outputs that mirror biases in its training data or are otherwise harmful. “Without sufficient guardrails, models like DALL·E 2 could be used to generate a wide range of deceptive and otherwise harmful content, and could affect how people perceive the authenticity of content more generally,” OpenAI wrote on GitHub. “DALL·E 2 additionally inherits various biases from its training data, and its outputs sometimes reinforce societal stereotypes.”

To combat this, DALL·E 2—which has a content policy that forbids violent, adult, or political content, among other categories—was trained with a new dataset that excluded “the most explicit content.” OpenAI has also been working with select early users for over a month to identify other areas for improvement. Over the course of that time, those users have created more than three million images, and OpenAI says that less than 0.05% of downloaded or publicly shared images were flagged for potential content policy violations, with 30% of those confirmed as policy violations.
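A back-of-the-envelope calculation puts those percentages in perspective. The three-million figure and both rates come from the article; the resulting counts are illustrative upper bounds, since OpenAI reported “more than” three million images and “less than” 0.05% flagged:

```python
# Rough counts implied by the figures OpenAI reported.
images_created = 3_000_000   # "more than three million images"
flag_rate = 0.0005           # "less than 0.05%" flagged
confirmation_rate = 0.30     # "30% of those confirmed as policy violations"

flagged = images_created * flag_rate
confirmed = flagged * confirmation_rate

print(f"Flagged: at most ~{flagged:,.0f} images")     # ~1,500
print(f"Confirmed: at most ~{confirmed:,.0f} images") # ~450
```

In other words, of more than three million generated images, on the order of 1,500 were flagged and roughly 450 confirmed as violations.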

DALL·E 2 is still in that early testing phase, but some of the tool’s creations can be found on its Instagram, along with their associated captions.

