
DALL-E produces 512 images for each prompt, which are then filtered by a separate model developed by OpenAI, called CLIP, down to what CLIP believes are the 32 “best” results.
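
A minimal sketch of that generate-then-rerank step, assuming the publicly released openai/clip-vit-base-patch32 checkpoint and the Hugging Face transformers library; the exact reranking setup OpenAI uses is not described in the article:

```python
# Sketch: score candidate images against the prompt with CLIP and keep
# the top 32. Assumes the public openai/clip-vit-base-patch32 checkpoint;
# OpenAI's production reranker may differ.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rerank(prompt: str, candidates: list[Image.Image], keep: int = 32):
    """Return the `keep` candidates CLIP scores highest for the prompt."""
    inputs = processor(text=[prompt], images=candidates,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    scores = out.logits_per_image.squeeze(1)  # one score per candidate
    best = scores.argsort(descending=True)[:keep]
    return [candidates[int(i)] for i in best]
```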

CLIP is trained on 400 million image-text pairs gathered from the internet. “We find image-text pairs across the internet and train a system to predict which pieces of text will be paired with which images,” says Alec Radford of OpenAI, who developed CLIP.
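
What Radford describes is a contrastive objective: within a training batch, each image should match its own caption better than any other caption, and vice versa. A minimal sketch of that symmetric loss, in which image_encoder and text_encoder are hypothetical stand-ins for CLIP’s vision and text networks:

```python
# Sketch of CLIP-style contrastive training on a batch of N image-text
# pairs. image_encoder and text_encoder are hypothetical stand-ins for
# CLIP's vision and text towers.
import torch
import torch.nn.functional as F

def clip_loss(images, texts, image_encoder, text_encoder, temperature=0.07):
    # Embed both modalities and L2-normalise, so dot products are cosines.
    img_emb = F.normalize(image_encoder(images), dim=-1)  # (N, d)
    txt_emb = F.normalize(text_encoder(texts), dim=-1)    # (N, d)

    # N x N similarity matrix: entry (i, j) scores image i against text j.
    logits = img_emb @ txt_emb.T / temperature

    # The correct pairings lie on the diagonal: image i goes with text i.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_images = F.cross_entropy(logits, targets)    # right text per image
    loss_texts = F.cross_entropy(logits.T, targets)   # right image per text
    return (loss_images + loss_texts) / 2
```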

“This is really impressive work,” says Serge Belongie at Cornell University, New York. He says further work is needed to explore the ethical implications of such a model, including the risk of creating convincing fake images, for example ones involving real people.

Effie Le Moignan at Newcastle University, UK, also calls the work impressive. “But the thing with natural language is although it’s clever, it’s very cultural and context-appropriate,” she says.

For instance, Le Moignan wonders whether DALL-E, confronted by a request to produce an image of Admiral Nelson wearing gold lamé pants, would put the military hero in leggings or underpants – potential evidence of the gap between British and American English.

“The more concepts that a system can sensibly blend together, the more likely it is that the AI system both understands the semantics of the request and can demonstrate that understanding creatively,” says Mark Riedl at the Georgia Institute of Technology in the US.


“I’m not entirely sure how to define what creativity is,” says Aditya Ramesh at OpenAI, who admits he was surprised by the range of images DALL-E produced.

DALL-E works by trying to understand the text prompt, then producing an appropriate image. It builds the image element by element, based on what it has understood from the text. If it has been given part of an existing image alongside the text, it also takes the visual elements of that image into account.

“We can give the model a prompt, like ‘a pentagonal green clock’, and given the preceding [elements], the model is trying to predict the next one,” says Ramesh.
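
In other words, generation is autoregressive: the text, and optionally part of an image, become one token sequence, and the model repeatedly predicts the next image token given everything before it. A rough sketch of such a sampling loop, where model, text_tokens and partial_image_tokens are hypothetical stand-ins (the real DALL-E pairs a transformer with a separately trained discrete VAE that converts images to tokens and back):

```python
# Sketch of DALL-E-style autoregressive sampling. `model` is a
# hypothetical transformer returning next-token logits of shape
# (batch, sequence_length, vocab_size).
import torch

def generate(model, text_tokens, partial_image_tokens=(),
             total_image_tokens=1024, temperature=1.0):
    """Sample image tokens one at a time, each conditioned on the text
    and on every token generated so far."""
    seq = list(text_tokens) + list(partial_image_tokens)
    for _ in range(total_image_tokens - len(partial_image_tokens)):
        with torch.no_grad():
            logits = model(torch.tensor([seq]))[0, -1]  # next-token scores
        probs = torch.softmax(logits / temperature, dim=-1)
        seq.append(int(torch.multinomial(probs, 1)))
    # A separate decoder (in DALL-E, the discrete VAE) turns the image
    # tokens back into pixels.
    return seq[len(text_tokens):]
```

Supplying part of an existing image, as in the T. rex example below, simply means prepending its tokens, so the model only has to predict the rest.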

For example, given an image of the head of a T. rex and the text prompt “a T. rex wearing a tuxedo”, DALL-E can draw the body of the T. rex under the head and add appropriate clothing.

