Artificial intelligence has become shockingly capable in the past year. The latest chatbots can conduct fluid conversations, craft poems and even write computer code, while the latest image-makers can create fake “photos” that are virtually indistinguishable from the real thing.
It wasn’t always this way. As recently as two years ago, AI produced robotic text riddled with errors. Images were tiny and pixelated, and they lacked artistic appeal. The mere suggestion that AI might one day rival human capability and talent drew ridicule from academics.
A confluence of innovations has spurred this growth. Breakthroughs in mathematical modeling, improvements in hardware and computing power, and the emergence of massive, high-quality data sets have supercharged generative AI tools.
While artificial intelligence is likely to improve even further, experts say the past two years have been uniquely fertile. Here’s how it all happened so fast.
AI-generated responses: how three AI systems completed the text prompt “A clever name for a play about dinosaurs is”

2018 (OpenAI GPT): “to quote the most brilliant of examples : ‘who can say for sure they weren’t aliens or aliens or aliens? they had the appearance of aliens ; they wore black leather pants and they walked with much” This model does not follow the sentence structure.

2019 (GPT-2): “Dino-Brawl. Dinosaur-Brawl — which can be viewed in the player’s perspective — starts with Mr. D, a dino-hunter, chasing down a lone” This model gave a name but followed it with a confusing sentence.

2023 (ChatGPT): “Dino-Mite: The Mesozoic Extravaganza!” This model used a pun for the name and provided a subtitle.
A training transformation
Much of this recent growth stems from a new AI architecture called the transformer. This design allows the technology to process large blocks of language quickly and to test the fluency of the outcome.
It originated in a 2017 Google paper, “Attention Is All You Need,” that quickly became one of the field’s most influential pieces of research.
To understand how the model works, consider a simple sentence: “The cat went to the litter box.”
Previously, artificial intelligence models would analyze a sentence sequentially, processing the word “the” before moving on to “cat” and so on. This took time, and the software would often forget its earlier learning as it read new sentences, said Mark Riedl, a professor of computing at Georgia Tech.
The transformer model instead processes the relationships between all the words in a sentence at once, a method called attention. New AI models can examine “cat” alongside “litter” and “box.”
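To make the idea concrete, here is a toy sketch of attention in Python, using the NumPy library. The sentence length and the size of the word vectors are invented for illustration; none of this is any company’s actual code.

    import numpy as np

    def attention(Q, K, V):
        # Score every word against every other word in one matrix product.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        # A softmax turns each row of scores into weights that sum to 1.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # Each word's output is a blend of the words it "attends" to most.
        return weights @ V

    # Stand-in number vectors for "The cat went to the litter box."
    rng = np.random.default_rng(0)
    words = rng.normal(size=(7, 4))  # 7 words, 4 numbers per word
    print(attention(words, words, words).shape)  # (7, 4): one vector per word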
To make sure the AI performs correctly, the transformer model builds in a testing step. It masks a word in the sentence to see whether the AI can predict what’s missing. Additionally, companies such as OpenAI have humans rate the quality of the responses. For example, if the word “cat” is masked and the computer offers “the dog went to the litter box,” it’s likely to get a thumbs down.
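Readers can try the masked-word game directly. As one hedged example, the open-source Hugging Face transformers library offers a fill-mask tool; the model named below is a small public one, not any of the systems described above.

    # Requires: pip install transformers torch
    from transformers import pipeline

    # BERT is a small, public model trained with exactly this masking game.
    fill = pipeline("fill-mask", model="bert-base-uncased")
    for guess in fill("The [MASK] went to the litter box."):
        print(guess["token_str"], round(guess["score"], 2))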
The model allows AI tools to ingest billions of sentences and quickly recognize patterns, resulting in more natural-sounding responses.
Another new training method, called diffusion, has improved AI image generators such as Dall-E and Midjourney, allowing nearly anyone to create hyper-realistic images with simple, even nonsensical, text prompts, such as: “Draw me a picture of a rabbit in outer space.”
Researchers feed these AI models billions of images, each paired with a text description, teaching the computer to identify relationships between images and words.
The diffusion method then layers “noise” — visual clutter that looks like TV static — over the images. The AI system learns to recognize the noise and subtract it until the image is once again clear.
This process of corrupting and regenerating images teaches the AI to remove imperfections, refining each response until it is crisp and sharp. The system also learns the relationships between neighboring pixels, making the generated image more realistic.
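Here is a toy sketch of the “noising” half of that process in Python, using NumPy. The image and the noise schedule are invented for illustration; a real system trains a neural network to reverse this corruption step by step.

    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.uniform(size=(8, 8))  # stand-in for a clean training image

    def add_noise(img, t, T=100):
        # Blend the image toward pure static as step t approaches T.
        alpha = 1.0 - t / T                  # fraction of signal that survives
        static = rng.normal(size=img.shape)  # the TV-static "noise"
        noisy = np.sqrt(alpha) * img + np.sqrt(1 - alpha) * static
        return noisy, static

    # The model trains on pairs like this: given `noisy`, predict `static`,
    # so at generation time it can subtract noise until an image emerges.
    noisy, static = add_noise(image, t=50)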
AI-generated images: what three AI systems generated from the text prompt “A picture of a very clean living room”

2016 (Reed et al.): an image so small the details are impossible to see.

2022 (Stable Diffusion): an image that resembles a living room, but the furniture and mirror are disfigured.

2023 (Dall-E 2): an image with clean lines and reflections, but details like the coffee table’s legs are deformed.
The rise of the supercomputer
These bigger, more complicated AI models require advanced computer systems. This hardware, too, has become much more powerful.
Graphics processing units, or “GPUs” — circuits commonly used for video games — offer immense computing power and excel at running tasks simultaneously, a key requirement for training artificial intelligence models.
Researchers can string thousands of GPUs together into supercomputers that operate in the cloud. This lets people rent access to GPUs rather than build their own machines, making high-powered computing cheaper and more widely available.
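As a minimal illustration, using the open-source PyTorch library rather than any particular lab’s software, the same line of code fans out across thousands of GPU cores when a GPU is present; the matrix sizes here are arbitrary.

    # Requires: pip install torch
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.randn(4096, 4096, device=device)
    b = torch.randn(4096, 4096, device=device)
    c = a @ b  # one call; on a GPU, the work spreads across many cores at once
    print(device, c.shape)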
Better quality data
Over the past several years, the data sets used to train AI have gotten bigger and better, too.
Previous data sets had lower-quality content, Riedl said, like addresses or phone numbers, which aren’t helpful for learning natural speech.
Now, data sets include more news articles and book transcripts — content that shows how humans write and talk — which helps teach the AI to “speak” more fluidly.
As AI companies have trained more models, they have gotten better at creating high-quality data sets, Riedl said. Entire organizations are now dedicated to building them.
Scale AI, a San Francisco company, curates and labels training data sets for AI software. ImageNet, a data set started by Stanford University researchers, contains millions of labeled images. Common Crawl, a nonprofit that creates an open repository of web data, has developed text data sets from more than 12 years of internet scrapes. Some data sets have grown to contain billions of words.
A growing number of data sets are multimodal, meaning they combine images with text, which allows artificial intelligence systems to learn the relationship between the two. There has also been a rise in medical imaging data sets, fueling AI’s ability to spot diseases and ailments.
AI-generated responses: how three AI systems completed the text prompt “This is a haiku about baseball:”

2018 (OpenAI GPT): “a man who’s going to lose, a man who doesn’t need to lose. he wants the people to win by doing something he wants, rather than because they’re being manipulated. they’re playing the game” This model begins in a somewhat poetic place but quickly stops making sense.

2019 (GPT-2): “Fantasy Baseball: In the late 1800s the Yankees and his friends had a tournament where pitcher Francisco Gohara came to San Francisco from Texas at the height of what was called the Great Depression.” This model invents a sentence about a fake baseball player, seemingly ignoring the word “haiku.”

2023 (ChatGPT): “In summer’s embrace, / Bats crack, balls soar through the air, / Baseball’s timeless grace.” This model’s output follows the traditional 5-7-5 syllable pattern, though the first and third lines rhyme.
What’s next?
Experts say it’s hard to predict how much better AI will get. Major obstacles stand in the way of further development. These models are expensive to run and exact a staggering environmental toll. They confidently churn out wrong, nonsensical and sometimes biased answers, while creating lifelike images that could sow confusion.
As tech giants such as Google and Microsoft race to incorporate AI into their products, a slew of companies are trying to expand AI’s capabilities to generate video and music, and to create detection tools that screen for artificially generated content. Most people are likely to interact with this new technology in the near future. But how useful it will be, and what impact it will have on society, remains to be seen.