❤️ Become The AI Epiphany Patreon ❤️ ►

In this video I cover DALL-E, the “Zero-Shot Text-to-Image Generation” paper by the OpenAI team.

They train a VQ-VAE to learn compressed, discrete image representations, and then train an autoregressive transformer over that discrete latent space together with BPE-encoded text tokens.
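The core of stage 1 is vector quantization: continuous encoder outputs are snapped to their nearest codebook vector, producing discrete token ids that the stage-2 transformer can model autoregressively. Below is a minimal NumPy sketch of just that lookup step; the function name `quantize` and the toy codebook are illustrative assumptions, not from the paper.

```python
import numpy as np

def quantize(z, codebook):
    """Nearest-neighbor codebook lookup (illustrative sketch).

    z: (n, d) continuous encoder outputs.
    codebook: (K, d) learned embedding vectors.
    Returns (token ids, quantized vectors).
    """
    # Squared Euclidean distance from every z to every codebook entry.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    ids = d2.argmin(axis=1)       # index of the nearest codebook entry
    return ids, codebook[ids]     # discrete ids + quantized latents

# Toy demo: two 2-D codebook entries, two inputs near each of them.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2]])
ids, zq = quantize(z, codebook)
print(ids)  # -> [0 1]
```

In the full model these ids form a grid of image tokens that is flattened and appended to the text tokens for the stage-2 transformer.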

The model learns to combine distinct concepts in plausible ways, and capabilities such as image-to-image translation emerge.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper:

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

⌚️ Timetable:

00:00 What is DALL-E?
03:25 VQ-VAE blur problems
05:15 transformers, transformers, transformers!
07:10 Stage 1 and Stage 2 explained
07:30 Stage 1 VQ-VAE recap
10:00 Stage 2 autoregressive transformer
10:45 Some notes on ELBO
13:05 VQ-VAE modifications
17:20 Stage 2 in-depth
23:00 Results
24:25 Engineering, engineering,…


