How DALL·E works

Mark Chen on building AI models for image generation.

Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.

Mark Chen is a Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art based on natural language descriptions. We discussed novel applications of DALL·E and the key research developments that led to DALL·E 2. We also delved into how DALL·E is built, including the data the team uses, the safety and quality assurance tests they have in place, and the ML models needed to make DALL·E 2 work.

To learn more about Ray and how to scale machine learning applications, attend the Ray Summit (San Francisco / Aug 23-24).

Mark Chen:

In general, at OpenAI, we train models on a large number of GPUs. … Ray is an amazing tool. I don’t know too much about the specifics of how Ray is integrated, but I think we have parts of their system built into certain components that we built on our own infrastructure. But yes, we think highly of Ray.

[Image: Active learning lifecycle, from “DALL·E 2 Pre-Training Mitigations”.]


Attend Ray Summit: Save 25% on your pass with the discount code Ben25



[Image: A high-level overview of unCLIP, from “Hierarchical Text-Conditional Image Generation with CLIP Latents”.]