Mark Chen on building AI models for image generation.
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Mark Chen is a Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art from natural language descriptions. We discussed novel applications of DALL·E and the key research developments that led to DALL·E 2. We also delve into how DALL·E is built, including the data the team uses, the safety and quality assurance tests they have in place, and the ML models needed to make DALL·E 2 work.
Mark Chen:
In general, at OpenAI, we train models on a large number of GPUs. … Ray is an amazing tool. I don’t know too much about the specifics of how Ray is integrated, but I think we have parts of their system built into certain components that we built on our own infrastructure. But yes, we think highly of Ray.
[Image: The active learning lifecycle, from “DALL·E 2 Pre-Training Mitigations”.]
Highlights in the video version:
- Introduction to Mark Chen
- What is DALL·E and who are the target users?
- Use cases, evaluating models, training data, and quality checks
- Evolution from DALL·E 1 to CLIP to DALL·E 2
- DALL·E, CLIP, and diffusion models
- The origin story of DALL·E and how Mark would describe it today
- Do you build your own tools for training?
- The importance of clean and fair training data
- Broader trends and transformers
Related content:
- A video version of this conversation is available on our YouTube channel.
- Jack Clark: The 2022 AI Index
- Hilary Mason: Narrative AI
- Nic Hohn and Max Pumperla: Reinforcement Learning in Real-World Applications
- Leo Meyerovich: The Graph Intelligence Stack
- Connor Leahy and Yoav Shoham: Large Language Models
Attend Ray Summit: Save 25% on your pass with the discount code Ben25
[Image: A high-level overview of unCLIP, from “Hierarchical Text-Conditional Image Generation with CLIP Latents”.]