OpenAI’s Mark Chen on Building an AI System for Generating Realistic Images and Art.
Given the growing interest in generative AI, we revisit a conversation with Mark Chen, Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art based on natural language descriptions. We discuss novel applications of DALL·E, as well as CLIP and the other key research developments that led to DALL·E 2. We also explore how DALL·E was built, including the data sources used, the safety and quality assurance measures put in place, and the machine learning models behind DALL·E 2.
In general, at OpenAI, we train models on a large number of GPUs. … Ray is an amazing tool. I don’t know too much about the specifics of how Ray is integrated, but I think we have parts of their system built into certain components that we built on our own infrastructure. But yes, we think highly of Ray.
Highlights in the video version:
- Introduction to Mark Chen
- What is DALL·E and who are the target users?
- Use cases, evaluating models, training data, and quality checks
- Evolution from DALL·E 1 to CLIP to DALL·E 2
- DALL·E, the introduction of CLIP, and diffusion models
- Origin story of DALL·E and a description of it now
- Do you build your own tools for training?
- Importance of clean and fair training data
- Broader trends and transformers
- A video version of this conversation is available on our YouTube channel.
- Holistic Evaluation of Language Models
- Roy Schwartz: Efficient Methods for Natural Language Processing
- Barret Zoph and Liam Fedus: Efficient Scaling of Language Models
- Connor Leahy and Yoav Shoham: Large Language Models
- Foundation Models: A Primer for Investors and Builders
- Jack Clark: The 2022 AI Index
- Piotr Żelasko: The Unreasonable Effectiveness of Speech Data
- fastdup: Introducing a new free tool for curating image datasets at scale
[Image: Cliff Notes by Ben Lorica, generated with images from DALL·E.]