
Y Combinator Startup Podcast · July 1, 2025
Fei-Fei Li: Spatial Intelligence is the Next Frontier in AI
Highlights from the Episode
Fei-Fei Li, godmother of AI
00:01:23 - 00:04:32
ImageNet's origin and data-driven paradigm shift
ImageNet was conceived almost 18 years ago. Time truly flies. I was a first-year assistant professor at Princeton. The world of AI and machine learning was vastly different then. There was very little data, and algorithms, especially in computer vision, didn't work well. There was no industry, and the public wasn't even aware of the term AI. Yet, a group of us, from the founding fathers of AI like John McCarthy to people like Geoff Hinton, shared an AI dream. We truly wanted machines to think and work. My personal dream was to make machines see, as seeing is a cornerstone of intelligence.
Fei-Fei Li, godmother of AI
00:05:00 - 00:08:29
The AlexNet breakthrough and convergence of AI elements
In 2009, we published a small CVPR poster. During the three years between then and 2012, leading up to AlexNet, we strongly believed that data would drive AI. However, we had very little indication of whether our approach was working. To address this, we took two key actions. First, we open-sourced our work from the beginning, believing it was crucial for the entire research community to collaborate on this. Second, we created the ImageNet challenge. Our goal was to engage the world's brightest students and researchers in solving this problem. Each year, we released a new testing dataset for the challenge.
Fei-Fei Li, godmother of AI
00:08:55 - 00:12:37
Evolution of AI: From object recognition to scene understanding
Image recognition solved the problem of identifying objects within a visual scene, like a cat or a chair. This is fundamental to visual recognition. However, ever since I was a graduate student entering the AI field, I've had a dream, one I once thought was a 100-year dream: storytelling of the world. When humans open their eyes, they don't just see individual objects. For example, if you open your eyes in this room, you don't just see "person, person, chair." Instead, you perceive a conference room with a screen, a stage, people, a crowd, and cameras. You can describe the entire scene. This human ability is at the foundation of visual intelligence.
Fei-Fei Li, godmother of AI
00:13:07 - 00:17:49
Spatial intelligence as the next frontier for AGI
Consider vision: the ability to understand, navigate, interact with, comprehend, and communicate about the 3D world. This journey took evolution 540 million years. The first trilobite developed vision underwater 540 million years ago. Vision then triggered an evolutionary arms race. Before vision, animals were simple for half a billion years. But the next 540 million years, with the ability to see and understand the world, sparked this evolutionary arms race. Animal intelligence began to escalate. For me, solving the problem of spatial intelligence—understanding, generating, reasoning about, and acting within the 3D world—is a fundamental AI challenge.
Fei-Fei Li, godmother of AI
00:18:41 - 00:21:48
Challenges of 3D spatial intelligence versus 1D language models
I really appreciate Diana emphasizing how challenging our problem is. Language is fundamentally one-dimensional; syllables come in sequence. This is why sequence-to-sequence modeling is so classic. There's something else about language that people often overlook: it's purely generative. Language doesn't exist in nature; you can't touch or see it. It literally originates from within each person's mind, a purely generative signal. While you can write it down, the generation, construction, and utility of language are inherently generative. The real world, however, is far more complex. It's three-dimensional, and with time, it becomes four-dimensional.
Fei-Fei Li, godmother of AI
00:26:35 - 00:29:25
Entrepreneurial spirit and intellectual fearlessness
I love being an entrepreneur. I particularly enjoy the feeling of starting from ground zero. It's about forgetting past achievements and external opinions, then simply hunkering down to build. That's my comfort zone, and I truly love it.
Fei-Fei Li, godmother of AI
00:29:54 - 00:31:39
Hiring criteria: The importance of intellectual fearlessness
One thing unifies successful individuals, and I encourage everyone to consider this. For founders hiring, this is also my primary hiring criterion: I look for intellectual fearlessness. Your background doesn't matter, and neither does the problem we're solving. The courage and fearlessness to embrace difficult challenges, commit fully, and strive to solve them is a core characteristic of successful people. I learned this from them. I specifically seek young individuals who possess this quality. As CEO at World Labs, I prioritize this trait in my hiring process.