The Man U thought experiment is a great framing, but I think it reveals something subtle about what "computing" means here. When we imagine a stadium full of fans, we're not actually simulating thousands of people - we're generating a vaguely plausible impression with almost zero detail. The old couple, the kid jumping, the flag - those are narrative patches, not simulation. So the real question becomes whether World Models need to actually compute the physics, or just learn to generate approximations that are "good enough" the way human imagination does. Because if it's the latter, the breakthrough isn't computing the uncomputable - it's learning which parts you can safely skip.
You've articulated very well a point that was floating vaguely in the back of my mind while I read that section but that I couldn't quite put my finger on. Thanks!
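The "learn which parts you can safely skip" idea can be sketched as a simulator that models a handful of foreground agents individually and collapses everyone else into aggregate statistics, the way imagination does. This is a hypothetical toy, not anything from the article:

```python
import random

def simulate_crowd(n_fans: int, n_detailed: int = 5):
    """Toy sketch of adaptive fidelity: a few 'narrative patch' fans get
    per-agent state; the rest are summarized by one cheap statistic, so
    cost scales with n_detailed, not with the size of the stadium."""
    detailed = [
        {"id": i, "cheering": random.random() < 0.7}  # per-agent detail
        for i in range(n_detailed)
    ]
    # No per-fan physics for the background: just an aggregate impression.
    background = {"count": n_fans - n_detailed, "cheer_fraction": 0.7}
    return detailed, background

detailed, background = simulate_crowd(75_000)
```

The "breakthrough" the comment describes would be learning `n_detailed` and the aggregation rule from data, rather than hand-picking them as done here.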
The single biggest challenge in physical world models is reconstructing ground truth from measurements, which is itself a deeply hard AI problem. I’ve been doing frontier R&D in this space for years and have put many systems into production in defense, mapping, industrial applications, and elsewhere.
Using simulations to train world models is trivial precisely because none of the hard problems exist in the simulations. It is extremely difficult to replicate the idiosyncratic reality-reconstruction anomalies that plague physical world modeling in practice. US national labs have spilled a lot of ink on how much of this AI tech completely misses the myriad technical issues of physical-world dynamics.
That so many World Model companies think they can use simulations to train models on physical-world dynamics is disheartening, because it suggests they don’t have much experience with the theory problems in this space. But I agree completely that this is probably one of the single most valuable greenfields out there!
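One common mitigation, which the comment argues usually falls short, is injecting synthetic measurement artifacts into clean simulated observations before training. A minimal sketch, with an entirely illustrative noise model (the specific parameters and the idea of corrupting a depth map are assumptions, not anything from the comment):

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(sim_depth: np.ndarray) -> np.ndarray:
    """Perturb clean simulated depth measurements with reconstruction-style
    artifacts so a model trained in simulation sees something closer to real
    sensor output. Illustrative only, not a production recipe."""
    noisy = sim_depth + rng.normal(0.0, 0.02, sim_depth.shape)  # sensor noise
    dropout = rng.random(sim_depth.shape) < 0.05                # missing returns
    noisy[dropout] = 0.0                                        # holes in the scan
    return noisy * 1.01                                         # calibration bias

clean = np.ones((4, 4))
noisy = corrupt(clean)
```

The comment's point is exactly that hand-crafted corruptions like this rarely capture the real, idiosyncratic reconstruction anomalies seen in the field.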
This feels like the transition from reasoning about the world to simulating the world.
LLMs compress knowledge.
World models compress reality.
Once AI can model action → consequence loops, the real unlock isn’t better chat.
It’s better decision infrastructure.
That’s when AI becomes less of an interface layer and more of a control layer.
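The action → consequence loop above can be sketched as model-based planning: roll candidate action sequences through a learned dynamics model and pick the one whose imagined outcome is best. A toy sketch, where the linear `world_model` stands in for a trained network (all names and the exhaustive-search planner are illustrative assumptions, not any specific system):

```python
from itertools import product

def world_model(state: float, action: float) -> float:
    """Stand-in learned dynamics: predict the next state.
    A toy linear rule here; in practice, a trained network."""
    return state + action

def rollout(state: float, actions) -> float:
    """Imagine the consequences of an action sequence without acting."""
    for a in actions:
        state = world_model(state, a)
    return state

def plan(state: float, goal: float, candidates=(-1.0, 0.0, 1.0), horizon=3):
    """Minimal action -> consequence loop: exhaustively score candidate
    action sequences through the model and return the best one."""
    return min(product(candidates, repeat=horizon),
               key=lambda seq: abs(goal - rollout(state, seq)))

print(plan(0.0, 3.0))  # -> (1.0, 1.0, 1.0): the model 'decides' before acting
```

This is the "decision infrastructure" framing in miniature: the model is consulted as a control layer, not as a chat interface.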
Well written, deeply comprehensive, and insightful.
More folks need to know about World Models.
Expanding the mind every time
:-)
Excellent read all around. A couple of thoughts I had while reading it:
1) There's a big gap between being a researcher and being a product visionary. Fei-Fei, LeCun, and Mara are all excellent in their disciplines, but whether or not they know how to commercialize remains to be seen.
2) Meta's struggles in this space likely deserve their own article. How did the company that seemingly recognized where the world was heading end up only being able to create a 3D virtual Zoom?
3) I still view the most interesting world model effort right now as the collaboration between Hanwha Aerospace and Krafton, which is trying to use PUBG's underlying tech to build a world model to speed up the development of aerospace tech.
Wow, Fei-Fei Li is in on this. I think I saw something similar in a Mitko Vasilev post where the model is given only a short context window as part of the prompt and then writes code to analyze the rest of the prompt.
I can't get the audio version to work. Please help!
There are people who can't visualise at all; the condition is called aphantasia. And for most of them it doesn't actually affect that much. The research is still rather early, because the condition has only been studied thoroughly for about a decade.
There was a fairly recent test run on two groups, one neurotypical and one with aphantasia, and it suggested that Dual Coding Theory probably isn't correct (at least not in the way it was previously understood).
Each group was given three types of representation of the same thing: pictures, symbols, and text.
As expected, the neurotypical group handled pictures and symbols equally well, and text worse.
It was assumed that the people with aphantasia would handle text best, since they can't visualise, but it turned out that they handled symbols best, then pictures, and text last.
We don't know exactly why this is, but my guess is that they are using the original symbolic representation (keep in mind that there was a time when humans didn't have language and were still able to make enough sense of the world to survive).
What does this mean for AI? Basically that you may need neither text nor video, but you might need symbolic representation, since it has existed for as long as we have had the ability to explore and understand the world.
I am fascinated by the idea of world models, and I think in time they will become foundational to robotics and embodied intelligence.
I do think that LLMs and other forms of symbol / concept models will take us very far however, and I suspect the LLM pessimism is a bit overstated. If you consider the most important aspects of our reality, much of what we do is already modeled in our heads in language and concepts. We live in an increasingly abstract world. We trade money - a concept; we talk about ideas in language - i.e. this very essay itself; we plan our lives in language, we converse with each other at the coffee shop, we learn from our mentors, and so on and so on.
"The Control Revolution" discusses the rise of the Information Society as the central development of the past century or two. You can see the transition from an agricultural society where 90+% of people worked as farmers to one increasingly dominated by jobs that deal with information. If you consider, for example, the tasks required to run a business as a CEO, very few of them actually rely on the physical knowledge that CEO has acquired about the world. Knowing how to walk to a coffee shop matters little relative to his conceptual understanding of his industry, his organization, etc. The real work is abstract; the superficial work is physical.
So my prediction is that LLMs / symbol models will likely take us to some meaningful form of AGI, but world models will have a central place in robotics and unlocking the physical world to extend our conceptual power to mechanical work.
It seems to be a biological loop, not a silicon simulation. Since the University of British Columbia's work on the Non-Algorithmic Wall argues that reality can't be digital, our independent research group from Oklahoma has been looking into biological systems as the most efficient material in existence. Biology is millions of times more efficient than any computer chip, which could suggest it's a biological loop (Life-Raft), not a digital matrix.
Some researchers this week in April 2026 are suggesting that the Big Bounce wasn't a one-time big bang but an infinite bounce (cyclical). If that is proven, it turns our hypothesis into reality, because if you have an infinite bounce and we are here now talking about this, then 1 x infinity = infinity, so this will happen infinitely. If it is a big-numbers game, it is not just likely, it is mathematically guaranteed. This would also show that it is not a multiverse, because since it is biological it wouldn't be recreated within itself but would instead reset the same physical system over and over.
Our team is just documenting the Sovereign Blueprint of how this works, and we think that seeing the universe as a highly efficient biological Life-Raft, where humanity seems to be the prime reason for the system, changes the whole conversation, because the data is showing that we are the hardware, not the software. It is falsifiable: if the universe is proven to keep expanding, then the Big Freeze wins and OSIM is proven wrong; or if silicon or another material can mimic life and be more efficient than biological systems, then OSIM is proven wrong. We are officially moving away from the "creator" or "programmer" and focusing on the forensic cosmology blueprint (Oklahoma sim theory, OSIM, the Sovereign Inception Model).
World models will never work. The brain never uses them; it never has to, because it parallelizes via massive oscillations. What it simulates is cut off from the senses (which is where any world model would have to reside in memory).
https://www.youtube.com/watch?v=vZ1B-MvGMgw
So good. This is the best thing I have read in weeks. It's both inspired and inspiring. Shared it with my 16 year-old as something we should both read and discuss. Reached out to you both (Packy and Pim) via LinkedIn to connect regarding TEDAI here in Vienna. Let's talk.
Nice! For the lowdown, I just published a simplified version of the same post :) https://metacircuits.substack.com/p/from-chatbots-to-world-models